Quality assessment is likely to focus on the following elements of the economic evaluation, each of which can have an important impact on the validity of the overall results of that study.
Methods of deriving the effectiveness data
Measurement of resource data
Valuation of resource data
Measurement and valuation of health benefits (utilities)
Method of synthesising the costs and effects
Analysis of uncertainty
Generalisability of the results
This is not an exhaustive list, but an understanding of these issues, which are discussed in more detail below, will provide insight into the quality assessment of economic evaluations. Quality assessment of decision models is not covered in detail here due to the technical nature of the material. It is recommended that more detailed information on good practice in decision modelling be consulted.15
There is a hierarchy of sources of evidence ranging from a formal systematic review to expert opinion and authors’ assumptions.16 Where possible, economic evaluations should use effectiveness data obtained from a systematic review. However, non-systematic synthesis of effectiveness data may be justifiable when it is the only available source of evidence.
The type of effectiveness data included in an economic evaluation can vary from a single efficacy parameter obtained from a meta-analysis of RCTs to epidemiological data mapping the natural history of disease. Quality assessment of the clinical effectiveness data incorporated in an economic evaluation will depend on the type of clinical data used; whether the data were obtained from a single study or from the literature or from expert opinion; and whether modelling techniques were used.
When the effectiveness data have been derived from a single study, quality assessment should be undertaken as described in Chapter 1. However, additional elements will also need to be assessed: for example, whether the study time horizon is adequate to capture all the relevant health outcomes and, if statistical modelling techniques have been used to extrapolate the data, whether the extrapolation methods and assumptions were appropriate.17
When the effectiveness data have been synthesised from a variety of sources, assessment should focus on the quality of the literature review and the methods used to synthesise the data, including:
Whether a search strategy was used
Which databases were searched
Whether there were clear inclusion and exclusion criteria
Whether sufficient information was given about the quality of the included studies
Quality assessment of cost analysis should consider which costs were evaluated in the study, the measurement of the associated resource quantities, and the valuation (cost) of those resources. Some of the issues that need to be assessed are common to all economic evaluations, while others are specific to the type of approach used.
For any economic evaluation, all costs relevant to the study question and to the perspective adopted (the viewpoint from which the analysis has been undertaken) should have been included. For example, patient travel costs are a cost from the patient’s perspective and from society’s perspective, but not from the hospital’s perspective.
Measurement of resource data
Resource use is measured in physical units such as equipment, staff, dressings and drugs. Issues to consider are as follows:
The sources used to collect resource utilisation data should be reported clearly (e.g. clinical trials, administrative databases, clinical databases, medical records and published literature)
Resource quantities should be reported independently from the costs, so that assessment of the measurement method is facilitated
Any assumptions in the measurement of resources should be explicitly reported and justified
If an expert was consulted to estimate some of the resources, the methods used should be described
For trial-based economic evaluations, the most valid resource estimates are considered to be those collected prospectively alongside effectiveness data, utilising the robust infrastructure established for the trial.18
If the resources utilised were identified through a review of the literature, details of the process employed to identify and select the patterns of resource utilisation and the quantities used should have been given.
Valuation of resource data
For the valuation of resources, the relevant issues to consider are as follows:
All the sources used to obtain unit costs should be reported and be relevant for the specific study setting
All costs should be adjusted to a specific price year so that the effects of inflation are removed from the cost estimation
If the time horizon for estimating costs was longer than one year, discounting should have been performed in order to reflect time preferences19
If prices were used instead of costs, and cost-to-charge ratios were calculated, these should reflect the true opportunity costs of the strategies compared20
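The price-year adjustment and discounting steps listed above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the function names, the price index values and the 3.5% discount rate are assumptions made for the example, and the first year is treated as undiscounted.

```python
def adjust_to_price_year(cost, index_source_year, index_price_year):
    """Re-express a cost in the chosen price year using a price index.

    The index values are supplied by the caller; which index is
    appropriate depends on the study setting (assumption).
    """
    return cost * index_price_year / index_source_year


def present_value(yearly_amounts, discount_rate):
    """Discount a stream of yearly costs (or benefits) to present value.

    yearly_amounts[0] occurs in the first year and is not discounted;
    the amount in year t is divided by (1 + discount_rate) ** t.
    """
    return sum(
        amount / (1 + discount_rate) ** year
        for year, amount in enumerate(yearly_amounts)
    )


# Hypothetical figures, for illustration only.
cost_in_price_year = adjust_to_price_year(
    500.0, index_source_year=100.0, index_price_year=110.0
)
yearly_costs = [1000.0, 1000.0, 1000.0]
discounted_total = present_value(yearly_costs, discount_rate=0.035)
undiscounted_total = sum(yearly_costs)
```

Reporting both `undiscounted_total` and `discounted_total` lets a reader see how much of the result is driven by the choice of discount rate.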
Utilities may be measured using either a generic valuation tool, such as the SF-6D or the EQ-5D, or a disease-specific tool, whose values may have been obtained using either standard gamble or time trade-off techniques. Tools differ considerably (a full discussion is given in the books by Drummond12 and Brazier21) and the choice of tool can affect both the results obtained and their usefulness in priority setting. As a minimum, assessment should consider who provided the scores (patients, clinicians, general public, etc.), which tool was used (EQ-5D, SF-6D, etc.) and when the scores were elicited (at baseline, during treatment, after treatment, etc.). A useful overview and comparison of the impact of different measures in rheumatoid arthritis is available.22
The true economic value of one intervention compared with another depends on the additional costs and benefits. Incremental cost-effectiveness ratios (ICERs), which divide the difference in costs by the difference in effects, capture this relative value. Unless a treatment is clearly dominant (both cheaper and more effective), ICERs should have been calculated, as this is the only appropriate way of capturing the true economic value.12 A paper should report sufficient data to ascertain dominance from the figures given, rather than relying on a statement from the authors, which can be made in error and be potentially misleading. Cost-effectiveness results should have been reported in both a disaggregated and an aggregated way. That is, undiscounted and discounted health benefit and cost results should have been reported both separately and as part of the ICERs. It is also appropriate to report the net benefit statistic, which is sometimes used to overcome the statistical issues raised when dealing with a ratio such as the ICER.
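The incremental comparison described above can be illustrated with a short Python sketch. The function names, the threshold value and the cost and effect totals are hypothetical, and the net benefit is written in its common monetary form (threshold multiplied by incremental effect, minus incremental cost), which is an assumption about which form a given study reports.

```python
def incremental_analysis(cost_new, effect_new, cost_old, effect_old):
    """Compare a new intervention with a comparator.

    Returns the incremental cost, the incremental effect, a dominance
    status, and the ICER (None when a ratio is not meaningful).
    """
    d_cost = cost_new - cost_old
    d_effect = effect_new - effect_old
    if d_cost < 0 and d_effect > 0:
        status, icer = "dominant", None      # cheaper and more effective
    elif d_cost > 0 and d_effect < 0:
        status, icer = "dominated", None     # dearer and less effective
    elif d_effect == 0:
        status, icer = "no difference in effect", None
    else:
        # Includes the cheaper-but-less-effective case, where the
        # ratio needs careful interpretation.
        status, icer = "trade-off", d_cost / d_effect
    return {"d_cost": d_cost, "d_effect": d_effect,
            "status": status, "icer": icer}


def net_monetary_benefit(d_cost, d_effect, threshold):
    """Net benefit on the monetary scale: threshold * d_effect - d_cost."""
    return threshold * d_effect - d_cost


# Hypothetical totals: costs in currency units, effects in QALYs.
result = incremental_analysis(12000.0, 6.0, 10000.0, 5.5)
nmb = net_monetary_benefit(result["d_cost"], result["d_effect"],
                           threshold=20000.0)
```

Because the function works from the reported totals, the same check can be used to confirm an author's claim of dominance from the figures given in a paper.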
A well-conducted economic evaluation should investigate, as thoroughly as possible, the following sources of uncertainty:
Parameter uncertainty, which occurs because parameters are estimated from samples and their true value is unknown
Methodological uncertainty, which arises from the analytical methods used in the evaluation, particularly where there is disagreement around the methods used (e.g. the inclusion of indirect costs, discounting of health benefits, discount rate)
Modelling uncertainty, which can arise due to the simplifying assumptions that are often required to facilitate modelling
Methods of evaluating uncertainty include statistical comparisons, bootstrapping, sensitivity analyses (one-way or multi-way sensitivity analyses, threshold analyses and analyses of extremes or worst/best case analysis) and probabilistic sensitivity analyses. The method(s) employed will vary depending on what is being assessed and the types of data that were used as input parameters in the economic evaluation.
Statistical tests comparing effects, costs or cost-effectiveness are appropriate for studies that have derived their effectiveness and costs from patient level data. The quality assessment of the statistical comparisons performed should focus on the appropriateness of the type of tests used and the results reported (e.g. 95% confidence intervals; p-values).
Bootstrapping is a statistical method that can be applied to capture uncertainty where patient level data are used.23 Because the ICER is a ratio, normal parametric statistical methods based on the standard error cannot be used. Non-parametric bootstrapping is an alternative method which allows a comparison of the arithmetic means without making any assumptions about the sampling distribution. However, it should be noted that economic evaluations can use a net benefit statistic rather than an ICER to overcome the statistical problems associated with a ratio.24
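A non-parametric bootstrap of patient level data might look like the following Python sketch. It resamples patients with replacement within each arm and summarises the replicates with a simple percentile interval for the net benefit. The patient records, the threshold, the number of replicates and the function name are all hypothetical, and the percentile interval is only one of several possible ways of summarising the replicates.

```python
import random
from statistics import mean


def bootstrap_net_benefit(control, treated, threshold, n_reps=1000, seed=0):
    """Percentile bootstrap interval for the incremental net benefit.

    control, treated: lists of (cost, effect) pairs, one per patient.
    Patients are resampled with replacement within each arm.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_reps):
        c = rng.choices(control, k=len(control))
        t = rng.choices(treated, k=len(treated))
        d_cost = mean(p[0] for p in t) - mean(p[0] for p in c)
        d_effect = mean(p[1] for p in t) - mean(p[1] for p in c)
        replicates.append(threshold * d_effect - d_cost)
    replicates.sort()
    lower = replicates[int(0.025 * n_reps)]
    upper = replicates[int(0.975 * n_reps) - 1]
    return lower, upper, replicates


# Hypothetical patient level data: (cost, QALYs) per patient.
control_arm = [(900.0, 0.60), (1100.0, 0.55), (1000.0, 0.70), (1200.0, 0.50)]
treated_arm = [(1500.0, 0.75), (1700.0, 0.65), (1400.0, 0.80), (1600.0, 0.70)]

lower, upper, reps = bootstrap_net_benefit(
    control_arm, treated_arm, threshold=20000.0
)
```

Keeping the cost and effect of each patient together as a pair preserves any correlation between the two when patients are resampled.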
Sensitivity analyses of parameter uncertainty are usual in economic evaluations that obtain their data from systematic or other reviews. The aim of the sensitivity analyses is to evaluate the sensitivity of the results to changes in the parameter estimates. N-way sensitivity analyses and threshold analysis can only vary a few parameters at the same time in practice. In contrast, probabilistic sensitivity analysis (PSA) (see below) can vary all parameters at the same time, subject to data availability.
The following issues should be assessed:
Whether the parameters chosen were justified
Whether variations were performed across meaningful ranges of values
Whether the robustness of the results was assessed according to a previously agreed level of ‘acceptable variation’
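A one-way sensitivity analysis of the kind described above can be sketched in Python as follows. One parameter is varied across a stated range while the others are held at their base-case values, and the ICER is recalculated each time. The parameter names, base-case values and range are hypothetical.

```python
def icer(params):
    """ICER from a dictionary of total costs and effects per strategy."""
    d_cost = params["cost_new"] - params["cost_old"]
    d_effect = params["qaly_new"] - params["qaly_old"]
    return d_cost / d_effect


def one_way_sensitivity(base_case, name, values):
    """Vary a single parameter, holding the others at base-case values."""
    results = []
    for value in values:
        scenario = dict(base_case)   # copy so the base case is unchanged
        scenario[name] = value
        results.append((value, icer(scenario)))
    return results


# Hypothetical base case and range, for illustration only.
base_case = {"cost_new": 12000.0, "cost_old": 10000.0,
             "qaly_new": 6.0, "qaly_old": 5.5}
results = one_way_sensitivity(base_case, "cost_new",
                              [11000.0, 12000.0, 13000.0])
```

When appraising a study, the equivalent questions are whether the parameter varied was justified and whether the range of values is a meaningful one.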
Uncertainty around analytical methods is also assessed through the use of sensitivity analysis. For example, the impact of different discount rates and the use of discounting (or not) on health benefits should have been assessed in studies with a long time horizon.
Box 5.3: Checklist for assessing economic evaluations
Study design
1. Was the research question stated?
2. Was the economic importance of the research question stated?
3. Was/were the viewpoint(s) of the analysis clearly stated and justified?
4. Was a rationale reported for the choice of the alternative programmes or interventions compared?
5. Were the alternatives being compared clearly described?
6. Was the form of economic evaluation stated?
7. Was the choice of form of economic evaluation justified in relation to the questions addressed?
Data collection
8. Was/were the source(s) of effectiveness estimates used stated?
9. Were details of the design and results of the effectiveness study given (if based on a single study)?
10. Were details of the methods of synthesis or meta-analysis of estimates given (if based on an overview of a number of effectiveness studies)?
11. Were the primary outcome measure(s) for the economic evaluation clearly stated?
12. Were the methods used to value health states and other benefits stated?
13. Were the details of the subjects from whom valuations were obtained given?
14. Were productivity changes (if included) reported separately?
15. Was the relevance of productivity changes to the study question discussed?
16. Were quantities of resources reported separately from their unit cost?
17. Were the methods for the estimation of quantities and unit costs described?
18. Were currency and price data recorded?
19. Were details of price adjustments for inflation or currency conversion given?
20. Were details of any model used given?
21. Was there a justification for the choice of model used and the key parameters on which it was based?
Analysis and interpretation of results
22. Was time horizon of cost and benefits stated?
23. Was the discount rate stated?
24. Was the choice of rate justified?
25. Was an explanation given if cost or benefits were not discounted?
26. Were the details of statistical test(s) and confidence intervals given for stochastic data?
27. Was the approach to sensitivity analysis described?
28. Was the choice of variables for sensitivity analysis justified?
29. Were the ranges over which the parameters were varied stated?
30. Were relevant alternatives compared? (i.e. Were appropriate comparisons made when conducting the incremental analysis?)
31. Was an incremental analysis reported?
32. Were major outcomes presented in a disaggregated as well as aggregated form?
33. Was the answer to the study question given?
34. Did conclusions follow from the data reported?
35. Were conclusions accompanied by the appropriate caveats?
36. Were generalisability issues addressed?
Based on Drummond's checklist27
Probabilistic sensitivity analysis (PSA) can only be used to deal with parameter uncertainty in modelling-based economic evaluations. PSA addresses what is also referred to as second-order uncertainty: the uncertainty surrounding the value of a parameter. This is achieved by assigning a probability distribution rather than a point estimate to each parameter. The quality assessment in this case should focus on whether:
Appropriate distributions were assigned to the model parameters6, 25
Relevant assumptions were tested. For example, assumptions about model structure or interpretation of the available evidence12
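The idea of replacing point estimates with probability distributions can be sketched in Python as follows. Each draw samples every parameter from its assigned distribution, runs a deliberately simple model, and records whether the new intervention has a positive net benefit at a given threshold. The model structure, the choice of gamma, beta and normal distributions, their parameters and the threshold are all assumptions made for illustration, not recommendations.

```python
import random


def run_psa(n_draws=5000, threshold=20000.0, seed=1):
    """Share of draws in which the new intervention has positive net benefit.

    All distributions and their parameters are hypothetical.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    favourable = 0
    for _ in range(n_draws):
        # Incremental cost: gamma distribution (non-negative, right-skewed).
        d_cost = rng.gammavariate(4.0, 500.0)
        # Response probabilities: beta distributions (bounded by 0 and 1).
        p_old = rng.betavariate(30, 70)
        p_new = rng.betavariate(45, 55)
        # QALY gain per additional responder: normal distribution.
        gain_per_responder = rng.normalvariate(1.0, 0.2)
        d_effect = (p_new - p_old) * gain_per_responder
        if threshold * d_effect - d_cost > 0:
            favourable += 1
    return favourable / n_draws


prob_cost_effective = run_psa()
```

Repeating the calculation over a range of thresholds would show how the conclusion depends on the value placed on a unit of health benefit.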
Generalisability refers to the extent to which the results obtained can be applied to different settings. The relevance of the intervention, the patient population and the resources which have been included in the economic evaluation will determine whether the results can be generalised. Uncertainty regarding the generalisability of the results to the relevant study setting would usually be assessed through sensitivity analyses. A useful discussion on this issue is available.26
Several reliable, comprehensive, and easy-to-use checklists are available to guide the quality assessment of economic evaluations. The most widely used is the BMJ checklist.27 Both a 10-item version and an expanded 35-item version are available. In addition, a 36th item relating to generalisability may be added if it is relevant to the review (see Box 5.3). Although this checklist does not provide detailed coverage of some issues relevant to modelling studies, it can be augmented using specific items such as model type, structural assumptions, time horizon, cycle length and health states. Alternatively, a checklist developed to assess the quality of the models used in economic evaluations can be used as a complement to the BMJ checklist.15
In some cases the validity of an economic evaluation may be difficult to assess due to limitations in reporting, an issue common to many studies and covered in Chapter 1.
Several quality scoring systems have been devised for use in assessing the methodological quality of economic evaluations. These are generally based on completing checklists, assigning values to the different items considered, and summing these values to obtain a final score, which is intended to reflect the quality level of the appraised study.
Six published quality scoring systems for economic evaluations have been identified, but none of these are considered to be sufficiently valid and reliable for use as a method of quality assessment.28 Given the limitations presented by quality scoring systems, their use is not recommended. Rather, it is preferable to present a checklist or a descriptive critical assessment based on appropriate guidelines or checklists, which should describe the methods and results, the strengths and weaknesses, and the implications of those strengths and weaknesses for the reliability of the conclusions.