The review protocol sets out the methods to be used in the review. Decisions about the review question, inclusion criteria, search strategy, study selection, data extraction, quality assessment, data synthesis and plans for dissemination should be addressed. Specifying the methods in advance reduces the risk of introducing bias into the review. For example, clear inclusion criteria avoid selecting studies according to whether their results reflect a favoured conclusion.
If modifications to the protocol are required, these should be clearly documented and justified. Modifications may arise from a clearer understanding of the review question, and should not be made because of an awareness of the results of individual studies. Further information is given in Section 1.2.4 How to deal with protocol amendments during the review.
Protocol development is often an iterative process that requires communication within the review team and advisory group and sometimes with the funder.
This section covers the development of the protocol and the information it should contain. The formulation of the review objectives from the review question and the setting of inclusion criteria are covered in detail here as these must be agreed before starting a review. The search strategy, study selection, data extraction, quality assessment, synthesis and dissemination are also mentioned briefly as they are essential parts of the review protocol. However, to avoid repetition, full details of the issues related to both protocol requirements and carrying out the review are provided in Section 1.3 Undertaking the review.
The background section should communicate the key contextual factors and conceptual issues relevant to the review question. It should explain why the review is required and provide the rationale underpinning the inclusion criteria and the focus of the review question, for example justifying the choice of interventions to be considered in the review.
Systematic reviews should set clear questions, the answers to which will provide meaningful information that can be used to guide decision-making. These should be stated clearly and precisely in the protocol. Questions may be extremely specific or very broad, although a broad question may be more appropriately broken down into a series of related, more specific questions. For example, a review to ‘assess the evidence on the positive and negative effects of population-wide drinking water fluoridation strategies to prevent caries’13 was undertaken by addressing five objectives:
Objective 1: What are the effects of fluoridation of drinking water supplies on the incidence of caries?
Objective 2: If water fluoridation is shown to have beneficial effects, what is the effect over and above that offered by the use of alternative interventions and strategies?
Objective 3: Does water fluoridation result in a reduction of caries across social groups and between geographical locations, bringing equity?
Objective 4: Does water fluoridation have negative effects?
Objective 5: Are there differences in the effects of natural and artificial water fluoridation?
Where there are several objectives it may be necessary to prioritise by importance and likelihood of being able to answer the question. It may even be necessary to restrict the scope of the question to a level that is manageable within set resources. For clarity, the singular term ‘review question’ is used throughout the guidance.
The review question can be framed in terms of the population, intervention(s), comparator(s) and outcomes of the studies that will be included in the review. These elements of the review question, together with study design, will then be refined in order to determine the specific inclusion criteria that will be used when selecting studies for the review. Although the acronyms PICO and PICOS are both commonly used, here the term PICOS will be used throughout for consistency. In some situations, not all the elements will be relevant; for example, not every review question will specify the type of study design to be included. The use of PICOS in the context of reviews incorporating different study designs is discussed in the relevant chapters.
The review question may be presented in general terms, for example, ‘What is the best treatment option for retinoblastoma?’ More often the actual question is discussed by the review team and an objective, or series of objectives, framed by the population, the intervention and the outcome(s) of interest, is agreed. For example, ‘The objective of this review is to assess the clinical effectiveness of treatments for childhood retinoblastoma.’14 The PICOS elements for this example are shown in Box 1.2.
Box 1.2: Example review objective and PICOS elements for a review protocol
The objective of this review is to assess the clinical effectiveness of treatments for childhood retinoblastoma.14
Participants
Studies of participants diagnosed with retinoblastoma at the age of 18 years or under.
Studies of adults where childhood retinoblastoma was followed up into adulthood.
Studies of mixed diagnoses if outcomes were reported separately for children with retinoblastoma.
Interventions
Any intervention or combination of interventions given for the treatment of retinoblastoma, including (but not restricted to) enucleation, external beam radiotherapy, chemotherapy, brachytherapy, cryotherapy, thermotherapy and photocoagulation.
Outcomes
Any clinical outcome, including (but not restricted to) survival, progression-free survival, tumour response, preservation of the eye, visual acuity, disease remission and adverse effects.
Study design
Randomised controlled trials (RCTs) and controlled trials. However, it is not anticipated that many studies of these designs will be available. Therefore, if information from controlled trials is not available, cohort studies are eligible for inclusion provided that data from a comparison group are reported.
Case series and case reports are excluded from the review owing to the high potential for bias in these study designs. Case–control studies (except where nested as part of a cohort study) and economic evaluations are also excluded.
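The criteria in Box 1.2 can be expressed as an explicit screening rule, which is one way to check that they are practical to apply. The following is a minimal Python sketch only: the `Study` record, its field names, the design codes and the `is_eligible` function are all hypothetical, and the handling of nested case–control studies (treated like cohort studies) is a simplifying assumption rather than something stated in the box.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical study record; field names and design codes are illustrative.
@dataclass
class Study:
    design: str                    # "rct", "controlled_trial", "cohort", ...
    max_age_at_diagnosis: int      # oldest age at diagnosis among participants
    has_comparison_group: bool = False
    nested_in_cohort: bool = False

CONTROLLED = {"rct", "controlled_trial"}
EXCLUDED = {"case_series", "case_report", "economic_evaluation"}

def is_eligible(study: Study, controlled_trials_available: bool) -> Tuple[bool, str]:
    """Return (eligible, reason) for one study under the Box 1.2 criteria."""
    # Population: diagnosis at 18 years or under.
    if study.max_age_at_diagnosis > 18:
        return False, "population"
    if study.design in CONTROLLED:
        return True, "included"
    if study.design in EXCLUDED:
        return False, "study design"
    # Assumption: a nested case-control study is handled like a cohort study.
    cohort_like = study.design == "cohort" or (
        study.design == "case_control" and study.nested_in_cohort
    )
    # Cohort studies are a fallback, used only when controlled trials are
    # lacking, and only if data from a comparison group are reported.
    if cohort_like and not controlled_trials_available and study.has_comparison_group:
        return True, "included"
    return False, "study design"
```

Returning a reason alongside the decision means that every exclusion can later be reported against the criterion that triggered it.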
The included population should be relevant to the population to which the review findings will be applied, and explicit inclusion criteria should be defined in terms of the disease or condition of interest. Any specified restrictions should be clinically justifiable and relevant. Eligibility must usually be applied to the whole study and consideration of how to deal with studies that include a mixed population, some of whom are relevant to the review and some of whom are not, is required. If the inclusion criteria are broad, it may be informative to investigate effectiveness across subgroups of participants. However, in the absence of individual patient data (IPD), or very detailed reporting of data broken down by participant characteristics, it is unlikely that inclusion can be restricted to particular types of participant or that detailed subgroup analyses will be possible. Where analysis of participant subgroups is planned, this should be specified in the protocol. Examples of factors that may be investigated include participants’ gender, age, disease severity, the presence of any co-morbidities, socio-economic status, ethnicity and geographical area.
Interventions and comparators
The nature of the interventions explored in the review may be framed in very broad terms like ‘psychosocial interventions’ or may be more specific such as ‘cognitive behavioural therapy’. Factors usually specified include the precise nature of the intervention (e.g. the method of administration of a drug), the person delivering the intervention (e.g. a community psychiatric nurse versus a non-professional carer) or setting in which the intervention is delivered (e.g. inpatient or outpatient).
Where comparative studies are to be included, the protocol should also specify which comparators are eligible. As with the interventions, comparators should be carefully defined, so that the scope of a term such as ‘palliative care’ or ‘usual care’ is clear. The protocol should also specify whether any co-interventions carried out at the same time affect eligibility for inclusion; this applies to both the intervention(s) and the comparator(s).
The success or failure of a therapeutic intervention will usually be assessed in terms of differences in mortality or morbidity in the populations treated. Primary outcomes are likely to include measures of mortality and morbidity but other outcomes may also be of importance, for example measures of quality of life and participants’ subjective experiences of pain or physical functioning.
A review should explore a clearly defined set of relevant outcomes and it is important to justify each outcome included. Input from the advisory group and the findings from initial scoping searches and qualitative research may be helpful in deciding which outcomes to include.
The use of surrogate outcomes may be misleading, giving an over or underestimate of the true clinical outcome.15 Decisions about whether to consider surrogate outcomes should therefore be informed by available evidence about associations between the surrogate (e.g. blood pressure) and the outcome of interest (e.g. stroke). Often, surrogate outcomes are included only where a study also reports a relevant clinical outcome.
The review may also consider the timing of outcome assessment and possible adverse effects of the intervention. If the review is considering cost-effectiveness or economic issues as well as clinical effectiveness, the relevant economic outcomes should also be specified.
Although the review may aim to consider a series of outcomes, it is rare that inclusion would be restricted to only those studies that report all the outcomes of interest. More usually inclusion criteria will require that included studies report the main outcome.
The types of study included in the review will play a major role in determining the reliability of the results, because the validity of estimates of effect is linked to the study design. While some study designs are clearly more robust than others, this should not be the only factor in determining which types of study are eligible for inclusion.16
Scoping searches may reveal that there are likely to be only a limited number of relevant randomised studies. In this case researchers have the option of justifying a decision to limit study design, bearing in mind that the identification of gaps in the current evidence base may in itself be a significant finding of the review. Alternatively, they can include quasi-experimental or observational studies. For reviews in some topic areas, these may be the only types of study available. The study design inclusion criteria given as an example in Box 1.2 have been set to take account of the paucity of experimental studies, as indicated by the scoping searches.
In some cases a range of study designs may be needed to address different questions within the same review. For example, a review seeking to include information on adverse events will often include case-control and/or case-series (see Chapter 4) whilst a review incorporating participants’ experiences of an intervention is likely to include qualitative studies (see Chapter 6). The potential biases from the inclusion of a range of study designs are discussed in Section 1.3.4 Quality assessment.
The inclusion criteria should be set out in the protocol, to ensure that the boundaries of the review question are clearly defined. In the example in Box 1.2, the population to be studied was specified in the review question as those with ‘childhood’ retinoblastoma. In addition to qualifying ‘childhood’ as under 18, appropriate timeframes for disease progression and treatment and the possibilities of concurrent disease processes have been taken into account. In reviews of interventions relating to other diseases it may be necessary to be more specific about how the disease of interest will be verified, and to specify the disease stage and severity. In the simple example given in Box 1.2 the key interventions and outcomes of interest are listed.
The nature of the intervention(s) and comparator(s) should be specified in detail. Whilst this may be more straightforward for drug interventions, more complex interventions may require detailed consideration of terms. For example, interventions such as ‘stress management’ or ‘relaxation’ may be defined differently by different study authors. Therefore researchers need to be clear about their own definitions and what elements are acceptable. An operational definition describing the content and delivery of the intervention will usually be helpful.
The inclusion criteria should capture all studies of interest. If the criteria are too narrowly defined there is a risk of missing potentially relevant studies and the generalisability of the results may be reduced. On the other hand, if the criteria are too broad the review may contain information which is hard to compare and synthesise.17, 18 Inclusion criteria also need to be practical to apply; if they are too detailed, screening may become overly complicated and time consuming.
As previously stated, a review should be based on the best quality evidence available (see Box 1.3). Whatever the study design(s) included, it should not be assumed that all studies of the same basic design (e.g. RCT) are equally well-conducted. The quality of the included studies should be formally assessed as this will impact on the reliability of the results and therefore on the conclusions drawn. Although quality assessment can sometimes be used to exclude studies that do not meet certain criteria, this is not standard practice and differential quality is more usually assessed at the synthesis stage through sensitivity analysis. For further information see Section 1.3.4 Quality assessment and Section 1.3.5 Data synthesis.
Box 1.3: Hierarchy of study designs to assess the effects of interventions
This list is not exhaustive, but covers the main study designs. Refer to the glossary for definitions of other study designs. Names and definitions may differ (e.g. randomised controlled trial is often called randomised clinical trial).
Randomised controlled trials
The simplest form of RCT is known as the parallel group trial which randomises eligible participants to two or more groups, treats according to assignment, and compares the groups with respect to outcomes of interest. Participants are allocated to groups using both randomisation (allocation involves the play of chance) and concealment (ensures that the intervention that will be allocated cannot be known in advance). There are different types of randomised study designs, such as:
Randomised cross-over trials
All participants receive all the interventions; for example, in a two-arm cross-over trial, one group receives intervention A before intervention B, and the other group receives intervention B before intervention A. It is the sequence of interventions that is randomised.
Cluster randomised trials
A cluster randomised trial is a trial where clusters of people rather than single individuals are randomised to different interventions. For example, whole clinics or geographical locations may be randomised to receive particular interventions, rather than individuals.
The main distinction between randomised and quasi-experimental studies is the way in which participants are allocated to the intervention and control groups; quasi-experimental studies do not use random assignment to create the comparison groups.
Non-randomised controlled studies
Individuals are allocated to a concurrent comparison group, using methods other than randomisation. The lack of concealed randomised allocation increases the risk of selection bias.
Before-and-after study
Comparison of outcomes in study participants before and after the introduction of an intervention. The before-and-after comparisons may be in the same sample of participants or in different samples.
Interrupted time series
Interrupted time series designs are multiple observations over time that are ‘interrupted’, usually by an intervention or treatment.
Observational studies
A study in which natural variation in interventions or exposure among participants (i.e. not allocated by an investigator) is investigated to explore the effect of the interventions or exposure on health outcomes.
Cohort study
A defined group of participants is followed over time and comparison is made between those who did and did not receive an intervention.
Case–control study
Groups from the same population with (cases) and without (controls) a specific outcome of interest are compared to evaluate the association between exposure to an intervention and the outcome.
Case series
Description of a number of cases of an intervention and the outcome (without comparison with a control group). These are not comparative studies.
The ideal for most systematic reviews is to include all available relevant evidence. In principle, this includes studies written in any language to avoid the introduction of language bias into the review. Language bias arises because studies with statistically significant results that have been conducted in non-English speaking countries may be more likely to be published in English language journals than those with nonsignificant results.19 In addition, trials originating in certain countries have been found to have unusually high proportions of positive results.20
Thus, if reviews include only studies reported in English, their results and inferences may be biased.19, 20, 21 Even if language bias does not influence summary effect estimates, it is likely to affect precision, because analysis will be based on fewer data.22 Whenever feasible, all relevant studies should be included regardless of language. However, realistically this is not always possible due to a lack of time, resources and facilities for translation. It is advisable therefore, to identify all non-English language papers, document their existence, but record ‘language’ as the reason for exclusion in cases where they cannot be dealt with. Although titles and abstracts are translated in many databases, full papers are usually only available in their primary language.
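The advice to identify non-English language papers, document their existence and record ‘language’ as the reason for exclusion can be built into the way records are handled. The sketch below is illustrative only: the record fields (`id`, `language`) and the function name `apply_language_policy` are hypothetical, and the set of languages the team can deal with is an input supplied by the reviewers.

```python
def apply_language_policy(records, languages_handled):
    """Split records into those to assess and a log of language exclusions.

    records: list of dicts with hypothetical keys "id" and "language".
    languages_handled: languages the review team can read or translate.
    """
    handled = {lang.lower() for lang in languages_handled}
    to_assess, exclusion_log = [], []
    for rec in records:
        if rec["language"].lower() in handled:
            to_assess.append(rec)
        else:
            # The paper is not silently dropped: its existence is documented
            # and the reason for exclusion is recorded as 'language'.
            exclusion_log.append(
                {"id": rec["id"], "language": rec["language"], "reason": "language"}
            )
    return to_assess, exclusion_log
```

The exclusion log can then be reported alongside the other reasons for exclusion, so that readers can see how many potentially relevant papers could not be assessed.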
When a decision is made to include non-English language studies, the review question should inform the decision about which languages are chosen, as studies of particular interventions and/or settings are more likely than others to be published in certain languages. An investigation on the inclusion of non-English language reports of RCTs in systematic reviews concluded that language restrictions do not appear to bias the estimates in reviews of conventional interventions, but may bias the results of complementary or alternative medicines.23 Researchers need to give careful thought as to whether imposing language restrictions may potentially bias the results of their individual review. When non-English language literature is included in a review, its influence on the estimation and precision of effect may be explored in a sensitivity analysis.
Studies are not always published as full papers in peer-reviewed journals; they may be published as reports, book chapters, conference abstracts, theses or they may be informally reported or remain unpublished. Ideally a review should aim to include all relevant studies, regardless of publication status, in order to avoid publication bias. Publication bias occurs when the publication of a study is influenced by its results, hence inclusion of only published studies may overestimate the intervention effect.24
There are practical issues that limit the inclusion of all studies regardless of publication type/status. Unpublished studies are likely to be harder to source, and more difficult to obtain, than published studies. The inclusion of conference abstracts and interim results should be considered, bearing in mind that contact with the study authors may be required to obtain full study details.25 The effects of including any data from abstracts alone should be carefully considered, since differences often occur between data reported in conference abstracts and their corresponding full reports, although differences in results are seldom large.26, 27 Also, it can be difficult to appraise study quality from minimal details provided in an abstract. Sensitivity analyses may be carried out to examine the effect of including data from conference abstracts.28
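A sensitivity analysis of this kind amounts to repeating the synthesis with and without the studies in question and comparing the results. The following is a minimal sketch that assumes inverse-variance fixed-effect pooling, chosen here only as a common and simple method; the guidance does not prescribe it at this point. The dictionary keys (`effect`, `se`, `abstract_only`) and the function names are hypothetical.

```python
import math

def pooled_fixed_effect(studies):
    """Inverse-variance fixed-effect pooled estimate and its standard error."""
    weights = [1.0 / s["se"] ** 2 for s in studies]
    total = sum(weights)
    estimate = sum(w * s["effect"] for w, s in zip(weights, studies)) / total
    return estimate, math.sqrt(1.0 / total)

def sensitivity_by_flag(studies, flag):
    """Pool all studies, then pool again after dropping those where flag is True."""
    restricted = [s for s in studies if not s[flag]]
    if not restricted:
        raise ValueError("no studies remain after applying the restriction")
    return {
        "all": pooled_fixed_effect(studies),
        "restricted": pooled_fixed_effect(restricted),
    }

# Made-up numbers, for illustration only.
example = [
    {"effect": 0.40, "se": 0.20, "abstract_only": False},
    {"effect": 0.10, "se": 0.25, "abstract_only": False},
    {"effect": 0.90, "se": 0.50, "abstract_only": True},
]
result = sensitivity_by_flag(example, "abstract_only")
```

The same pattern applies to other restrictions discussed in this section, for example a flag marking non-English language reports.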
The identification of ongoing studies is important for a number of reasons. They may provide a useful starting point for subsequent reviews and updates; they may also improve the quality of conclusions about future research by indicating where new research has already commenced. Information about ongoing studies may be available as ‘partially published research’ like conference abstracts – these can be classified as ongoing studies which may contribute to future reviews.29
A preliminary search strategy for identifying relevant research should be included in the protocol. This should specify the databases and additional sources that will be searched, and also the likely search terms to be used. The search strategy should be constructed to take into account PICOS, although the outcome(s) of studies and/or study design are not always used. Decisions about publication status and language restrictions also need to be made at this stage. In reviews of one year or more duration, or reviews in rapidly evolving fields, provision for repeating the searches towards the end of the review process should also be considered. In addition it may be useful to carry out current awareness searches to identify relevant papers as they are published. The approach taken will depend on the question and the topic, and also on the available time and resources. It is usual to include in the protocol details of the software that will be used to manage references. Further information is given in Section 1.3.1 Identifying research evidence for systematic reviews.
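One common way of turning PICOS elements into a preliminary search is to combine synonyms for each element with OR and then combine the elements with AND. The sketch below illustrates that idea in generic Boolean syntax; it does not reproduce the syntax of any particular database, and the function name `build_query` and the example terms are hypothetical.

```python
def build_query(concepts):
    """Build a generic Boolean search string from PICOS concept blocks.

    concepts: mapping of PICOS element -> list of search terms. Elements with
    an empty list are left out, reflecting that outcomes and study design
    are not always used in the search.
    """
    blocks = []
    for terms in concepts.values():
        if not terms:
            continue
        # Quote multi-word terms so they are treated as phrases.
        quoted = ['"{}"'.format(t) if " " in t else t for t in terms]
        blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)

# Illustrative terms only; a real strategy would be far more extensive.
query = build_query({
    "population": ["retinoblastoma"],
    "intervention": ["chemotherapy", "external beam radiotherapy"],
    "outcome": [],
})
```

Keeping the terms in a structure like this makes it straightforward to record the strategy in the protocol and to rerun the search towards the end of the review.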
Study selection is usually conducted in two stages: an initial screening of titles and abstracts against the inclusion criteria to identify potentially relevant papers followed by screening of the full papers identified as possibly relevant in the initial screening. The protocol should specify the process by which decisions on the selection of studies will be made. This should include the number of researchers who will screen titles and abstracts and then full papers, and the method for resolving disagreements about study eligibility. Section 1.3.2 Study selection contains more information.
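Where two researchers screen independently, the records on which they disagree need to be identified so that the method for resolving disagreements specified in the protocol can be applied. The following is a minimal sketch assuming exactly two reviewers who have both screened the same set of records; the function name `reconcile` and the input format (record id mapped to an include/exclude decision) are hypothetical.

```python
def reconcile(decisions_a, decisions_b):
    """Compare two reviewers' screening decisions.

    Each argument maps a record id to True (include) or False (exclude).
    Returns ids agreed for inclusion, agreed for exclusion, and in dispute.
    """
    if set(decisions_a) != set(decisions_b):
        raise ValueError("both reviewers must screen the same records")
    include, exclude, disputed = [], [], []
    for record_id in sorted(decisions_a):
        a, b = decisions_a[record_id], decisions_b[record_id]
        if a != b:
            disputed.append(record_id)   # goes to discussion or a third reviewer
        elif a:
            include.append(record_id)
        else:
            exclude.append(record_id)
    return include, exclude, disputed
```

The same function can be used at both stages: once for titles and abstracts, and again for the full papers carried forward from the first stage.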
The protocol should outline the information that will be extracted from studies identified for inclusion in the review and provide details of any software to be used for recording the data. As with study selection the protocol should state the procedure for data extraction including the number of researchers who will extract the data and how discrepancies will be resolved. The protocol should also specify whether authors of primary studies will be contacted to provide missing or additional data. If foreign language papers are to be included, it may be necessary to specify translation arrangements. Further information is given in Section 1.3.3 Data extraction.
The protocol should provide details of the method of study appraisal to be used, including examples of the specific quality criteria. Details of how the study appraisal is to be used should be specified, for example whether the results will inform sensitivity analyses. The protocol should also specify the process for conducting the appraisal of study quality, the number of researchers involved, and how disagreements will be resolved. For a detailed discussion of these issues see Section 1.3.4 Quality assessment.
As far as possible, the protocol should specify the strategy for data synthesis. It should state whether a meta-analysis is planned, although whether a planned meta-analysis will ultimately prove possible will depend on the studies and data that are available. As analyses will depend on what data are available, and because it is difficult to anticipate all of the statistical issues that may arise, it can be difficult to pre-specify full details of the planned synthesis. However, the protocol should outline how heterogeneity will be explored and quantified, under what circumstances a meta-analysis would be considered appropriate and whether a fixed or random-effects model or both would be used. Where appropriate, the approach to narrative synthesis should also be outlined. The protocol should also specify the outcomes of interest and what effect measures will be used. Any planned subgroup or sensitivity analyses or investigation of publication bias should also be described. Further information is given in Section 1.3.5 Data synthesis.
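To make the terms above concrete, the sketch below computes a fixed-effect estimate, two widely used heterogeneity statistics (Cochran's Q and I²) and a random-effects estimate. It assumes inverse-variance weighting and the DerSimonian-Laird estimator of between-study variance, which are common choices but are not specified by this guidance; the function name `meta_analysis` and the returned keys are hypothetical.

```python
import math

def meta_analysis(effects, variances):
    """Fixed- and random-effects pooling with Q, I-squared and tau-squared.

    effects: study effect estimates (e.g. log odds ratios).
    variances: their within-study variances (all > 0); needs >= 2 studies.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    # I-squared: percentage of variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    random_est = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    return {
        "fixed": fixed, "se_fixed": math.sqrt(1.0 / sum_w),
        "random": random_est, "se_random": math.sqrt(1.0 / sum_w_re),
        "Q": q, "df": df, "I2": i2, "tau2": tau2,
    }
```

When the estimated between-study variance is zero the two models give the same result; as it grows, the random-effects weights become more equal across studies and the standard error of the pooled estimate increases.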
Dissemination of findings is an integral part of the review process and fundamental to ensuring that the essential messages from the review reach the appropriate audiences. It is helpful to consider how the review findings will be disseminated from as early a stage as possible to allow adequate time for planning and development and to ensure that the proposed activities are properly resourced. Details are given in Section 1.3.8 Disseminating the findings of systematic reviews.
Some commissioning or funding bodies may require that they formally approve the protocol, and will provide input to the draft protocol, in addition to the other stakeholders, such as clinical and methodological experts, patient groups and service users, who may be consulted. For commissioned reviews, even where it is not a specific requirement, it can be useful to communicate with the commissioner at the protocol development stage. This will help to ensure that the protocol meets the commissioning brief or, where the review question or the scope of the project has been altered, that this is agreed before work commences.
Sticking rigidly to a protocol when it becomes apparent that a change of direction is required can result in a review that is not useful to end users. It is possible that consideration of the primary research may raise questions which were not anticipated at the protocol stage. Where this results from a clearer understanding of the review question, it can be appropriate to carry out documented and justified amendments to the protocol. In the report of the review findings it is helpful to distinguish between the initial review question and any subsequent amendments. It is never appropriate to modify the protocol because of awareness of the results of individual studies, as this is likely to introduce bias and affect the validity of the review’s conclusions.
Many reviews undergo protocol modification.30 Where modifications are a possibility, the implications for the review process and workload should be considered carefully. In particular, the likely impact on the literature search should be assessed as it may require modification and running again. Data extraction forms may also need to be amended, and any data that have already been extracted might require some re-working. Protocol amendments should be documented in a protocol addendum and in the final report of the review.
Summary: The review protocol