Comparative Effectiveness Review
Number 157

Diagnosis of Right Lower Quadrant Pain and Suspected Acute Appendicitis

Prepared for:
Agency for Healthcare Research and Quality
U.S. Department of Health and Human Services
5600 Fishers Lane
Rockville, MD 20857
www.ahrq.gov

Contract No. 290-2012-00012-I

Prepared by:
Brown Evidence-based Practice Center
Providence, RI

Investigators:
Issa J. Dahabreh, M.D., M.S.
Gaelen P. Adam, M.L.I.S.
Christopher W. Halladay, Sc.M.
Dale W. Steele, M.D., M.S.
Lori A. Daiello, Pharm.D., Sc.M.
L. Susan Wieland, M.P.H., Ph.D.
Anja Zgodic, Sc.M.
Bryant T. Smith, M.P.H., C.P.H.
Thaddeus W. Herliczek, M.D., M.S.
Nishit Shah, M.D.
Thomas A. Trikalinos, M.D.

AHRQ Publication No. 15(16)-EHC025-EF
December 2015

This report is based on research conducted by the Brown Evidence-based Practice Center (EPC) under contract to the Agency for Healthcare Research and Quality (AHRQ), Rockville, MD (Contract No. 290-2012-00012-I). The findings and conclusions in this document are those of the authors, who are responsible for its contents; the findings and conclusions do not necessarily represent the views of AHRQ. Therefore, no statement in this report should be construed as an official position of AHRQ or of the U.S. Department of Health and Human Services.

None of the investigators have any affiliations or financial involvement that conflicts with the material presented in this report.

The information in this report is intended to help health care decisionmakers (patients and clinicians, health system leaders, and policymakers, among others) make well-informed decisions and thereby improve the quality of health care services. This report is not intended to be a substitute for the application of clinical judgment.
Anyone who makes decisions concerning the provision of clinical care should consider this report in the same way as any medical reference and in conjunction with all other pertinent information, i.e., in the context of available resources and circumstances presented by individual patients.

This report is made available to the public under the terms of a licensing agreement between the author and the Agency for Healthcare Research and Quality. This report may be used and reprinted without permission, except for those copyrighted materials that are clearly noted in the report. Further reproduction of those copyrighted materials is prohibited without the express permission of copyright holders.

AHRQ or U.S. Department of Health and Human Services endorsement of any derivative products that may be developed from this report, such as clinical practice guidelines, other quality enhancement tools, or reimbursement or coverage policies, may not be stated or implied.

This report may periodically be assessed for the currency of conclusions. If an assessment is done, the resulting surveillance report describing the methodology and findings will be found on the Effective Health Care Program Web site at www.effectivehealthcare.ahrq.gov. Search on the title of the report.

Persons using assistive technology may not be able to fully access information in this report. For assistance contact [email protected].

Suggested citation: Dahabreh IJ, Adam GP, Halladay CW, Steele DW, Daiello LA, Wieland LS, Zgodic A, Smith BT, Herliczek TW, Shah N, Trikalinos TA. Diagnosis of Right Lower Quadrant Pain and Suspected Acute Appendicitis. Comparative Effectiveness Review No. 157. (Prepared by the Brown Evidence-based Practice Center under Contract No. 290-2012-00012-I.) AHRQ Publication No. 15(16)-EHC025-EF. Rockville, MD: Agency for Healthcare Research and Quality; December 2015. www.effectivehealthcare.ahrq.gov/reports/final.cfm.
Preface

The Agency for Healthcare Research and Quality (AHRQ), through its Evidence-based Practice Centers (EPCs), sponsors the development of systematic reviews to assist public- and private-sector organizations in their efforts to improve the quality of health care in the United States. These reviews provide comprehensive, science-based information on common, costly medical conditions, and new health care technologies and strategies.

Systematic reviews are the building blocks underlying evidence-based practice; they focus attention on the strength and limits of evidence from research studies about the effectiveness and safety of a clinical intervention. In the context of developing recommendations for practice, systematic reviews can help clarify whether assertions about the value of the intervention are based on strong evidence from clinical studies. For more information about AHRQ EPC systematic reviews, see www.effectivehealthcare.ahrq.gov/reference/purpose.cfm.

AHRQ expects that these systematic reviews will be helpful to health plans, providers, purchasers, government programs, and the health care system as a whole. Transparency and stakeholder input are essential to the Effective Health Care Program. Please visit the Web site (www.effectivehealthcare.ahrq.gov) to see draft research questions and reports or to join an email list to learn about new program products and opportunities for input.

If you have comments on this systematic review, they may be sent by mail to the Task Order Officer named below at: Agency for Healthcare Research and Quality, 5600 Fishers Lane, Rockville, MD 20857, or by email to [email protected].

Richard G. Kronick, Ph.D.
Director
Agency for Healthcare Research and Quality

Arlene S. Bierman, M.D., M.S.
Director
Center for Evidence and Practice Improvement
Agency for Healthcare Research and Quality

Stephanie Chang, M.D., M.P.H.
Director
Evidence-based Practice Center Program
Center for Evidence and Practice Improvement
Agency for Healthcare Research and Quality

Elisabeth U. Kato, M.D., M.R.P.
Task Order Officer
Center for Evidence and Practice Improvement
Agency for Healthcare Research and Quality

Key Informants

In designing the study questions, the EPC consulted several Key Informants who represent the end-users of research. The EPC sought Key Informant input on the priority areas for research and synthesis. Key Informants are not involved in the analysis of the evidence or the writing of the report. Therefore, in the end, study questions, design, methodological approaches, and/or conclusions do not necessarily represent the views of individual Key Informants.

Key Informants must disclose any financial conflicts of interest greater than $10,000 and any other relevant business or professional conflicts of interest. Because of their role as end-users, individuals with potential conflicts may be retained. The Task Order Officer (TOO) and the EPC work to balance, manage, or mitigate any conflicts of interest.

The list of Key Informants who provided input to this report follows:

Anthony Chow, M.D.
Division of Infectious Diseases
University of British Columbia
Vancouver Hospital
Vancouver, BC, Canada

Tyler Hughes, M.D.
McPherson Hospital
McPherson, KS

Douglas Katz, M.D., FACR
Vice Chairman of Research and Education
Winthrop Radiology Associates
Winthrop-University Hospital, Department of Radiology
Director of Body CT, Winthrop-University Hospital
Mineola, NY

Anupam Kharbanda, M.D., M.Sc.
Director of Research, Emergency Services
Associate PEM Fellowship Director
Department of Pediatric Emergency Medicine
Children’s Hospitals and Clinics of Minnesota
Minneapolis, MN

Susan Promes, M.D.
University of California San Francisco
School of Medicine
San Francisco, CA

David Rahn, M.D.
University of Texas Southwestern
Dallas, TX

Martin P. Smith, M.D.
Beth Israel Deaconess Medical Center
Harvard Medical School
Boston, MA

Daniel Dante Yeh, M.D.
Massachusetts General Hospital
Division of Trauma, Emergency Surgery and Surgical Critical Care
Boston, MA

Technical Expert Panel

In designing the study questions and methodology at the outset of this report, the EPC consulted several technical and content experts. Broad expertise and perspectives were sought. Divergent and conflicted opinions are common and perceived as healthy scientific discourse that results in a thoughtful, relevant systematic review. Therefore, in the end, study questions, design, methodologic approaches, and/or conclusions do not necessarily represent the views of individual technical and content experts.

Technical Experts must disclose any financial conflicts of interest greater than $10,000 and any other relevant business or professional conflicts of interest. Because of their unique clinical or content expertise, individuals with potential conflicts may be retained. The TOO and the EPC work to balance, manage, or mitigate any potential conflicts of interest identified.

The list of Technical Experts who provided input to this report follows:

Anthony Chow, M.D.
Division of Infectious Diseases
University of British Columbia
Vancouver Hospital
Vancouver, BC, Canada

Tyler Hughes, M.D.
McPherson Hospital
McPherson, KS

Douglas Katz, M.D., FACR
Vice Chairman of Research and Education
Winthrop Radiology Associates
Winthrop-University Hospital, Department of Radiology
Director of Body CT, Winthrop-University Hospital
Mineola, NY

Anupam Kharbanda, M.D., M.Sc.
Director of Research, Emergency Services
Associate PEM Fellowship Director
Department of Pediatric Emergency Medicine
Children’s Hospitals and Clinics of Minnesota
Minneapolis, MN

Susan Promes, M.D.
University of California San Francisco
School of Medicine
San Francisco, CA

David Rahn, M.D.
University of Texas Southwestern
Dallas, TX

Martin P. Smith, M.D.
Beth Israel Deaconess Medical Center
Harvard Medical School
Boston, MA

Daniel Dante Yeh, M.D.
Massachusetts General Hospital
Division of Trauma, Emergency Surgery and Surgical Critical Care
Boston, MA

Peer Reviewers

Prior to publication of the final evidence report, EPCs sought input from independent Peer Reviewers without financial conflicts of interest. However, the conclusions and synthesis of the scientific literature presented in this report do not necessarily represent the views of individual reviewers.

Peer Reviewers must disclose any financial conflicts of interest greater than $10,000 and any other relevant business or professional conflicts of interest. Because of their unique clinical or content expertise, individuals with potential nonfinancial conflicts may be retained. The TOO and the EPC work to balance, manage, or mitigate any potential nonfinancial conflicts of interest identified.

The list of Peer Reviewers follows:

Roland E. Andersson, M.D., Ph.D.
Linköping University
Linköping, Sweden

Richard Bachur, M.D.
Harvard Medical School
Boston, MA

Esther H. Chen, M.D.
University of California, San Francisco/San Francisco General Hospital
Department of Emergency Medicine
Associate Professor of Emergency Medicine
San Francisco, CA

Anthony Chow, M.D.
Division of Infectious Diseases
University of British Columbia
Vancouver Hospital
Vancouver, BC, Canada

Andrea S. Doria, M.D., Ph.D., M.Sc.
The Hospital for Sick Children
Toronto, ON, Canada

Sherif Emil, M.D., C.M.
Pediatric General and Thoracic Surgery
The Montreal Children's Hospital
McGill University Health Centre
Montreal, QC, Canada

Tyler Hughes, M.D.
McPherson Hospital
McPherson, KS

Anupam Kharbanda, M.D., M.Sc.
Director of Research, Emergency Services
Associate PEM Fellowship Director
Department of Pediatric Emergency Medicine
Children’s Hospitals and Clinics of Minnesota
Minneapolis, MN

Martin P. Smith, M.D.
Beth Israel Deaconess Medical Center
Harvard Medical School
Boston, MA

Daniel Dante Yeh, M.D.
Massachusetts General Hospital
Division of Trauma, Emergency Surgery and Surgical Critical Care
Boston, MA

Structured Abstract

Background. The reliable identification of patients with abdominal pain who need surgical intervention for acute appendicitis can improve clinical outcomes and reduce resource use. The test performance and impact on outcomes of alternative diagnostic strategies are unclear.

Study eligibility criteria. We searched PubMed®, Embase®, the Cochrane Central Register of Controlled Trials, and the Cumulative Index to Nursing and Allied Health Literature® to identify primary research studies meeting our criteria for cohort studies that reported information on test accuracy for the diagnosis of acute appendicitis or harms, and for comparative studies (randomized or nonrandomized) that reported information on patient-relevant outcomes and resource use (last search, August 6, 2014, for PubMed; August 12, 2014, for all other databases).

Study appraisal and synthesis methods. A single investigator extracted data from each study and a second investigator verified extracted data from comparative studies; we also extracted data in duplicate for a sample of noncomparative studies. We performed Bayesian meta-analyses to estimate summary test performance using random-effects models; data on other outcomes were synthesized qualitatively. We also assessed the strength and applicability of the evidence.

Results. Information on the test performance of diagnostic tests was available from 903 studies: clinical symptoms and signs (137 studies), laboratory tests (217 studies), imaging tests (519 studies), multivariable diagnostic scores (127 studies), and diagnostic laparoscopy (55 studies). Trials directly comparing diagnostic tests were too heterogeneous to support definitive conclusions; therefore, most of our results pertain to the test performance of individual tests.
Clinical symptoms and signs and laboratory tests had relatively low sensitivity and specificity when used in isolation. Their combination in multivariable scores performed somewhat better; however, the most studied scores were developed before the widespread use of imaging, thus lessening the applicability of their results to current practice. Computed tomography (CT) had high sensitivity (summary estimates ranging from 0.96 to 1) and specificity (0.91 to 1) in all populations of interest to this report; magnetic resonance imaging (MRI) had high sensitivity (0.94 to 1) but appeared to have variable specificity (0.86 to 1), mainly because of the smaller number of studies, which focused on its use for pregnant women. In adult populations, ultrasound (US) had lower sensitivity (0.85) and specificity (0.90) than CT and MRI, and produced more nondiagnostic scans. In children, the specificity of US was similar to that of CT (0.91 vs. 0.92), but CT had greater sensitivity than US (0.96 vs. 0.89); these results were based on a large number of studies (85 for US and 34 for CT). In the same patient population, MRI had a specificity of 0.96 and sensitivity of 0.97, but data were derived from only seven studies. Among pregnant women, CT, MRI, and US had similar specificity (0.91, 0.98, and 0.95, respectively), but CT and MRI had higher sensitivity than US (0.99, 0.98, and 0.72, respectively). Information on diagnostic test performance among the elderly was limited. Studies of test performance were deemed to be at moderate risk of bias, mostly because of concerns about differential and incomplete verification. Information on patient-relevant outcomes and resource use was available from a small number of trials with moderate risk of bias that assessed heterogeneous comparisons between various tests and nonrandomized studies that did not appropriately adjust for potential confounding factors.
Only a few studies reported information on harms, leading to concerns about selective outcome reporting. Therefore, no definitive conclusions could be drawn about patient-relevant outcomes or harms.

Limitations. Patient-level data were unavailable, and information about study- or population-level characteristics was too limited to allow the identification of modifiers of test performance, patient-centered outcomes, or harms. Studies reported adverse events incompletely and did not provide details of outcome ascertainment methods.

Conclusions. The literature on the diagnosis of acute appendicitis is large but consists almost exclusively of studies assessing the performance of individual tests. The evidence on individual tests indicates that imaging tests have adequate test performance, while clinical symptoms and signs and laboratory tests used in isolation have lower discriminatory capacity. The evidence is largely insufficient to support conclusions about comparative effectiveness for clinical outcomes because studies assessing more than two test strategies on the same population are few and have evaluated different test comparisons. More research is needed to evaluate the comparative performance and effectiveness of individual tests, test combinations, and integrated diagnostic algorithms; to identify potential modifiers; and to evaluate the impact of testing strategies on patient-relevant outcomes, resource use, and harms. Decision and simulation models using information from this review could inform the design of future studies and guide decisionmaking.

PROSPERO registration number: CRD42013006480.
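Illustrative note (not part of the original report): the summary sensitivities and specificities quoted in the abstract can be read more easily as likelihood ratios, using the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below is a minimal example under stated assumptions: the class name SummaryEstimate and the list PREGNANT_WOMEN are hypothetical names introduced here, the inputs are the point estimates reported above for pregnant women, and the calculation ignores the uncertainty around those estimates. It does not reproduce the Bayesian random-effects models used in the review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SummaryEstimate:
    """Point estimates of test performance (hypothetical helper type)."""

    name: str
    sensitivity: float
    specificity: float

    def positive_lr(self) -> float:
        # LR+ = sensitivity / (1 - specificity); unbounded when specificity is 1.
        false_positive_rate = 1.0 - self.specificity
        if false_positive_rate == 0.0:
            return float("inf")
        return self.sensitivity / false_positive_rate

    def negative_lr(self) -> float:
        # LR- = (1 - sensitivity) / specificity; unbounded when specificity is 0.
        if self.specificity == 0.0:
            return float("inf")
        return (1.0 - self.sensitivity) / self.specificity


# Summary point estimates quoted in the abstract for pregnant women.
PREGNANT_WOMEN = [
    SummaryEstimate("CT", sensitivity=0.99, specificity=0.91),
    SummaryEstimate("MRI", sensitivity=0.98, specificity=0.98),
    SummaryEstimate("US", sensitivity=0.72, specificity=0.95),
]

if __name__ == "__main__":
    for est in PREGNANT_WOMEN:
        print(f"{est.name}: LR+ = {est.positive_lr():.1f}, "
              f"LR- = {est.negative_lr():.2f}")
```

A larger LR+ means a positive result raises the odds of appendicitis more; an LR- closer to zero means a negative result lowers them more. Because these ratios are computed from point estimates only, they should be treated as rough guides rather than as results of the review.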
Contents

Executive Summary
Background
    Nature and Burden of the Condition
    Diagnosis of Suspected Acute Appendicitis
    Importance of Accurate Diagnosis and Impact on Outcomes
    Special Considerations for the Diagnosis of RLQ Pain/Acute Appendicitis
        Children
        Women of Reproductive Age
        Pregnant Women
        Frail and Elderly Individuals
    Rationale for Evidence Review
    Key Questions
Methods
    AHRQ Task Order Officer, Stakeholder Input, and Review Protocol
    Analytic Framework
    Inclusion and Exclusion Criteria
        Populations and Conditions of Interest
        Interventions
        Comparators (Index and Reference Standard Tests)
        Outcomes
        Timing
        Setting
        Study Design and Additional Criteria
    Literature Search and Abstract Screening
    Study Selection and Eligibility Criteria
    Data Abstraction and Management
    Assessment of the Risk of Bias of Individual Studies
    Evidence Synthesis
    Grading the Strength of Evidence
    Assessing Applicability
    Peer Review
Results
    Key Question 1. What is the performance of alternative diagnostic tests, alone or in combination, for patients with right lower quadrant (RLQ) pain and suspected appendicitis?
        Included Studies With Information on Test Performance
        Test Performance of Clinical Symptoms and Signs (in Isolation)
        Test Performance of Laboratory Tests
        Test Performance of Multivariable Diagnostic Scores
        Test Performance of Imaging Tests
        Classifiers and Computer-Aided Diagnosis
        Test Performance of Diagnostic Laparoscopy
        Comparative Assessments of Test Performance