Discrimination and Calibration of Clinical Prediction Models: Users’ Guides to the Medical Literature

Educational Objective
To understand various properties of clinical prediction models and how to use them in clinical practice.
1 Credit CME

Accurate information regarding prognosis is fundamental to optimal clinical care. The best approach to assess patient prognosis relies on prediction models that simultaneously consider a number of prognostic factors and provide an estimate of patients’ absolute risk of an event. Such prediction models should be characterized by adequately discriminating between patients who will have an event and those who will not and by adequate calibration ensuring accurate prediction of absolute risk. This Users’ Guide will help clinicians understand the available metrics for assessing discrimination, calibration, and the relative performance of different prediction models. This article complements existing Users’ Guides that address the development and validation of prediction models. Together, these guides will help clinicians to make optimal use of existing prediction models.
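As a rough illustration of the two properties named above, the following Python sketch computes one commonly used discrimination metric, the C statistic, as the proportion of event/non-event patient pairs in which the patient who had the event received the higher predicted risk. It also makes a crude calibration check by comparing the mean predicted risk with the observed event rate. The function names and the six-patient data set are hypothetical, chosen only for illustration. This is a minimal sketch under those assumptions, not the method of any particular published model.

```python
from itertools import product


def c_statistic(predicted_risks, outcomes):
    """Discrimination: share of event/non-event pairs ranked correctly.

    Ties in predicted risk count as half a concordant pair.
    """
    with_event = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    without_event = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    if not with_event or not without_event:
        raise ValueError("need at least one event and one non-event")

    score = 0.0
    for risk_event, risk_no_event in product(with_event, without_event):
        if risk_event > risk_no_event:
            score += 1.0
        elif risk_event == risk_no_event:
            score += 0.5
    return score / (len(with_event) * len(without_event))


def mean_predicted_vs_observed(predicted_risks, outcomes):
    """Crude calibration check: (mean predicted risk, observed event rate)."""
    n = len(outcomes)
    return sum(predicted_risks) / n, sum(outcomes) / n


# Hypothetical predicted risks and observed outcomes (1 = event) for 6 patients.
risks = [0.10, 0.20, 0.30, 0.40, 0.60, 0.80]
observed = [0, 0, 1, 0, 1, 1]

c = c_statistic(risks, observed)
mean_pred, event_rate = mean_predicted_vs_observed(risks, observed)
print(f"C statistic: {c:.2f}")
print(f"Mean predicted risk: {mean_pred:.2f}, observed event rate: {event_rate:.2f}")
```

In this made-up example the model ranks 8 of the 9 event/non-event pairs correctly, yet its average predicted risk (0.40) falls below the observed event rate (0.50). Good discrimination does not guarantee accurate absolute risk estimates, so the two properties need to be assessed separately.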

Article Information

Corresponding Author: Ana Carolina Alba, MD, PhD, Toronto General Hospital, 585 University Ave, 6EN-246, Toronto, ON M5G 2N2, Canada (carolina.alba@uhn.ca).

Accepted for Publication: August 15, 2017.

Author Contributions: Dr Alba had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Concept and design: Alba, Walsh, Hanna, Iorio, Devereaux, McGinn, Guyatt.

Acquisition, analysis, or interpretation of data: Alba, Agoritsas, Iorio.

Drafting of the manuscript: Alba, Agoritsas, McGinn.

Critical revision of the manuscript for important intellectual content: Alba, Walsh, Hanna, Iorio, Devereaux, Guyatt.

Obtained funding: Alba.

Administrative, technical, or material support: Alba, McGinn.

Supervision: Guyatt.

Conflict of Interest Disclosures: The authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Dr Devereaux reported receiving grant funding from Abbott Diagnostics, Boehringer Ingelheim, Covidien, Octapharma, Roche Diagnostics, and Stryker. No other disclosures were reported.
