Risk stratification of patients admitted to hospital with COVID-19 using the ISARIC WHO Clinical Characterisation Protocol: development and validation of the 4C Mortality Score

Our take —

This paper details a promising tool for risk stratification of patients hospitalized with COVID-19. The risk score includes 8 variables (age, sex, number of comorbidities, respiratory rate, peripheral oxygen saturation, Glasgow coma scale, urea, and C-reactive protein) that are often readily available in developed healthcare settings. The model performed well (better than 15 existing prediction tools) in development and validation cohorts, but will require validation in each new setting prior to clinical implementation.

Study design

Prospective Cohort

Study population and setting

This prospective cohort study details the development of a risk stratification tool to predict in-hospital mortality, named the 4C (Coronavirus Clinical Characterization Consortium) Mortality Score. The derivation cohort included 35,463 patients (32.2% mortality rate, median age 73, 42% female, 76% with at least one comorbidity) enrolled from February 6 to May 20, 2020, and the validation cohort included 22,361 patients (30.1% mortality, median age 76, 46% female, 77% with at least one comorbidity). Eligible participants were adult patients (18+ years) admitted to one of 260 participating hospitals in England, Scotland, and Wales who had a high likelihood of COVID-19 infection. Predictors were selected prior to model development based on three criteria: they had consistently been reported as clinically important in previous studies, they are commonly available at presentation, and they were measured on the day of hospital admission.

Summary of Main Findings

After rigorous model selection and coefficient estimation procedures, variables included in the final model (age group, sex, number of comorbidities, respiratory rate, peripheral oxygen saturation, Glasgow coma scale, urea, and C-reactive protein) were scaled to point values for the final prognostic index (the 4C Mortality Score). The 4C Mortality Score performed well in the derivation and validation cohorts, with good discrimination (the ability of a model to assign higher predicted risk to those who have the outcome vs. those who do not) evidenced by an area under the curve (AUC) of 0.79 and 0.77, respectively, along with near-perfect calibration (agreement between observed and predicted outcome risk) in both cohorts, and a low Brier score (average squared difference between observed and predicted outcomes; lower is better) of 0.17 in both cohorts. The authors defined 4 risk groups (low, intermediate, high, and very high) with mortality of 1.2%, 9.9%, 31.4% and 61.5% in the validation cohort, respectively. Within the validation cohort, the study also compared the 4C Mortality model to 15 previously published risk stratification scores, and demonstrated comparable, even slightly favorable, discrimination of this new model relative to all others.
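
For readers less familiar with these metrics, the following is a minimal Python sketch of how discrimination (AUC), the Brier score, and a crude calibration check can be computed from observed outcomes and predicted risks. It is not the authors' code: the function names are hypothetical and the six-patient dataset is invented purely for illustration, not drawn from the study.

```python
from itertools import product


def auc(outcomes, predicted):
    """Discrimination: probability that a patient who died was assigned a
    higher predicted risk than a patient who survived (ties count as 0.5)."""
    died = [p for y, p in zip(outcomes, predicted) if y == 1]
    survived = [p for y, p in zip(outcomes, predicted) if y == 0]
    if not died or not survived:
        raise ValueError("AUC needs at least one death and one survivor")
    wins = sum(
        1.0 if d > s else 0.5 if d == s else 0.0
        for d, s in product(died, survived)
    )
    return wins / (len(died) * len(survived))


def brier_score(outcomes, predicted):
    """Average squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(outcomes, predicted)) / len(outcomes)


def calibration_in_the_large(outcomes, predicted):
    """Crude calibration check: observed event rate vs. mean predicted risk."""
    n = len(outcomes)
    return sum(outcomes) / n, sum(predicted) / n


# Invented toy data (NOT from the study): 1 = died in hospital, 0 = survived.
toy_outcomes = [0, 0, 1, 0, 1, 1]
toy_predicted = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]

print("AUC:", round(auc(toy_outcomes, toy_predicted), 3))
print("Brier score:", round(brier_score(toy_outcomes, toy_predicted), 3))
print("Observed vs. mean predicted risk:",
      calibration_in_the_large(toy_outcomes, toy_predicted))
```

The pairwise AUC above is quadratic in the number of patients, which is fine for illustration; for a cohort the size of this study, a rank-based implementation from a statistics library would be the more practical choice.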

Study Strengths

This was a large study and data were prospectively collected. The model uses clinical data that are routinely collected in developed country healthcare settings at the time of hospital admission. The study adheres to the TRIPOD guidelines, which set a standard for transparent reporting of the development and validation of prediction models. Model performance was evaluated by several metrics, and the model performed well in the development and validation cohorts and fared well in comparison to previously established prediction tools. The use of a priori variable selection, penalized regression methods to prevent overfitting, and multiple imputation of missing data reflects methodologic rigor beyond that of previous studies. The discrimination (AUC) of the model was evaluated by sex and ethnic group.


Limitations

To account for the possibility that diagnostic tests may not be universally available, the enrollment criteria did not require a diagnostic test confirming infection with SARS-CoV-2. Categorization of continuous predictors may result in loss of information and decreased predictive performance, but also enables quicker application of the tool and is arguably more clinically useful. The performance of the model was not evaluated in country- or age-specific subgroups, which limits generalizability and underscores the need for the score to be validated in a new setting prior to clinical use. Additionally, some parameters in the score may not be available in resource-limited settings. Patients were followed for at least four weeks, but those without a defined outcome were considered alive. Predictive performance of the model may change with longer follow-up.

Value added

This is one of the highest quality and largest studies to develop a prediction model for COVID-19 mortality.

This review was posted on: 24 September 2020