Graph explanations

Explanation for the Graphical Display for the Outlier process

Outlier methodology

A random effects methodology is used to infer the outlier status of each hospital or surgeon. Further technical details are reported in a separate section. The results are displayed graphically using a forest plot.


Forest plot (see figure)

Vertical axis: Hospital or surgeon (GMC) identifier, followed in parentheses by the number of patients treated and the completeness of life-status tracking (i.e. the percentage of patients for whom survival data are available).

Horizontal axis: Percentage In-hospital Survival (Overall for UK, and Observed, Predicted, model-based Risk-Adjusted per hospital or surgeon).

More specifically, the following quantities are presented:

  1. Overall In-hospital Survival [dashed vertical line]: the overall proportion of patients who survive across all hospitals in the UK. In the attached graph it is around 98%.
  2. Observed Survival per hospital or surgeon [square]: The proportion of patients who survive in hospital after Cardiac Surgery for each hospital or surgeon.
  3. Predicted Survival per hospital or surgeon [cross]: The Predicted Survival, using the adjusted EuroSCORE model [1-4] to account for case-mix.  A high predicted survival (relative to overall UK survival) suggests that the hospital or surgeon performs Cardiac Surgery on relatively low-risk patients. 
  4. Survival probability (RE model) for outlier detection [full circle]: Survival for each hospital or surgeon, derived from a random effects model after accounting for case-mix. This estimate and the corresponding horizontal bar provide an indication of whether the hospital or surgeon is an ‘outlier’ after taking into account observed and predicted survival.

Quantities 1) and 2) do not require any statistical modelling. Calculating quantities 3) and 4) requires the adjusted EuroSCORE model [1-4] to predict the outcome and a random effects model to detect outliers.
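
As an illustration of quantities 1) to 3), the sketch below computes overall, observed and predicted survival from patient-level records. It is not the NICOR code: the `Patient` fields and the `survival_summary` function are hypothetical names, and taking predicted survival as the average of (1 - predicted mortality) is an assumption. The predicted mortality for each patient is assumed to have been produced already by a risk model such as the adjusted EuroSCORE.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    unit: str                   # hospital or surgeon identifier (hypothetical field)
    survived: bool              # in-hospital survival
    predicted_mortality: float  # risk-model probability of death, between 0 and 1


def survival_summary(patients):
    """Return overall survival and, per unit, (n, observed, predicted) survival."""
    overall = sum(p.survived for p in patients) / len(patients)

    # Group the records by hospital or surgeon.
    by_unit = {}
    for p in patients:
        by_unit.setdefault(p.unit, []).append(p)

    per_unit = {}
    for unit, group in by_unit.items():
        n = len(group)
        observed = sum(p.survived for p in group) / n
        # Assumption: predicted survival is the mean of (1 - predicted mortality).
        predicted = sum(1.0 - p.predicted_mortality for p in group) / n
        per_unit[unit] = (n, observed, predicted)
    return overall, per_unit


if __name__ == "__main__":
    # Made-up records, for illustration only.
    records = [
        Patient("A", True, 0.02),
        Patient("A", False, 0.10),
        Patient("B", True, 0.01),
        Patient("B", True, 0.03),
    ]
    overall, per_unit = survival_summary(records)
    print(overall, per_unit)
```

Quantity 4), the random effects estimate, cannot be obtained by simple averaging and needs a fitted model.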


Display of outliers

Hospitals (or surgeons) with outcomes within limits of acceptable variability are assumed to demonstrate ‘usual’ or ‘normal performance’, or performance ‘within expected limits’.  A hospital is said to be an outlier when its performance deviates from usual, normal performance.

Black full circles indicate hospitals/surgeons with normal performance.

Purple/Blue full circles indicate hospitals/surgeons with worse/better performance than normal at the 2 Standard Deviation (SD) level. A hospital/surgeon with lower risk-adjusted survival than usual at the 2SD level is called an ‘Alert’.

Red/Green full circles indicate hospitals/surgeons with worse/better performance than normal at the 3SD level. A hospital/surgeon with lower risk-adjusted survival than usual at the 3SD level is called an ‘Alarm’.  For the purposes of public reporting only 3SD outliers are displayed on the published forest plots.  (2SD outliers are notified by SCTS and NICOR, but their results are not displayed).

The confidence intervals for the Survival probability (RE model) for outlier detection (solid horizontal bars) indicate whether a hospital or surgeon is a potential outlier at a given significance level (typically either 2 or 3 standard deviations (SD)). For the 3SD significance level:

  • If the confidence interval for a hospital or surgeon crosses the vertical Overall Survival dashed line, then the performance of that hospital or surgeon does not deviate from normal performance.
  • If the confidence interval fails to cross the vertical Overall Survival dashed line, then the hospital or surgeon is either performing significantly better (Green), or significantly worse (Red) than normal. Such hospitals or surgeons are potential outliers at the 3 SD level.
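
The crossing rule above, together with the colour scheme, can be sketched as follows. This is an illustrative reading of the text rather than the published implementation: the function names are hypothetical, intervals are passed as (lower, upper) pairs on the survival scale, and treating an interval that only touches the dashed line as crossing it is an assumption.

```python
def outlier_status(ci_lower, ci_upper, overall):
    """Compare one interval with the overall survival line."""
    if ci_lower <= overall <= ci_upper:
        return "normal"  # interval crosses the dashed line
    # The interval lies entirely on one side of the line.
    return "better" if ci_lower > overall else "worse"


def forest_plot_colour(ci_2sd, ci_3sd, overall):
    """Map the 2SD and 3SD intervals for one unit to a circle colour."""
    status_3sd = outlier_status(ci_3sd[0], ci_3sd[1], overall)
    if status_3sd == "worse":
        return "red"     # 'Alarm'
    if status_3sd == "better":
        return "green"
    status_2sd = outlier_status(ci_2sd[0], ci_2sd[1], overall)
    if status_2sd == "worse":
        return "purple"  # 'Alert'
    if status_2sd == "better":
        return "blue"
    return "black"       # within expected limits


if __name__ == "__main__":
    # Made-up intervals: the 3SD interval crosses 0.98 but the 2SD one does not.
    print(forest_plot_colour((0.950, 0.975), (0.940, 0.985), 0.98))
```

The 3SD check is made first because a unit that is an outlier at the 3SD level is necessarily also outside the narrower 2SD interval.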

The length of each confidence interval relates to the estimate of the Survival probability (RE model) for each hospital or surgeon, after accounting for case-mix, relative to the variation in survival across all hospitals. The confidence intervals are not symmetric due to the inverse log-odds transformation. The length of the confidence interval shortens as survival approaches 100%.
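
The asymmetry can be seen in a small numerical sketch. It assumes the interval is built symmetrically on the log-odds scale (estimate plus or minus a number of standard errors) and then transformed back to a probability. The standard error used is a made-up value, not one taken from the NICOR model.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse log-odds (logistic) transformation."""
    return 1.0 / (1.0 + math.exp(-x))


def survival_interval(survival, se_log_odds, n_sd):
    """Return (lower, centre, upper) on the probability scale."""
    centre = logit(survival)
    lower = inv_logit(centre - n_sd * se_log_odds)
    upper = inv_logit(centre + n_sd * se_log_odds)
    return lower, survival, upper


if __name__ == "__main__":
    for s in (0.98, 0.995):
        lower, centre, upper = survival_interval(s, se_log_odds=0.3, n_sd=3)
        print(f"survival={centre:.3f}  lower arm={centre - lower:.4f}  "
              f"upper arm={upper - centre:.4f}")
```

For the same standard error on the log-odds scale, the lower arm of the interval is longer than the upper arm, and the whole interval is shorter at 99.5% survival than at 98%.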

Note: The validity of the outlier process relies on having an adequate number of patients per hospital or surgeon. Although there is no exact guideline for this figure, simulation studies suggest that for the settings considered (outcome prevalence of 2%) the minimum number of patients needed to ensure the validity of the tests is 200. Thus, hospitals with fewer than 200 patients will not have confidence intervals placed around their survival estimate and will not be assigned an outlier status.  For individual surgeons the minimum used is 100 cases (otherwise a large proportion of surgeons would be excluded from the outlier process).  The simulation studies show that this threshold still gives a valid result; however, the certainty of the outcome may be reduced.  This needs to be taken into account for any surgeon assigned Alert or Alarm outlier status.



  1. Roques F, Nashef SAM, Michel P, Gauducheau E, de Vincentiis C et al. Risk factors and outcome in European cardiac surgery: analysis of the EuroSCORE multinational database of 19030 patients. European Journal of Cardio-Thoracic Surgery, 1999, 15:816–22.
  2. Nashef SAM, Roques F, Michel P, Gauducheau E, Lemeshow S, Salamon R. European system for cardiac operative risk evaluation (EuroSCORE). European Journal of Cardio-Thoracic Surgery, 1999, 16:9–13.
  3. Gogbashian A, Sedrakyan A, Treasure T. EuroSCORE: a systematic review of international performance. European Journal of Cardio-Thoracic Surgery, 2004, 25:695–700.
  4. Hickey GL, Grant SW, Murphy GJ, Bhabra M, Pagano D, McAllister K et al. Dynamic trends in cardiac surgery: why the logistic EuroSCORE is no longer suitable for contemporary cardiac surgery and implications for future risk models. European Journal of Cardio-Thoracic Surgery, 2013, 43:1146–52.


Example Forest Plot



Random effects model (with some technical details)

A random effects model is an extension of a standard regression model that additionally allows for the fact that patients come from different hospitals (are ‘clustered’ within hospitals).

We use a (logistic) regression model that relates observed mortality to predicted mortality. Differences between these quantities (for each hospital) are captured by ‘random intercepts’.  Some degree of variation in these random intercepts is expected (e.g. due to natural variability and unmeasured hospital-level characteristics).  Our statistical testing aims to identify those random intercepts that are abnormally large or small. For presentation purposes these random intercepts are transformed to the probability scale.
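
One common way to write such a model is shown below. This is a sketch under assumptions, not necessarily the exact specification used: the notation is introduced here for illustration, and entering the predicted mortality on the log-odds scale is an assumption. Here Y_ij is the death indicator for patient i in hospital j, \hat{p}_ij is the mortality predicted by the risk model, and u_j is the random intercept for hospital j.

```latex
\operatorname{logit}\bigl\{\Pr(Y_{ij} = 1)\bigr\}
  = \beta_0 + \beta_1 \operatorname{logit}(\hat{p}_{ij}) + u_j,
\qquad u_j \sim N(0, \sigma_u^2)
```

A hospital whose observed mortality matches its predicted mortality has u_j close to zero. The variance \sigma_u^2 describes the variation between hospitals that is regarded as normal, and testing asks whether an estimated u_j is abnormally large or small relative to that variation.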


Review of the NICOR processes for detection of outliers



The National Institute for Cardiovascular Outcomes Research (NICOR) was formed in 2011, bringing together the six main national cardiovascular audits:

Audit | Professional Society(ies)
National Congenital Heart Disease Audit (NCHDA) | British Congenital Cardiac Association (BCCA); Society for Cardiothoracic Surgery in Great Britain and Ireland (SCTS)
Myocardial Ischaemia National Audit Project (MINAP) | British Cardiovascular Society (BCS)
National Audit of Percutaneous Coronary Intervention (NAPCI) | British Cardiovascular Intervention Society (BCIS)
National Adult Cardiac Surgery Audit (NACSA) | Society for Cardiothoracic Surgery in Great Britain and Ireland (SCTS)
National Heart Failure Audit (NHFA) | British Society for Heart Failure (BSH)
National Audit of Cardiac Rhythm Management (NACRM) | British Heart Rhythm Society (BHRS)

The audits are commissioned by the Healthcare Quality Improvement Partnership (HQIP) and funded by NHS England and GIG/Cymru NHS Wales and, for some audits, NHS Scotland. Funding for participation from Health and Social Care in Northern Ireland and the private sector is being sought.

The UK Transcatheter Aortic Valve Implantation (UK TAVI) Registry was developed in 2007/08 and is also managed by NICOR.

The individual sub-specialty (‘domain’) registries had been developed by the relevant professional societies and in the late 1990s/early 2000s, individual patient records were captured through the Central Cardiac Audit Database (CCAD). The database was incorporated into NICOR in 2011.

NICOR was initially hosted at University College London (UCL). Progress was made in the way that data were collected and analysed, annual reports were prepared and there was development of a number of feedback systems to the hospitals participating in the audit. The audits have many functions, including quality assurance and quality improvement within the health services but they also provide a historical review of the management of cardiovascular conditions over time, and the collated data are available for observational research. In 2017, NICOR moved and is now hosted by the Barts Health NHS Trust.


Risk models and comparative assessment of hospitals and individual operators

To assess how a hospital or even an individual operator is performing, one could simply compare raw outcomes (such as mortality following a procedure) with the national figures.  However, because case mix differs between centres and between operators, adjustment is needed to compare like with like and so give a more accurate assessment of comparative performance.  Risk models have been developed and published that are good at accounting for differences in case mix.  Examples include EuroSCORE and its iterations for cardiac surgery, a 30-day mortality model following PCI based on UK data, and a 30-day mortality model following TAVI.

These models have good calibration and discrimination when assessing overall outcomes of populations, but there are complexities when they are used to try to compare outcomes by centre or by operator, and particularly when they are used to try to find outlier performance. NICOR developed statistical methods for comparative outcome analysis working closely with the specialist societies and taking detailed advice from both Professor Sir David Spiegelhalter, University of Cambridge, and also from Professor Sir Nick Black, Professor of Health Services Research at the London School of Hygiene and Tropical Medicine. The SCTS led the way in publishing risk-adjusted outcomes for every cardiac surgeon in the UK.

In 2013, the then-NHS Medical Director, Sir Bruce Keogh (who had worked with Professor Ben Bridgewater and others on developing the SCTS programme) launched the Clinical Outcomes Publication (COP) programme, to be used for 10 specialties (now 24). This was an NHS England initiative, managed by HQIP. HQIP has provided additional guidance on the methodology [1,2].

As part of a governance review in 2015/16, NICOR was recommended to review the statistical processes being used for the detection of outliers. NICOR therefore invited the Department of Statistical Science at UCL, led by Professor Rumana Omar, to lead a statistical review of the methodology and the coding required for analysis. As of 2019 this work is on-going and has been led by Professor Omar, Dr Gareth Ambler, Senior Lecturer at the UCL DSS, and Dr Menelaos Pavlou, Lecturer at the UCL DSS.


Statistical methodology

Understanding variation in performance in clinical specialties is complex, and there is no one accepted standard methodology. The methods previously used were based on funnel plot analysis, where the observed outcomes were compared with expected outcomes, while accounting for case mix and random variation.  With small numbers of procedures the statistical variation in observed outcomes is greater than with large volumes, and this accounts for the funnel shape of the outlier boundaries when volume is plotted against outcome.
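
The funnel shape can be illustrated with a short sketch. It uses a normal approximation to the binomial distribution, which is an assumption made here for simplicity and not necessarily how the earlier limits were calculated. The function name and the volumes used are hypothetical.

```python
import math


def funnel_limits(expected_survival, n_patients, n_sd):
    """Approximate control limits for observed survival at a given volume."""
    # Binomial standard error of a proportion based on n_patients cases.
    se = math.sqrt(expected_survival * (1.0 - expected_survival) / n_patients)
    lower = max(0.0, expected_survival - n_sd * se)
    upper = min(1.0, expected_survival + n_sd * se)  # survival cannot exceed 100%
    return lower, upper


if __name__ == "__main__":
    for n in (100, 200, 500, 1000, 2000):
        lo2, up2 = funnel_limits(0.98, n, 2)
        lo3, up3 = funnel_limits(0.98, n, 3)
        print(f"n={n:5d}  2SD: {lo2:.4f}-{up2:.4f}  3SD: {lo3:.4f}-{up3:.4f}")
```

The limits narrow as the number of procedures grows, which produces the funnel when volume is plotted against outcome.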

There are several recognised limitations of this method.  These include ‘over-dispersion’, where the observed variation (and hence the scatter on the plot) is larger than would be expected from a binomial distribution.  There is also difficulty in making multiple comparisons: if you compare enough observations, you would expect to incorrectly identify an outlier by statistical chance.  Models also tend to drift with time so, for example, EuroSCORE started over-predicting risk soon after it was published.  There are also issues with clustering, where the difference between centres’ case mix will interact with the differences in operator outcomes between centres.
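
The multiple comparisons problem can be quantified with a small sketch. It assumes that every unit is truly performing normally, that the units are independent, and that each test statistic is normally distributed. The figure of 40 units is a made-up example, not the number of hospitals in any audit.

```python
import math


def two_sided_tail(n_sd):
    """Probability that a standard normal value lies beyond +/- n_sd."""
    return math.erfc(n_sd / math.sqrt(2.0))


def prob_any_false_outlier(n_units, n_sd):
    """Chance of flagging at least one unit when all perform normally."""
    alpha = two_sided_tail(n_sd)
    return 1.0 - (1.0 - alpha) ** n_units


if __name__ == "__main__":
    for n_sd in (2, 3):
        print(f"{n_sd}SD: per-unit false positive rate={two_sided_tail(n_sd):.4f}, "
              f"chance of at least one among 40 units="
              f"{prob_any_false_outlier(40, n_sd):.3f}")
```

Under these assumptions a false flag somewhere among the units is much more likely at the 2SD level than at the 3SD level.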

Many of these problems can be addressed by random effects modelling, a technique that has only become possible as computing power has increased in recent years.  These new methods have also been recommended by Prof David Spiegelhalter and others.  While random effects modelling successfully addresses several methodological issues, its results are not well suited to display in a familiar funnel plot, and so we are developing new ways to display the data that try to maintain some intuitive appreciation of the information without misleading the observer.

Having done a full literature search on the methodology, Dr Pavlou and colleagues have developed a statistical process to incorporate this methodology into the NICOR datasets. A review was made into the coding of the method into the programmes that run the analyses.  These methods have now been incorporated into the NICOR NACSA and NAPCI datasets to produce the COP results. The method will also be applied to all similar analyses where risk-adjusted outcomes will be assessed, whether at hospital- or individual operator-level.  In addition, this has been incorporated into the NICOR Standard Operating Procedure for detection of outliers [3].

Attached is a presentation by Dr Pavlou which covers some of the background and development of these methods and an example of the plots with explanatory notes is provided.

Mark de Belder, Chair, NCAP Operational and Methodology Group
Rodney Franklin, Clinical Lead for National Congenital Heart Disease Audit
Andrew Goodwin, Clinical Lead for National Adult Cardiac Surgery Audit
Peter Ludman, Clinical Lead for National Audit of Percutaneous Coronary Intervention
Theresa McDonagh, Clinical Lead for National Heart Failure Audit
Francis Murgatroyd, Clinical Lead for National Audit of Cardiac Rhythm Management
Uday Trivedi, Society for Cardiothoracic Surgery in Great Britain and Ireland
Clive Weston, Clinical Lead for Myocardial Ischaemia National Audit Project
Menelaos Pavlou, University College London Department of Statistical Science

August 2019



  1. Department of Health. ‘Detection and management of outliers’: guidance prepared by the National Clinical Audit Advisory Group. 2011.
  2. Detection and management of outliers for National Clinical Audits. Guidance prepared by National Clinical Audit Group/HQIP (2011). Updated by HQIP in consultation with CQC, NHS England, NAGCAE, NHS Improvement (May 2017).
  3. NICOR Standard Operating Policy: NCAP Outlier Policy, version 5, 24th June 2019


SOP Outlier Policy

A detailed explanation about the outlier policy can be found here: