There has been much ado about hospital death rates lately, much of it focused on the Mid Staffs hospitals, where consistently high apparent death rates were repeatedly brushed aside and ignored. The issue at stake was the validity – or not – of certain statistics produced by Sir Jar and his operatives at HI5, the Dr Foster Intelligence Unit, and one statistic in particular, the HSMR.
The HSMR, or Hospital Standardised Mortality Ratio to give it its full name, is said by Sir Jar to offer a useful marker of a hospital’s performance, by providing a single figure that summarises how many patients leave the hospital feet-first. High HSMRs suggest more stiffs than expected, low HSMRs fewer stiffs than expected, compared to national figures. Unfortunately for Sir Jar, the method he uses to determine HSMRs – quite apart from a myriad of other factors that might compromise validity – suffers from a flaw that severely restricts its application. While an isolated HSMR can be compared to the ‘big picture’ – in other words, the standard population from which it is derived – comparisons between hospitals, or even of the same hospital over time, are prone to errors, which can render the results at best meaningless, at worst misleading.
Let us, for example, imagine two neighbouring hospitals, one in Brighton and one in Worthing. Both care as well (or not) as each other for their patients – and so their age band specific death rates are the same. Brighton, however, has a young population, and Worthing an older one, such that we might expect more deaths in Worthing – in other words, Worthing’s crude, overall mortality would be greater – not because the care was inferior, but simply because older people are more likely to die than younger people. We might find, for example, the following identical age specific death rates from hip fractures in each hospital (only three age bands shown for simplicity):
Age band | Brighton patients | Brighton deaths | Brighton rate (%) | Worthing patients | Worthing deaths | Worthing rate (%) |
40-49 | 6000 | 120 | 2 | 4000 | 80 | 2 |
50-59 | 5000 | 200 | 4 | 5000 | 200 | 4 |
60-69 | 4000 | 240 | 6 | 6000 | 360 | 6 |
However, because Worthing’s population is older, there are more deaths overall (640), compared to Brighton (560). Worthing’s crude (overall) mortality is also higher (4.3%) than Brighton’s (3.7%). Worthing – which has identical age specific death rates to Brighton – is being penalised in the summary figures (total number of deaths and crude mortality) simply because it has an older population. Standardisation – which is a specific example of the more general technique of adjusting for factors that we believe might distort, or confound, to use the technical term, our results – is a technique that produces a summary figure that is weighted, band by band, to allow for differing numbers in each age band between the two groups. If – if – the technique works, the difference between Worthing and Brighton should disappear in our age standardised result. Although we are standardising for age, we can, if we wish to, standardise for any number of factors – from height to hair colour, and anything else in between – and the same principles will apply.
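For anyone who would rather check the sums than take them on trust, the crude figures can be reproduced in a few lines of Python. This is only a sketch: the counts are copied from the table, and the names (`patients`, `deaths`, `crude_rate`, `band_rates`) are made up for the illustration.

```python
# Counts copied from the table above, in age-band order 40-49, 50-59, 60-69.
# All names here are illustrative, not part of any official method.
patients = {
    "Brighton": [6000, 5000, 4000],
    "Worthing": [4000, 5000, 6000],
}
deaths = {
    "Brighton": [120, 200, 240],
    "Worthing": [80, 200, 360],
}


def band_rates(hospital):
    """Age-band-specific death rates, as percentages."""
    return [100 * d / p for d, p in zip(deaths[hospital], patients[hospital])]


def crude_rate(hospital):
    """Total deaths divided by total patients, as a percentage."""
    return 100 * sum(deaths[hospital]) / sum(patients[hospital])


for hospital in patients:
    print(hospital, sum(deaths[hospital]),
          round(crude_rate(hospital), 1), band_rates(hospital))
```

The band rates come out identical for both hospitals, while the totals and crude rates differ, which is exactly the distortion standardisation is meant to remove.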
As it happens, there are two ways we can standardise rates. The first, which is commoner (it does have some technical and practical advantages), and is indeed the method used by Sir Jar, is called indirect standardisation. Indirect standardisation involves taking national rates, and weighting them using our local population structure – such that age bands with more people contribute greater weight – before combining them into a single summary figure. The standard (reference) population we are going to use is the national population, so let’s add that data to our table:
Age band | Brighton patients | Brighton deaths | Brighton rate (%) | Worthing patients | Worthing deaths | Worthing rate (%) | National patients | National deaths | National rate (%) |
40-49 | 6000 | 120 | 2 | 4000 | 80 | 2 | 30,000 | 300 | 1 |
50-59 | 5000 | 200 | 4 | 5000 | 200 | 4 | 40,000 | 1200 | 3 |
60-69 | 4000 | 240 | 6 | 6000 | 360 | 6 | 40,000 | 2000 | 5 |
Total | 15000 | 560 | 3.7 | 15000 | 640 | 4.3 | 110,000 | 3500 | 3.18 |
At a glance, we can see that both Brighton and Worthing have higher age band specific and total death rates than our standard population (which means both will have HSMRs greater than 100). However, when we standardise for age, the apparent difference in death rates between the two hospitals should – should – disappear.
Indirect standardisation, recall, weights national rates using our local population structure. We therefore calculate our indirect HSMR by weighting (multiplying) our standard death rate (column 10) for each age band by the number of patients in each band (column 2 for Brighton, column 5 for Worthing) to arrive at our expected age specific deaths, and then add them together to get our total expected deaths:
Brighton expected deaths = (1% x 6000) + (3% x 5000) + (5% x 4000) = 410
Worthing expected deaths = (1% x 4000) + (3% x 5000) + (5% x 6000) = 490
We then calculate our standardised ratio – the HSMR – by dividing observed deaths by the expected number of deaths, and (conventionally) multiplying by 100:
Brighton hip fracture indirect HSMR = (560/410) x 100 = 137
Worthing hip fracture indirect HSMR = (640/490) x 100 = 131
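The same arithmetic, as a minimal Python sketch. The figures are those in the table; the function names `expected_deaths` and `indirect_hsmr` are invented for the illustration and are not anything Dr Foster publishes.

```python
# National (standard) death rates per age band, as fractions: 1%, 3%, 5%.
national_rate = [0.01, 0.03, 0.05]

# Local patient numbers and observed deaths, copied from the table.
patients = {
    "Brighton": [6000, 5000, 4000],
    "Worthing": [4000, 5000, 6000],
}
observed_deaths = {"Brighton": 560, "Worthing": 640}


def expected_deaths(hospital):
    """Apply the national rates to the hospital's own age structure."""
    return sum(r * n for r, n in zip(national_rate, patients[hospital]))


def indirect_hsmr(hospital):
    """Observed deaths over expected deaths, multiplied by 100."""
    return 100 * observed_deaths[hospital] / expected_deaths(hospital)


for hospital in patients:
    print(hospital, round(expected_deaths(hospital)),
          round(indirect_hsmr(hospital)))
```

Note that the expected deaths, the denominator, depend on each hospital's own patient numbers, so the two hospitals are divided by different yardsticks.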
Both HSMRs should of course be the same (the underlying age-specific mortality is the same) but they are not. Indirect standardisation – as used by Sir Jar and Dr Foster – has failed, even misled us: now Brighton, rather than Worthing, emerges as the black sheep, when both should appear equal.
How can this be? The answer lies in how we have done the weighting. Indirect standardisation, we saw, relies on weighting national rates according to our local population structure. Population structure naturally varies (Brighton has more younger people), and as a result each hospital applies its own unique weighting pattern to the national rates. Brighton, for example, because it has fewer old people, has down-weighted the older age bands, giving fewer expected deaths – which in turn, because the observed number of deaths is fixed, gives rise to a higher HSMR. We are not, in fact, comparing like with like, and that is what causes the error.
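One way to see the differing weights is to rewrite each HSMR as a weighted average of the band-by-band ratios of local to national rates, where the weights are each band's share of that hospital's expected deaths. This is simple algebra on observed/expected rather than anything taken from Sir Jar's own workings, and the names below are illustrative only.

```python
# Rates as fractions, in age-band order 40-49, 50-59, 60-69 (from the table).
national_rate = [0.01, 0.03, 0.05]
local_rate = [0.02, 0.04, 0.06]  # identical in both hospitals
patients = {
    "Brighton": [6000, 5000, 4000],
    "Worthing": [4000, 5000, 6000],
}

# Ratio of local to national rate in each band: the same for both hospitals.
band_ratio = [loc / nat for loc, nat in zip(local_rate, national_rate)]


def weights(hospital):
    """Each band's share of the hospital's expected deaths."""
    expected = [r * n for r, n in zip(national_rate, patients[hospital])]
    total = sum(expected)
    return [e / total for e in expected]


def hsmr_from_weights(hospital):
    """HSMR as a weighted average of the band ratios, times 100."""
    return 100 * sum(w * r for w, r in zip(weights(hospital), band_ratio))


for hospital in patients:
    print(hospital, [round(w, 2) for w in weights(hospital)],
          round(hsmr_from_weights(hospital)))
```

The band ratios are the same for both hospitals; only the weights differ. Brighton gives more weight to the youngest band, where the local rate is double the national one, and so ends up with the higher HSMR.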
Direct standardisation, however, avoids this error. It takes one single fixed reference (standard) population – which removes the variable weighting patterns – and uses it to weight our local death rates. To arrive at our directly standardised rates, we apply our local death rates (columns 4 and 7) in each band to our standard population numbers (column 8), add the results together, and then divide by the total standard population (to get a rate rather than a ratio this time):
Directly standardised hip fracture mortality rates:
Brighton = ((2% of 30,000) + (4% of 40,000) + (6% of 40,000))/110,000 = 4.18%
Worthing = ((2% of 30,000) + (4% of 40,000) + (6% of 40,000))/110,000 = 4.18%
Voilà! The distorting effect of the differing age band weights has been removed from the equations – and we get the correct results. Both hospitals, as they should, have the same age adjusted death rates.
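Direct standardisation is just as easy to sketch in Python. Again the figures come from the table, and the names (`standard_population`, `local_rate`, `direct_rate`) are assumptions made for the example.

```python
# Standard (national) patient numbers per age band, from the table.
standard_population = [30000, 40000, 40000]

# Local age-band-specific death rates as fractions: 2%, 4%, 6% in both hospitals.
local_rate = {
    "Brighton": [0.02, 0.04, 0.06],
    "Worthing": [0.02, 0.04, 0.06],
}


def direct_rate(hospital):
    """Local rates applied to the fixed standard population, as a percentage."""
    deaths_in_standard = sum(
        r * n for r, n in zip(local_rate[hospital], standard_population)
    )
    return 100 * deaths_in_standard / sum(standard_population)


for hospital in local_rate:
    print(hospital, round(direct_rate(hospital), 2))
```

Because both hospitals are weighted by the same fixed population, identical local rates can only ever produce identical standardised rates.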
All of which is to say: caveat lector when indirect HSMRs are in the air.