
Staggering Catastrophes

Posted by Dr No on 04 March 2014

As a doctor who has dabbled in epidemiology, Dr No is not unaware of the siren song of Greater minds, including epidemiology’s Einstein, have frothed at the prospect of the data orgy to be had, only to have it dawn that theirs was a premature cigar. Yet even when left staggering at the catastrophes revealed, a hard core group still want to happen, the idea being that if enough corks are inserted, then nothing will leak.

If only! Dr No remains persuaded that the call of is indeed the song of a siren balanced on dangerous rocks. However alluring the song, the rocks remain; many rocks, but four stand out as especially dangerous.

The first is the impairment of doctor-patient trust, with the consequence that patients will withhold important medical information, delay seeing a doctor or, even worse, not see a doctor at all, from fear that personal sensitive data will sooner or later end up somewhere the patient would prefer it did not. The marginal are those most at risk, but who can really be immune to a shiver of apprehension come the day when, as we talk to our doctor, we know we are also at the same time talking to a government computer?

Secondly, once the data is uploaded, it cannot be unloaded - short of taking the nuclear option - which places us at risk of the unknown unknowns of the future. We cannot know what future laws might be passed, or how those laws might make legitimate use of our personal sensitive data to disadvantage us. Imagine a world in which the DWP was allowed to tap into Or the CPS be hooked up to the data? Thanks – but no thanks.

Then there is the catastrophic error of blurring public service with commercial gain. The tinkle of the tills is alien to the world of public service and research; so alien, in fact, that it is toxic. Add in that the infamous outsourcing company ATOS with its DWP connections were chosen to operate the data hoovers, and the tinkle of the tills starts to turn into a scream of profit. If does go ahead, the data will be compromised – perhaps fatally – by absence of data from the many who, hearing the scream of profit, decided to opt out. Selection bias (the term for errors caused by selecting an unrepresentative sample) is pervasive, but knowingly using a data set one knows is self-selected at best risks bad science, at worst is unethical.
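To put rough numbers on the selection bias worry, here is a minimal sketch in Python. Every figure in it is hypothetical: the population size, the number of ill people and the opt-out rates are invented for illustration and come from no real data set, and the function name `observed_prevalence` is made up for this example. It shows only the arithmetic: if the ill opt out more readily than the well, what is left understates how common the illness is.

```python
def observed_prevalence(n_ill, n_well, opt_out_ill, opt_out_well):
    """Prevalence seen in the data set once some people have opted out."""
    ill_remaining = n_ill * (1 - opt_out_ill)
    well_remaining = n_well * (1 - opt_out_well)
    return ill_remaining / (ill_remaining + well_remaining)

# Hypothetical population: 1,000 ill and 9,000 well, so 10% are ill.
N_ILL, N_WELL = 1000, 9000
true_prevalence = N_ILL / (N_ILL + N_WELL)

# Assumption: half of the ill opt out, but only one in ten of the well.
biased = observed_prevalence(N_ILL, N_WELL, opt_out_ill=0.5, opt_out_well=0.1)

# Assumption: everyone opts out at the same rate, whatever their health.
unbiased = observed_prevalence(N_ILL, N_WELL, opt_out_ill=0.2, opt_out_well=0.2)

print(f"true prevalence:           {true_prevalence:.3f}")
print(f"after uneven opt-outs:     {biased:.3f}")
print(f"after even-handed opt-outs: {unbiased:.3f}")
```

With these invented figures the uneven opt-outs drag the apparent prevalence from 10% down to a little under 6%, while even-handed opt-outs leave it where it was. The harm comes from who opts out, not from how many.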

Lastly, there is the inevitable porosity of big data sets. Even if the competent were in charge, which clearly they are not given the ongoing comedy of errors, misinformation, PR disasters and general goonery now unfolding, sooner or later, by accident or design, some personally identifiable data is going to leak. Somewhere, sooner or later, some poor blighters will find themselves drowning in a stinking leak of their own personal data.

The Doomsday scenario? The government collects your data, flogs it to insurers and then destroys the NHS. All those confidential things you told your GP years ago come back to bite you – big-time. Surely that couldn’t happen. Or could it?


I have dabbled around epidemiology when considering the problem of whether it is, or will be in the future, a pseudo-science. The quality of today’s epidemiological research is very low and it all too often, so to speak, follows the money. As you say, is going to be biased, but it will also be very big, which means you can do just about anything you like with it. Confirmation bias will be rife as industries, insurers, academics and governments use computers to drill down into the data to find what they want to find. Geneticists (especially of the behavioural type) will be able to confirm their theories by identifying causal links in the white noise of the data. It may pass us by with nothing more than an increase in silly newspaper headlines and the ‘Today’ programme’s obsession with nutty surveys about how eating bananas can double your chances of having hippopotomonstrosesquipedaliophobia, or some such nonsense. On the other hand, we should perhaps remember that it was the growth of data and statistical methodology that gave rise to eugenics about a century ago. Pretty certain that the millions that back the human enhancement movement will be all over this pile of goods.

Keith - Dr No always wondered what Jimbo's diagnosis was - now he thinks he might know!

There seem to be two (perhaps more) problems these days with epidemiology. The first is when it strays from what it knows it can do (and what the limitations of the methods are) into numerology otherwise known in posh circles as modelling. The Doll/Peto smoking study was epidemiology doing what it can do, and doing it well; the Sheffield Alcohol Numerology Team ('SHANT') are a well known example of the latter. The other, as you rightly point out, is the oceanic fishing opportunities provided by huge data sets. Dr No was always taught to decide on the question and then get the data and do the study. Doing it the other way round (do lots of studies on huge data sets and then decide what the question is) is bad science - ironic really, that the normally sound hound of bad science is rather, though naturally one is confident that any research he might do with data would be strictly not of the pelagic variety.
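The fishing problem can be put in numbers with a short Python sketch. It rests on two assumptions that real data will not satisfy exactly: that each study is an independent test, and that each uses the conventional 5% significance threshold. The function name `chance_of_false_alarm` is made up for this example. Under those assumptions, the chance that at least one of n tests turns up a 'significant' result where there is really nothing to find is 1 - (1 - 0.05)^n.

```python
def chance_of_false_alarm(n_tests, alpha=0.05):
    """Probability that at least one of n independent tests is 'significant'
    at level alpha when there is no real effect anywhere."""
    return 1.0 - (1.0 - alpha) ** n_tests

# One pre-planned question versus a fishing trip of 20 or 100 questions.
for n in (1, 20, 100):
    print(f"{n:>3} tests: {chance_of_false_alarm(n):.1%} chance of a spurious finding")
```

On these assumptions one pre-planned test carries a 5% risk of a false alarm, twenty carry roughly 64%, and a hundred make a spurious 'finding' all but certain. That is the arithmetic behind deciding the question first and fetching the data second.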

The genetic/eugenic 'potential' has also occurred to Dr No - he hinted indirectly at it in a comment to the last post. It may even be that in time the (mis)use of genetic data via turns out to be one of the biggest monsters of them all.

Checking that SHANT is housed under a Public Health/Epidemiology umbrella (it is, but for a moment Dr No thought it might be under that other epic branch of numerology, economics), it occurred to him it might be interesting to see how many common doctors (as opposed to proper PhD doctors) there are on 'the team'. The answer, unless Dr No is mistaken, is none.

A while back there was a bit of a to-do in the Faculty of Public Health Medicine. Facing extinction, the then medical faculty opened its doors to all sorts of ologists so as to boost numbers. While a case can be made that public health is a broad church (a case Dr No would not disagree with), SHANT appear to have taken this process as far as it can go: no direct medical input. Instead, it appears to be a cabal of ologists, engineers, mathematicians and modellers, with the latter being of the computer not Airfix tendency.

The lack of a medical anchor may be one of the factors behind the possible trend in epidemiology towards flights of fancy powered by what-if spreadsheets. Epidemiologists would do well to stick to practical measures like taking handles off water pumps, instead of generating models which have as much chance of going anywhere as Dr No's boyhood Airfix Saturn V rocket does of reaching the moon.

Speaking as a PhD type doctor in the philosophy of IT and AI, I must say I can’t see why SHANT has so many modellers and other PhD types all supported by millions upon millions of funding. I have had a quick look at their ‘Model-based appraisal’ for minimum alcohol pricing in Scotland and was immediately struck by the use of the word ‘estimated’ (in about 25 pages of text it is used 131 times). These estimates and the results the computer produces are often expressed to two decimal places, which of course does not look like an estimate (computer models always look their best to two decimal places).

For sure this type of research is all ‘estimates’, but how many estimates can you put into a computer model before you start to think computer modelling is a waste of time and money? Not sure what kind of testing, if any, the model has had, and whether it has been tested against humans who are very good at working with guesstimates and removing pump handles.

I am reminded of something Margaret Masterman wrote in 1970:

Do we not see premature ‘normal-science’ (which is also called ‘phoney science’ and ‘pseudo-science’ by soured critics) setting in all round us in a nightmare manner, in the new sciences, especially where computers can be grandiosely used to give spurious impressions of genuine scientific efficiency? In the end phoney scientific normal-science lines collapse, or fail to yield any results, or topple, or evaporate - or so one hopes.

This soured critic is still keeping his fingers crossed.

Dr No has written on minimum alcohol pricing a number of times (a search for 'minimum alcohol pricing' via the search box above will find most of them). The fundamental flaw in SHANT's models, and so in its recommendations, is that they tend to 'prefer' that heavy drinkers are much more price elastic (numerology mumbo-jumbo for how much a consumer alters consumption according to price) than in reality they are. If hard core drinkers are relatively price inelastic (big price increases cause little reduction in drinking) then there is a worrying potential for blowback (Dr No doesn't mean a chunder after too much frothy lager) especially in less well off households. Money that would have gone on essentials/the kids will end up being diverted to booze. Not exactly a good idea - but, then, you employ monkeys, you get peanuts.
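The blowback worry can be sketched in a few lines of Python. Nothing here comes from SHANT's model: the weekly spend, the price rise and both elasticities are hypothetical figures chosen for illustration, the function name `spend_after_price_rise` is made up, and the sum uses the simplest linear approximation (percentage change in consumption equals elasticity times percentage change in price).

```python
def spend_after_price_rise(spend_before, price_rise, elasticity):
    """Spending on drink after a price rise, using a linear approximation:
    fractional change in quantity = elasticity * fractional change in price."""
    quantity_factor = 1.0 + elasticity * price_rise
    price_factor = 1.0 + price_rise
    return spend_before * price_factor * quantity_factor

WEEKLY_SPEND = 100.0  # hypothetical weekly spend on drink
PRICE_RISE = 0.20     # hypothetical 20% price increase

# Hypothetical inelastic drinker (elasticity -0.2): barely cuts back.
inelastic = spend_after_price_rise(WEEKLY_SPEND, PRICE_RISE, elasticity=-0.2)

# Hypothetical elastic drinker (elasticity -1.5): cuts back sharply.
elastic = spend_after_price_rise(WEEKLY_SPEND, PRICE_RISE, elasticity=-1.5)

print(f"inelastic drinker now spends {inelastic:.2f}")
print(f"elastic drinker now spends   {elastic:.2f}")
```

On these invented figures the inelastic drinker cuts consumption by about 4% and ends up spending roughly 15 more a week, money that has to come from somewhere else in the household budget. The elastic drinker spends about 16 less. Which of the two the heavy drinker really resembles is the whole argument.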

“Information about you and the care you receive is shared, in a SECURE system, by healthcare staff to support your treatment and care.”

Interesting then that dear Jeremy will unveil new laws to ensure that medical records are only released when it is clear that there are ‘health benefits’ rather than commercial concerns…so ‘it’ wasn’t that SECURE then?

Organisations that breach data protection laws – one strike and you’re off – whoopee-do – will of course be too late then – a breach is a breach – and undoable…

(I am going to bury my head in the sand...)