Today, we learned that swabs can take up to 10 days to return a test result, and test results can take up to 10 days to show up in official data (link to study in Lombardy; data from some other regions: link and link).
If we add the fact that in Italy swabs are administered almost exclusively to symptomatic patients, and that the incubation period is 3-14 days (to be conservative), official data could lag the real number of cases by up to 14 + 20 = 34 days (14 days of incubation plus 10 + 10 days from symptom onset to official data). One full month!
Of course, this is the worst-case scenario. In Lombardy, the average time for a result to show up in official data is 3.6 days. In the best-case scenario, in which a swab is administered to a symptomatic patient with the shortest incubation period and is processed immediately, the lag goes down to 3 + 0 + 4 = 7 days.
To sum it up so far: official data on case counts lags reality by 7 days at best and by more than a full month (34 days) at worst.
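For readers who want the decomposition spelled out, here is a small sketch using only the figures quoted above. The three stages and their numbers come from the text; pairing each stage's best and worst case (with the 3.6-day Lombardy average standing in for the best-case reporting time) is my own illustrative framing, not part of the study.

```python
# Lag between infection and appearance in official data, decomposed into
# the three stages described above. Figures (in days) are those quoted in the text.
incubation = (3, 14)        # infection -> symptoms (swabs are taken on symptomatic patients)
swab_to_result = (0, 10)    # swab -> test result
result_to_data = (3.6, 10)  # test result -> official statistics (3.6 = Lombardy average)

best = incubation[0] + swab_to_result[0] + result_to_data[0]    # 6.6, rounded to 7 above
worst = incubation[1] + swab_to_result[1] + result_to_data[1]   # 34
print(best, worst)  # 6.6 34
```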
The larger problem
The larger problem is that this delay is not constant across results. If all results were delayed by the same number of days, we would still be able to extract trends. Delayed ones, yes, but still trends. However, the delay is not constant: one test might have a 7-day delay and the next one a 31-day delay. Trends based on official data might be wrong.
One could argue that the delay could be “averaged” and, over large numbers, approximated as constant. The problem is that delays cluster. One province might process its swabs quickly while another falls behind. One week we might have timely tests, and the next week we might lack reagents and incur large delays.
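To make the point concrete, here is a minimal simulation sketch, mine and not drawn from any real data: true cases grow steadily every single day, but the reporting delay alternates between a fast regime and a slow regime (say, a hypothetical reagent shortage) week by week. The weekly totals by report date, which is all a dashboard sees, no longer grow steadily.

```python
# Illustrative sketch only: how clustered, non-constant reporting delays
# can distort an apparent trend. All numbers are made up.
import random

random.seed(0)

days = 28
true_new_cases = [100 + 20 * d for d in range(days)]  # steadily growing outbreak

# reported[d] = cases that appear in the official data on day d
reported = [0] * (days + 40)
for day, cases in enumerate(true_new_cases):
    week = day // 7
    for _ in range(cases):
        # Hypothetical clustered delays: even weeks are processed quickly,
        # odd weeks hit a (made-up) shortage and take far longer.
        delay = random.randint(2, 5) if week % 2 == 0 else random.randint(10, 20)
        reported[day + delay] += 1

# Weekly totals by date of report -- what an official dashboard would show.
for w in range(6):
    print(f"report week {w + 1}: {sum(reported[w * 7:(w + 1) * 7])} cases")
# The true weekly counts grow every week, yet the reported series can
# dip and spike purely because of how the delays cluster.
```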
These problems come on top of those I already mentioned regarding testing (link), notably that the number of tests performed does not equal the number of people tested: to be discharged from the hospital, people might need multiple tests to confirm their recovery.
And speaking of recoveries, it emerged that some Italian regions count as recovered people who tested negative, whereas others count people who were discharged from the hospital into home isolation and are still awaiting a confirmatory test (link).
Put all of this together and it’s a mess. I honestly don’t understand how people can still look at charts and trends without asking themselves how good the data is.
The habit of analyzing data without first validating it is an instance of the ludic fallacy (link): as in an exercise from a statistics textbook, we begin with the implicit assumption that the data is correct.
However, this is the real world and data cannot be assumed correct. Some investigative work is required. Making a chart without first validating the integrity of the data is a purely hedonic performance, an expression of scientism – the ritualistic display of competence which has become the backbone of those institutions whose members don’t know what they are talking about but are very good at not giving a damn about it.
The root problem is that when people do not understand a field well enough to discern competence, they have to resort to proxies to evaluate it: credentials, jargon, charts, and the ability to perform similar superficial rituals.
Unfortunately, school has long since given up teaching any skill whose exam cannot be passed through sheer imitation. Scientism is the dangerous result.
“Let’s use the data we have”
No. Making decisions on bad data is worse than making decisions on no data.
If you have no data, at least you either make the conservative choice or you realize you need to go collect data that can be relied upon. Making decisions on bad data, instead, is dangerous.
We already made enough dangerous choices in January, February, and March. Let’s play it safe in April. Let’s go dig up some good data, and let’s not use charts and models until we have it.
You might be interested in my other newsletter, unrelated to the pandemic.
Edit: compared to the version sent via email, I added one more link to the first paragraph. I also took the study’s data point of 20 days from symptom onset to appearance in official statistics, together with the 10 days from test result to incorporation in official statistics, and arrived, by subtraction, at the 10 days from symptom onset to test result.