At the end of November last year, some self-proclaimed experts on PCR compiled a ‘retraction paper’ in which they demand the retraction of the first scientific article describing a PCR protocol for detecting SARS-CoV-2, known as the Corman-Drosten test. I wrote about this in my previous post. Some members of this ‘consortium’, as they refer to themselves, have now teamed up with two others, Wouter Aukema and Simon Goddek, and written an article that attacks the PCR tests from another angle. The article, titled Bayes Lines Tool (BLT) – A SQL-script for analyzing diagnostic test results with an application to SARS-CoV-2-testing, is available as a pre-print, and they have even created a website dedicated to this project.
The article deals with the daily reports on tests and positive cases that many countries publish during the corona pandemic. As with any test, you have to consider the occurrence of false positives and false negatives, and the prevalence of whatever you are testing for. Those figures cannot be determined directly from the outcome of the tests; you need information from other sources for that.
Bayes Lines Tool
The authors of this article claim, however, that their tool, a quite simple algorithm, can back-solve “disease prevalence, test sensitivity, test specificity, and, therefore, true positive, false positive, true negative and false negative numbers, from official test outcome reports.” Based on the results of their calculations, they strongly suggest that the number of false positives might be huge and could even amount to the majority of the reported positive cases. If that were the case, it would of course cast serious doubt on large-scale testing as we do it now.
However, quite a few people have already pointed out several obvious flaws in their approach on Twitter. Some criticism has found its way to PubPeer as well. I will briefly explain what I see as the most fundamental flaw, which has to do with the mathematics behind it.
What Aukema and his co-authors try to do is find solutions for this ‘confusion matrix’, as they call it, when only the numbers in the last row are given.
| Actual infection status | Test result positive | Test result negative |
|---|---|---|
| Infected | True positive (TP) | False negative (FN) |
| Not infected | False positive (FP) | True negative (TN) |
| Number of tests reported | Reported positive cases (TP+FP) | |
The algorithm is a fairly unsophisticated search for combinations of specificity, sensitivity and prevalence that, when used to fill in the expected numbers of TP, FP, FN and TN, match the number of reported positives (TP+FP).
For instance, their somewhat simplified Excel version of the tool finds two ‘solutions’ for the reported figures of 242 positive cases out of 27,000 tests, from a specific subset of data in The Netherlands. This is the default example you see when you download the file.
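The search described above can be sketched as a brute-force grid scan. The following is a minimal, hypothetical Python reconstruction (the actual tool is a SQL script; the parameter grids and the exact-match tolerance here are my own assumptions):

```python
# Hypothetical sketch of the kind of grid search the paper describes:
# scan combinations of prevalence, sensitivity and specificity and keep
# those whose expected TP+FP matches the reported positive count.

def grid_search(tests, reported_positives, tolerance=0):
    """Return (prevalence, sensitivity, specificity, TP, FP) combinations
    whose expected number of positives matches the reported count."""
    solutions = []
    for prev_step in range(1, 1001):               # prevalence 0.01% .. 10.00%
        prevalence = prev_step / 10000
        for sens_step in range(700, 1001, 25):     # sensitivity 70% .. 100%
            sensitivity = sens_step / 1000
            for spec_step in range(9900, 10001, 5):  # specificity 99.00% .. 100.00%
                specificity = spec_step / 10000
                infected = tests * prevalence
                tp = infected * sensitivity                  # expected true positives
                fp = (tests - infected) * (1 - specificity)  # expected false positives
                if abs(round(tp + fp) - reported_positives) <= tolerance:
                    solutions.append((prevalence, sensitivity, specificity,
                                      round(tp), round(fp)))
    return solutions

hits = grid_search(27000, 242)
```

Even this crude scan returns many parameter combinations whose expected positives round to exactly 242, which already hints at the non-uniqueness problem discussed below.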
This might look meaningful or impressive at a superficial glance (and maybe shocking when you consider the number of false positives). The numbers are correct in the sense that they add up to the reported numbers. But the solutions found this way are not really better than many other combinations that the selection procedure disregards.
Also, combinations that give an expected number of positive tests slightly off the observed value can still be the accurate real-world values of prevalence, specificity and sensitivity, because we are dealing with a stochastic process. If you tested the same number of people sampled from this population on the same day (i.e. with the same prevalence), you might find a different number of positive cases just by chance.
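To illustrate that point, here is a small simulation of my own (not part of the paper): with prevalence, sensitivity and specificity held completely fixed, the number of positives still fluctuates from one simulated testing day to the next.

```python
# Simulate repeated testing days with fixed prevalence, sensitivity and
# specificity, and watch the positive count vary purely by chance.
import random

random.seed(1)

def simulate_positives(tests, prevalence, sensitivity, specificity):
    """One simulated day: each person is infected with probability
    `prevalence`; the test result is then a biased coin flip."""
    positives = 0
    for _ in range(tests):
        if random.random() < prevalence:                 # infected
            positives += random.random() < sensitivity   # true positive
        else:                                            # not infected
            positives += random.random() > specificity   # false positive
    return positives

counts = [simulate_positives(27000, 0.0105, 0.80, 0.9995) for _ in range(20)]
```

The 20 simulated counts scatter around the expected value (roughly 240), so observing, say, 242 on one particular day is perfectly compatible with many nearby parameter combinations.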
To make that more concrete, let’s go back to the example. Let’s assume for a second that the Bayes Lines Tool gives an accurate figure for specificity here: 99.5%. That doesn’t look too far off from actual estimates, but let’s see whether the specificity could actually be 99.95% for the test used here. We keep sensitivity at 80% and assume a prevalence of 1.05%. If you fill in the confusion matrix and round the figures to whole numbers, you’ll find this:
| Actual infection status | Test result positive | Test result negative |
|---|---|---|
| Infected | TP: 227 | FN: 57 |
| Not infected | FP: 13 | TN: 26,703 |
| Tests: 27,000 | Positives: 240 | |
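The entries in this matrix follow directly from the assumed parameters. A quick arithmetic check (my own, in Python):

```python
# Expected confusion-matrix entries for the assumed parameters:
# prevalence 1.05%, sensitivity 80%, specificity 99.95%.
tests = 27000
prevalence = 0.0105
sensitivity = 0.80
specificity = 0.9995

infected = tests * prevalence                 # 283.5 expected infected
tp = infected * sensitivity                   # expected true positives
fn = infected - tp                            # expected false negatives
fp = (tests - infected) * (1 - specificity)   # expected false positives
tn = (tests - infected) - fp                  # expected true negatives

print(round(tp), round(fn), round(fp), round(tn), round(tp + fp))
# prints: 227 57 13 26703 240
```

Note that the unrounded expected number of positives is about 240.16, not a whole number at all, which matters for the argument below.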
Is there any argument to dismiss this chosen set of values for prevalence, specificity and sensitivity? You could point out that the total number of positives in this ‘confusion matrix’ doesn’t match the actual reported number (242), but is that a valid reason to prefer the combinations found by the tool? No. The numbers we calculated for this matrix are expected values of stochastic variables. The total number of positives you will find when doing this repeatedly with the chosen values for prevalence, specificity and sensitivity will vary; only the average will tend to get closer to 240 the more often you repeat the process. An actual result of 242 in one such trial is entirely plausible.
The Bayes Lines Tool just finds a finite number of solutions out of the infinite number that could explain the observed values. The only ‘special’ property of the set it finds is that the expected values in the confusion matrices are an (almost) exact match to the reported values.
Expected values vs observed values
I also argued that fitting the expected value to the observed value doesn’t really make sense in general. Take an ordinary die as an example: when you roll it once, the expected number of pips is 3.5, but no side shows that number of pips. Likewise, the true expected number of positive tests (depending on the actual specificity, sensitivity and prevalence) will generally not be a whole number, and therefore cannot show up exactly as the reported number of positive cases.
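The die example fits in two lines of Python (my own illustration):

```python
# The expected value of one die roll is 3.5 -- a value no single roll
# can ever produce, just as the expected positive count need not be
# any whole number that could actually be reported.
faces = [1, 2, 3, 4, 5, 6]
expected = sum(faces) / len(faces)   # 3.5
assert expected not in faces
```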
I pointed this issue out to Aukema (who I think is the one who came up with the idea for this tool and programmed it) with this tweet:
No counter arguments?
Aukema reacted in a way that suggested he was willing to discuss this in a fair and open manner, but six days later he has yet to even start commenting on my main argument, or on the arguments of the other critics. On PubPeer he stated this about my arguments:
> As a co-author of this paper, I therefore choose to not respond in defence to the subjective, one-sided claims pasted in above mentioned comment-entry.

— Aukema on PubPeer (link to full comment)
Meanwhile, two of his co-authors, Simon Goddek and Bobby Malhotra, terrorize critics like me on Twitter with insults and have failed to give any substantial arguments that address the flaws we have identified. Aukema has yet to comment on their indecent behaviour.
It seems quite clear that the consortium has not yet found any valid arguments that would save their article from the criticism that I and others have given. Some of them even try to dismiss all criticism as part of a ‘heavy smear campaign’.
As I assumed the consortium would most likely try to frame this post as part of such a campaign too, I made a ‘Bunker scene meme‘ to have some fun at their expense. The video was embedded here, but I’ve removed it now as it has served its purpose.