At the end of November last year, some self-proclaimed PCR experts compiled a ‘retraction paper’ in which they demand the retraction of the first scientific article describing a PCR protocol for detecting SARS-CoV-2, known as the Corman-Drosten test. I’ve written about this in my previous post. Some members of this ‘consortium’, as they refer to themselves, have now teamed up with two others, Wouter Aukema and Simon Goddek, and written an article that attacks the PCR tests from another angle. The article, titled Bayes Lines Tool (BLT) – A SQL-script for analyzing diagnostic test results with an application to SARS-CoV-2-testing, is available as a pre-print, and they’ve even created a website dedicated to the project.
The article deals with the daily reports on tests and positive cases that are published in many countries during the corona pandemic. As with any test, you have to consider the occurrence of false positives and false negatives, and the prevalence of the condition you’re testing for. Those figures can’t be determined directly from the outcomes of the tests; you need information from other sources for that.
Bayes Lines Tool
The authors of this article claim, however, that their tool, a quite simple algorithm, can back-solve “disease prevalence, test sensitivity, test specificity, and, therefore, true positive, false positive, true negative and false negative numbers, from official test outcome reports.” With the results of their calculations, they strongly suggest that the number of false positives might actually be huge and might even amount to the majority of the reported positive cases. If that were the case, it would of course cast serious doubt on testing on a large scale, as we do now.
However, quite a few people have already pointed out several obvious flaws in their approach on Twitter. Some criticism has found its way to PubPeer as well. I will briefly explain what I see as the most fundamental flaw, which has to do with the mathematics behind the tool.
What Aukema and his co-authors try to do is find solutions for this ‘confusion matrix’, as they call it, when only the numbers in the last row are given.
| Actual infection status | Test result positive | Test result negative |
|---|---|---|
| Infected | True positive (TP) | False negative (FN) |
| Not infected | False positive (FP) | True negative (TN) |
| Number of tests reported | Reported positive cases (TP+FP) | |
The algorithm is a fairly unsophisticated search for combinations of specificity, sensitivity and prevalence that, when used to fill in the expected numbers of TP, FP, FN and TN, match the number of reported positives (TP+FP).
For instance, their somewhat simplified version of the tool in Excel finds two ‘solutions’ for the reported figures of 242 positive cases from 27,000 tests from a specific subset in The Netherlands. This is the default example you see when you download the file.
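To make the idea concrete, the kind of search the tool performs can be sketched in a few lines of Python. This is my own minimal reconstruction of the approach, not the authors’ actual SQL or Excel code; the grid ranges and step sizes are arbitrary assumptions:

```python
# Minimal reconstruction of a BLT-style brute-force search (my sketch,
# not the authors' actual SQL/Excel code; grid ranges are arbitrary).
tests, reported_pos = 27_000, 242

def expected_positives(prev, sens, spec, n):
    """Expected TP + FP for given prevalence, sensitivity, specificity."""
    infected = prev * n
    return sens * infected + (1 - spec) * (n - infected)

matches = []
for p in range(1, 21):                # prevalence 0.1% .. 2.0%
    prev = p / 1000
    for s in range(60, 96):           # sensitivity 60% .. 95%
        sens = s / 100
        for q in range(9900, 10000):  # specificity 99.00% .. 99.99%
            spec = q / 10_000
            if round(expected_positives(prev, sens, spec, tests)) == reported_pos:
                matches.append((prev, sens, spec))

print(len(matches), "combinations reproduce the reported 242 positives")
```

Even this coarse grid yields a long list of combinations whose expected number of positives rounds to exactly 242, which already hints at the problem with treating any of them as special.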
This might look meaningful or impressive when looked at superficially (and maybe shocking when you consider the number of false positives). The numbers are correct in the sense that they add up to the reported numbers. But the solutions found in this way are not really better than many other combinations that are disregarded by the selection procedure.
Also, combinations that would give an expected value for the number of positive tests that is a little off the observed value can still be the true values (the real-world prevalence, specificity and sensitivity), because we’re dealing with a stochastic process. If you tested the same number of people sampled from this population on the same day (i.e. with the same prevalence), you might find a different number of positive cases just by chance.
To make that more concrete, let’s go back to the example. Let’s assume for a second that the Bayes Lines Tool gives the accurate figure for specificity here, so 99.5%. That doesn’t look too far off from actual estimates, but let’s see whether the specificity could actually be 99.95% for the test used here. We keep sensitivity at 80% and assume a prevalence of 1.05%. If you fill in the confusion matrix and round the figures to whole numbers, you’ll find this:
| Actual infection status | Test result positive | Test result negative |
|---|---|---|
| Infected | TP: 227 | FN: 57 |
| Not infected | FP: 13 | TN: 26,703 |
| Tests: 27,000 | Positives: 240 | |
Is there any argument to dismiss this chosen set of values for prevalence, specificity and sensitivity? You could point out that the total number of positives in this ‘confusion matrix’ doesn’t match the actual reported number (242), but is that a valid reason to prefer the combinations found by the tool? No. The numbers we calculated for this matrix are expected values of stochastic variables. The total number of positives you will find when doing this repeatedly with the chosen values for prevalence, specificity and sensitivity will vary; only the average will tend to get closer to 240 the more often you repeat the process. 242 as an actual result in one such trial is entirely feasible.
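To put a number on that feasibility, one can compute the binomial probability of observing exactly 242 positives under the parameter set above. A quick sketch (a log-space binomial pmf to avoid overflow at n = 27,000; the parameter values are the ones assumed in the example):

```python
from math import lgamma, log, exp

# Assumed "true" values from the example above.
n = 27_000
prev, sens, spec = 0.0105, 0.80, 0.9995

# Probability that any single tested person tests positive.
p_pos = sens * prev + (1 - spec) * (1 - prev)

def binom_pmf(k):
    """Binomial pmf via log-gamma, to avoid overflow for large n."""
    log_p = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
             + k * log(p_pos) + (n - k) * log(1 - p_pos))
    return exp(log_p)

print(f"expected positives: {n * p_pos:.1f}")    # ~240.2
print(f"P(exactly 240) = {binom_pmf(240):.3f}")  # ~0.026
print(f"P(exactly 242) = {binom_pmf(242):.3f}")  # ~0.026
```

Observing 242 positives turns out to be essentially as likely as observing the ‘expected’ 240, so the two-count difference carries no evidential weight at all.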
This Bayes Lines Tool just finds a finite number of solutions out of the infinite number that could explain the observed values. The only ‘special’ property of the sets it finds is that the expected values in the confusion matrices are an (almost) exact match to the reported values.
Expected values vs observed values
I also argued that fitting the expected value to the observed value doesn’t really make sense in general. I gave the ordinary die as an example: when you roll it once, the expected number of pips is 3.5, but there is no side with that number of pips. Likewise, the true expected number of positive tests (which depends on the actual specificity, sensitivity and prevalence) might not even be a whole number, and would therefore never show up as a reported number of positive cases.
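A quick check with illustrative values close to the example (not values taken from the paper) makes the point:

```python
# The expected number of positive tests is generally not a whole number.
# Illustrative values, not taken from the paper: 27,000 tests,
# prevalence 0.5%, sensitivity 80%, specificity 99.5%.
n, prev, sens, spec = 27_000, 0.005, 0.80, 0.995
expected = n * (sens * prev + (1 - spec) * (1 - prev))
print(expected)   # ~242.3 -- close to 242, but never an observable count
```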
I pointed this issue out to Aukema (who I think is the one who came up with the idea for this tool and programmed it) with this tweet:
No counter arguments?
Aukema reacted in a way that suggested that he was willing to discuss this in a fair and open manner, but he has so far (6 days later) failed to even start to comment on my main argument, or on the arguments of the other critics. On PubPeer he stated this about my arguments:
As a co-author of this paper, I therefore choose to not respond in defence to the subjective, one-sided claims pasted in above mentioned comment-entry.
— Aukema on PubPeer (link to full comment)
Meanwhile, two of his co-authors, Simon Goddek and Bobby Malhotra, terrorize critics like me on Twitter with insults and have failed to give any substantial arguments that deal with the flaws that we have identified. Aukema has yet to comment on their indecent behaviour.
It seems quite clear that the consortium has not yet found any valid arguments that would save their article from the criticism that I and others have given. Some of them even try to dismiss all criticism as part of a ‘heavy smear campaign’.
As I assumed the consortium would most likely try to frame this post as part of such a campaign too, I made a ‘Bunker scene meme’ to have some fun at their expense. The video was embedded here, but I’ve removed it now as it has served its purpose.
11 thoughts to “Bayes Lines Tool is just another flawed attempt by the consortium to discredit PCR tests”
The Twitter account @goddeksineal does not exist anymore. The Internet Archive has an archived copy of the URL in my comment of 8 July 2021. See also https://twitter.com/pjvanerp/status/1413106390508376065 (in Dutch).
On 7 July 2021, Twitter suspended the main Twitter account of Simon Goddek, one of the authors of “Bayes Lines Tool (BLT): a SQL-script for analyzing diagnostic test results with an application to SARS-CoV-2-testing”. Visitors to the account @goddeketal currently get the message: “Account suspended. Twitter suspends accounts which violate the Twitter Rules.” See https://twitter.com/goddeksineal/status/1412873696029650946 for some background.
Wouter Aukema and Rainer Klement, two of the authors of “Bayes Lines Tool (BLT): a SQL-script for analyzing diagnostic test results with an application to SARS-CoV-2-testing”, have published together with pseudoscientist Harald Walach a new article in the MDPI journal ‘Vaccines’ (“The Safety of COVID-19 Vaccinations—We Should Rethink the Policy”).
This new article was published on 24 June 2021. It was retracted on 2 July 2021 because it “contained several errors that fundamentally affect the interpretation of the findings.” The retraction note states that Wouter Aukema and Rainer Klement (and also Harald Walach) did not agree with this retraction.
See https://www.pepijnvanerp.nl/2021/06/someone-fucked-up-badly-at-mdpi-vaccines/ for backgrounds.
This article was published on 10 May 2021 at F1000, see https://f1000research.com/articles/10-369 It has thus taken over three and a half months to pass the “rapid initial check by the in-house editorial team”.
This article was posted on 23 January 2021 as a preprint at Zenodo. The authors state at Zenodo that they submitted this preprint to F1000 on that same day. This preprint is still not listed at https://f1000research.com/browse/articles Does this imply that this preprint did not pass the “rapid initial check by the in-house editorial team” of F1000?
Copy/pasted from https://f1000research.com/about : “Publication & Data Deposition. Once the authors have finalised the manuscript, the article is published within a week, enabling immediate viewing and citation. (…) Article submissions to F1000Research undergo a rapid initial check by the in-house editorial team before being published with the status ‘Awaiting Peer Review’. There is no Editor (or Editor-in-Chief) to make a decision on whether to accept or reject the article, or to oversee the peer-review process. (…) If a submission fails the initial checks it will be returned to the authors to address the issues, and if they are not resolved satisfactorily the article will not be accepted.”
See https://web.archive.org/web/20210212005541/https://zenodo.org/record/4459271 for an archived copy of this preprint (archived on 12 February 2021).
Anyone any idea?
This preprint was submitted in the second half of January 2021 to F1000 Research. In my opinion, there are issues regarding inaccurate information about the affiliation(s) of the authors and issues regarding undisclosed competing interests of the authors. Below is a preliminary overview.
The authors state in their preprint: “8. Competing Interests. All authors declare no competing interest”.
They have listed the following information about their affiliation / backgrounds:
“Wouter Aukema*, Independent Data and Pattern Scientist, Brinkenbergweg 1, 7351BD, Hoenderloo, The Netherlands; Email: [redacted] Phone: [redacted]
Ulrike Kämmerer, Department OB/Gyn, University Hospital of Würzburg, Josef-Schneider-Str.4, D-97080 Würzburg, Germany; Email: [redacted] ORCID ID: https://orcid.org/0000-0002-2311-6984
Pieter Borger, The Independent Research Initiative on Information & Origins, 79540 Loerrach, Germany; Email: [redacted]
Simon Goddek, Independent Scientist, Elias Beeckmanlaan 242, 6711 VS, Ede, The Netherlands; Email: [redacted]
Bobby Rajesh Malhotra, Department for Digital Arts, University for Applied Arts Vienna, Expositur Hintere Zollamtsstraße 17, 1030, Vienna, Austria; Email: [redacted]
Kevin McKernan, Chief Scientific Officer, Medicinal Genomics, 100 Cummings Center, 406L, Beverly MA 01915, USA Email: [redacted] ORCID ID: https://orcid.org/0000-0002-3908-1122
Rainer J. Klement*, Department of Radiotherapy and Radiation Oncology, Leopoldina Hospital Schweinfurt, Robert-Koch-Straße 10, 97422, Schweinfurt, Germany; Email: [redacted] Phone: [redacted] ORCID ID: https://orcid.org/0000-0003-1401-4270
* Corresponding authors”
It is stated at https://f1000research.com/for-authors/publish-your-research : “Your manuscript includes full author and affiliation information, and a conflict of interest statement” and it is stated at https://f1000research.com/about/policies#compint :
“A competing interest may be of non-financial or financial nature. Examples of competing interests include (but are not limited to):
individuals receiving funding, salary or other forms of payment from an organization, or holding stocks or shares from a company, that might benefit (or lose) financially from the publication of the findings;
individuals or their funding organization or employer holding (or applying for) related patents;
official affiliations and memberships with interest groups relating to the content of the publication;
political, religious, or ideological competing interests.”
Several of the authors are covid denialists (in various forms) and hold very strong beliefs about this topic. They use Twitter and other outlets to communicate about this topic. This information is not listed in the preprint. In my opinion, the author instructions indicate that such information needs to be declared.
Several of the authors are members of a so-called ‘INTERNATIONAL CONSORTIUM OF SCIENTISTS IN LIFE SCIENCES’ (ICSLS). In my opinion, the author instructions indicate that such information needs to be declared.
Pieter Borger is a young-earth creationist with very strong beliefs about this topic. See above. Pieter Borger has not disclosed that he is affiliated with Wort und Wissen. See https://veranstaltungen.wort-und-wissen.org/referenten/borger/ In my opinion, this is not allowed.
Simon Goddek holds very strong beliefs about the use of vitamin D in relation to the prevention of covid-19. Simon Goddek owns a commercial company that sells vitamin supplements. Simon Goddek argues that the so-called ‘leaky gut syndrome’ exists. Such information needs to be declared, at least in my opinion. Does the information “Independent Scientist, Elias Beeckmanlaan 242, 6711 VS, Ede, The Netherlands” imply that Simon Goddek is no longer affiliated with WUR? Simon Goddek currently states at https://www.goddek.com/ : “As both an ecopreneur and a university researcher I have dedicated myself (…)”. Which university?
A recent interview with Bobby Rajesh Malhotra at https://ayavela.medium.com/we-are-in-the-middle-of-an-information-war-9ae376cfbcf9 seems to indicate that this author is no longer affiliated with the “Department for Digital Arts, University for Applied Arts Vienna”.
Co-author Pieter Borger wrote on 27 January 2021 at https://twitter.com/BorgerPieter/status/1354441617135071234
“We need real science and real scientists to solve problems and to inform the public in an honest way. We certainly do not need pseudoscience, scientism and dogmas to spread fear and confusion. Scientists, politicians and journalists take your responsibilities!”
Does this mean that Pieter Borger is very willing to rebut the remarks about this preprint, here and at PubPeer?
See below for a slightly redacted comment which I had posted at https://kloptdatwel.nl/2020/12/02/retraction-paper-pcr/comment-page-4/#comment-88274
In my opinion, the introduction contains pseudoscientific nonsense. Below are some indications of this:
(1): the sentence “The few studies aiming to estimate sensitivity and specificity of SARS-CoV-2 RT-qPCR tests have reported sensitivities and specificities in the ranges ≳30% and ≳80%, respectively -therefore, the communicated data seldom can offer precise distinctions (14)” does not end with an overview of these “few studies”, but with a muddled theoretical article (reference 14) by co-author Rainer Johannes Klement, titled “The epistemology of a positive SARS-CoV-2 test”. The first sentence of its abstract reads: “We investigate the epistemological consequences of a positive polymerase chain reaction SARS-CoV test for two relevant hypotheses: (i) V is the hypothesis that an individual has been infected with SARS-CoV-2; (ii) C is the hypothesis that SARS-CoV-2 is the cause of flu-like symptoms in a given patient”. The article appeared in the journal “Acta Biotheoretica”, see https://link.springer.com/article/10.1007%2Fs10441-020-09393-w (open access).
(2): the words COVID-19, cases and prevalence estimates are placed between quotation marks (i.e. as ‘COVID-19’, ‘cases’ and ‘prevalence estimates’) without further explanation. The authors thereby indicate, in so many words, that they understand these terms differently than others do. Explanation / sources are missing.
(3): in the sentence “After sporadic SARS-CoV-2 positive cases in January (5,6), from the end of February 2020 worldwide cases of the SARS-CoV-2-associated disease ‘COVID-19’ began to accumulate, causing policymakers in many countries to introduce countermeasures”, the authors in my view deny that there is a causal link between the disease COVID-19 (a disease they apparently deny exists, given the quotation marks around the word COVID-19) and a positive test result. Explanation / sources are missing.
(4): the article is a follow-up to Borger et al. (2020), but nowhere in the introduction is this referenced. The rest of the text of the preprint does not refer to Borger et al. (2020) either, and the publication is not in the reference list. In my opinion, that is a completely unscientific approach. See https://www.pepijnvanerp.nl/2020/12/some-considerations-on-the-retraction-paper-for-the-corman-drosten-pcr-test/ for background on Borger et al. (2020).
All in all, the introduction states, claims or suggests all kinds of things, but it often leaves it at that.
In addition, the reference list contains the same sloppiness / inconsistencies as the reference list of Borger et al. (2020). Why can’t these people manage to put together a decent reference list?
The paper is anything but seminal/groundbreaking. It most resembles solving 1 + x = 2 by trying all values of x between −100 and 100. And what are you supposed to do afterwards with the answer x = 1? What are we supposed to do with the tables of values for ‘x’ that the authors publish here?
The authors rightly state that usually the total number of tests #N and the sum of the true positives TP and false positives FP are published. TP and FP are given by (eq. 2 and 3 from the paper):
TP = P(positive | infected) × #N = sensitivity × prevalence × #N
FP = P(positive | ¬infected) × #N = (1 − specificity) × (1 − prevalence) × #N
The sum TP + FP is thus [sensitivity × prevalence + (1 − specificity) × (1 − prevalence)] × #N. This can also be written as
(TP + FP)/#N = sensitivity × prevalence + (1 − specificity) × (1 − prevalence),
where (TP + FP)/#N is the apparent fraction of positive tests. For a given fraction of positive tests, you can now pick an arbitrary sensitivity and specificity and directly calculate the corresponding prevalence. Or you can try ‘all’ variations of sensitivity, specificity and prevalence and find arbitrary matches (depending on the step size). The latter is what the authors do, in a variant of the 1 + x = 2 example above.
Why do they do it this way? Why do they do it at all? No idea. Conceivably this could be the beginning of a crude sensitivity analysis. But only the very beginning. And there are much better methods for that.
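The direct step this comment describes — choosing a sensitivity and specificity and solving the positive-fraction identity for prevalence — can be sketched in a few lines; the chosen sensitivity/specificity values below are arbitrary illustrations:

```python
# Solve (TP+FP)/N = sens*prev + (1-spec)*(1-prev) for prev, given an
# observed positive fraction and assumed sens/spec (illustrative values).
def prevalence(pos_fraction, sens, spec):
    return (pos_fraction - (1 - spec)) / (sens + spec - 1)

f = 242 / 27_000   # observed positive fraction from the example
print(f"sens=0.80, spec=0.995 -> prevalence = {prevalence(f, 0.80, 0.995):.4%}")
print(f"sens=0.80, spec=0.999 -> prevalence = {prevalence(f, 0.80, 0.999):.4%}")
```

Any (sens, spec) pair that keeps the result between 0 and 1 yields a ‘solution’, which is why a brute-force search over all three parameters finds so many matches.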
I still don’t agree 100% with your arguments ;-).
I’ve created a working SQL script (https://pubpeer.com/publications/4A210FB2DD2C4A60A7F9896A7EBCC1#10) which uses prevalence, “expected” false_negatives and “expected” false_positives (≤ positive tests!) for the confusion matrix, and which shows really all “valid” matches. The prevalence here is not the prevalence in the population but the prevalence in the sample (which is some multiple of 1/no. of tests).
With 27,000 tests and 242 positive results (the “Lansingerland – GGD campaign”, with a prevalence of 0 – 1000/27000 and at most 242 “expected” false_positives and false_negatives) I get 29,525 matches in ~59.1 million permutations. In the example I didn’t limit spec/sens!
(columns: 1:id 2:tests 3:pos_tests 4:prev (rounded) 5:has_disease 6:hasnot_disease 7:true_pos 8:true_neg 9:false_pos 10:spec (rounded) 11:false_neg 12:sens (rounded))
[output listing omitted]
Each match is a valid outcome of the “Lansingerland – GGD” campaign within the chosen limits.
IMHO the “stochasticity” is reflected by the number of valid matches per has_disease cases (or true_positives etc.):
matches per has_disease cases:
0 x 120
1 x 121
3 x 122
5 x 123
…
241 x 241
242 x 242
241 x 243
…
5 x 361
3 x 362
1 x 363
0 x 364
Probably a Binomial distribution.
The only thing we disagree upon, I think, is whether the authors claim that the prevalence they find is the prevalence of the general population or just of the sample that took a test. Of course, the ‘solutions’ found by the tool can only refer to the latter, but in my opinion it’s quite clear (especially in the Discussion) that they strongly suggest they find estimates for the general prevalence. That would only be true if the tested population were a random sample, which is obviously not the case (still very few asymptomatic people are tested).
To add a little to the “stochasticity”: in the ‘retraction paper’ on which most authors of the Bayes Lines Tool article are listed as co-authors, they write: “The definition of false positives is a negative sample, which initially scores positive, but which is negative after retesting with the same test.” So, even in their own view, the number of false positives cannot be seen as ‘hard’.