Exactly a year ago Tilburg University announced in a press release (archived copy) that professor Diederik Stapel had been suspended after strong suspicions of scientific fraud had been brought forward. I was really surprised when I read about this later that day. Wasn’t he the same guy who had collaborated on the research I had been analyzing a couple of days before? That research by Stapel, professor Marcel Zeelenberg (Tilburg University) and professor Roos Vonk (Radboud University Nijmegen) had shown that people who were exposed to images of meat act more selfishly and show more anti-social behaviour.
About two weeks earlier (Aug 25th), a press release from my own university (no longer available online, but here is a copy) had announced this peculiar result, which immediately attracted a lot of attention on blogs, on Twitter and in almost all newspapers. Vonk has a record as an animal rights activist and has been very critical of meat consumption in our society, also because of the environmental problems that come with it. So many critics of Vonk’s views on eating meat immediately saw this study (and how it was presented in the press release) as an abuse of science for her personal agenda.
There are still a lot of people who think that the Stapel fraud is directly connected with this study on the psychology of (eating) meat and that it actually led to the discovery of his massive fraud. But that’s not the case. Stapel probably faked the data for this one as well, but it was just coincidental that the two issues came up in the same period.
I became a bit more interested when I read about the small number of people (32) who had participated in some of the experiments and about results that were only marginally significant. Could it be that the researchers had used the wrong statistical tests, I wondered? That was something I had come across in other studies with small samples. I also noticed something strange in the numbers I picked up from a blog by science journalist Arno van ‘t Hoog: “EGOÏSTISCHE VLEESETERS (2)” (“Selfish meat eaters (2)”). The percentages mentioned for the two groups, 44 and 15 percent, couldn’t be right when only 32 people had been involved. As Van ‘t Hoog had gotten a pdf with the results from the researchers, I asked him if I could have a look, just out of curiosity. Note that there was no published article yet, not even a submitted version. This was exactly one of Van ‘t Hoog’s main concerns: why did they make the results public so early, without going through the normal process of peer review and publication?
The study on the psychology of eating meat
Van ‘t Hoog sent me the pdf, which Vonk had sent to several science journalists, and I started reading. In the first experiment, 32 participants were split up into two groups. One group (‘Crisis’) was given a newspaper article about something bad, the economic crisis. The control group (‘Neutraal’) got a neutral article to read, something about the formation of water drops. After this ‘priming’ the participants were first given a test to score them on insecurity and need for structure.
After that, they were given a menu from which they had to make a choice (see image on the right). Three choices: one meat dish, one with a zucchini frittata and one dish with fish. The result from this experiment was that the ‘crisis’ group chose the meat dish significantly more often than the control group. The ‘priming’ had had the expected effect: the ‘crisis’ group was more insecure and felt a bigger need for structure.
The authors concluded: people who are (made) insecure choose meat more often, and not because they like it or think it’s a healthy choice. As Vonk speculated further in the press release: “Like riding a Hummer, meat gives a boost to your status and ego”. Wow!
The other experiments were similar in setup and size, and had the same lack of substance. The study design is obviously silly if you take a closer look. How on earth could three professors think that this was a good experiment? They seem to think that if a person chooses the first option from the menu, his choice is about ‘meat’. But maybe he just doesn’t like baked tomatoes and sesame seeds. A meat lover might choose the fish dish, only because he knows that in cheap restaurants they always ruin your steak.
A far better way of testing the hypothesis would be to offer a real menu with several meat, vegetarian and fish dishes, and afterwards score the choices made in the categories ‘meat’ and ‘non-meat’. In the way the experiment was said to have taken place, you could even argue that it just showed a difference in preference for the position on the menu. ‘Insecure people pick the first option given’ sounds to me like just as good an explanation for the result as the one the authors gave. But let’s get back to the figures they presented.
Some number crunching
The table from the first experiment is here on the left. It is in Dutch, but I guess you can make out what it is about. It doesn’t state the number of people in each group, but it seems logical that it was split 16-16. Then the 44 percent (third row) comes from 7 participants who chose the meat dish. But the 15 percent of the control group is a problem: if two people had chosen meat, the percentage should be 13. And if the actual number was three, the percentage would be 19! Could be a typo, I thought, because even if you try to find a better match with unbalanced groups, you can never get the percentages 44 and 15 both at the same time.
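This arithmetic can be checked by brute force. The sketch below assumes that the reported percentages were rounded to the nearest whole number and that all 32 participants ended up in one of the two groups; the function names are made up for this illustration.

```python
def rounded_percent(k, n):
    """Percentage k/n rounded half up, using integer arithmetic only."""
    return (200 * k + n) // (2 * n)


def matching_splits(total, pct_a, pct_b):
    """Find every (n_a, k_a, n_b, k_b) with n_a + n_b == total such that
    k_a out of n_a rounds to pct_a and k_b out of n_b rounds to pct_b."""
    hits = []
    for n_a in range(1, total):
        n_b = total - n_a
        for k_a in range(n_a + 1):
            if rounded_percent(k_a, n_a) != pct_a:
                continue
            for k_b in range(n_b + 1):
                if rounded_percent(k_b, n_b) == pct_b:
                    hits.append((n_a, k_a, n_b, k_b))
    return hits


# With a 16-16 split: 7 of 16, 2 of 16 and 3 of 16
print(rounded_percent(7, 16), rounded_percent(2, 16), rounded_percent(3, 16))  # 44 13 19

# Any split of 32 people that gives 44 and 15 percent at the same time?
print(matching_splits(32, 44, 15))  # []
```

Under these assumptions the search comes back empty: no division of 32 people into two groups reproduces both percentages.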
To test my initial hunch that something had gone wrong with the statistical tests, I tried two as the actual number of meat choosers in the control group, because that gives the more extreme difference between the groups. And indeed, if you blindly use a Chi-squared test you’ll find the difference is significant (p=0.0247). But that test is not appropriate under these conditions, with low figures per cell and some even lower than 10. In this case, a better test is Fisher’s exact test. This gives p=0.0567 and therefore does not point to a significant difference.
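A rough sketch of this comparison, using only the Python standard library. It assumes the 2×2 table implied above (16 people per group, 7 versus 2 meat choices); the function names are made up for illustration. The p-values quoted above appear to match the one-sided versions of both tests, which is what the sketch computes; the two-sided values are about twice as large.

```python
from math import comb, erfc, sqrt


def chi2_statistic(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_one_sided_p(a, b, c, d):
    """One-sided p-value: with 1 degree of freedom the square root of the
    statistic is a standard normal z, so the upper normal tail is used."""
    z = sqrt(chi2_statistic(a, b, c, d))
    return 0.5 * erfc(z / sqrt(2))


def fisher_one_sided_p(a, b, c, d):
    """One-sided Fisher exact test: probability of at least `a` in the
    top-left cell, given fixed row and column totals (hypergeometric)."""
    row1, col1, n = a + b, a + c, a + b + c + d
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / comb(n, row1)


# Rows: crisis group, neutral group; columns: meat, not meat (assumed counts)
table = (7, 9, 2, 14)
print(f"chi-squared: p = {chi2_one_sided_p(*table):.4f}")   # 0.0247
print(f"Fisher exact: p = {fisher_one_sided_p(*table):.4f}")  # 0.0567
```

With counts this small the two tests land on opposite sides of the usual 0.05 threshold, which is the whole point.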
I encountered the same sort of problems in the figures of the second experiment. After doing the calculations I also noticed that it seemed rather odd that only 15 percent of the control group had chosen the meat. Why did they conclude that the ‘crisis’ group was influenced and not the ‘neutral’ group? Maybe reading about water drops makes you choose fish!
I put down a lot more of these observations in a review which I sent to Van ‘t Hoog on September 1st. We agreed that it looked like a mess, but we decided to let it rest for the time being and wait until the professors presented this study at a conference or in an article. Jan van Rongen had also noticed the strange percentages and wrote about it on Foodlog on September 6th: ‘Fairy tales: how tofu makes you forget things and how meat makes you more selfish’ (Dutch).
September 7th: breaking news
A day later came the press release about Stapel’s fraud. He had probably faked the data of many of the articles he was (co)author of. Then it became obvious to me that the errors in the percentages might not be the result of sloppy work, but of pure imagination! I mentioned this on Twitter. Van Rongen noticed that and pointed me to his own findings on Foodlog.
We discussed our findings in more detail via e-mail and found out that we had been on the same track. We were probably the only people who had really taken a close look at the figures themselves before the announcement of Stapel’s fraud. Also, our first guess had been the same: that the wrong statistical tests had been used.
On the Internet and in some printed press there was (and still is) some confusion about which study led to the discovery. It wasn’t the study on meat of course, because that had not been published yet. It was three young collaborators of Stapel who had come forward to the Rector Magnificus of Tilburg University after several months of collecting evidence for their accusation against Stapel. They first informed Zeelenberg about their findings, coincidentally just a day before the press release on the meat study.
After the press release on Stapel’s fraud, Vonk had to conclude that he had most probably also fabricated the data for ‘the meat study’. She offered her apologies for the study and for some harsh replies (via email and Twitter) she had given to people who had been critical of it. A blog post by Martin Enserink on Science Insider gave a good summary of the events of that day: Dutch University Sacks Social Psychologist Over Faked Data. (While writing this blog I noticed that my name was mentioned in the comments 😉)
Now that things had changed so quickly, Van ‘t Hoog published a new blog on the meat study (the next day, September 8th) which presented my findings. He also expressed his fear that questions about the methodology of this kind of study would stay out of sight for a while, because fraud is so much more interesting to talk about. The discussion which started did indeed focus on the data fabrication by Stapel, how many articles were involved and how he could have done this without getting caught earlier. And also on whether Vonk had been responsible for the fraud as well.
A commission consisting of some heavyweight scientists has been installed to look at all Stapel’s publications and already a lot of those articles have been retracted. Professor Vonk also got investigated by her employer. The conclusion was that she had not been involved in fraudulent matters, but she was reprimanded for acting unprofessionally. The university stated: “She drew hasty conclusions on the basis of data that she herself had neither collected nor checked. These conclusions were published prematurely in a university press release. The design of the ‘meat study’ did not meet academic standards. Furthermore, the Board is of the opinion that Professor Vonk, given her role in the debate on the cattle industry (bio industry), should have been extra critical of her own approach.” But the report of this investigation has not been made public.
A lot has been written on Stapel and Vonk in the year that has passed, but the third collaborator, Zeelenberg, has managed to stay away from public scrutiny. I am not aware of any interview he might have given in that time.
And what’s wrong with organic food?
In the mainstream media the focus stayed on the fraud itself, but I was well aware that among social psychologists a discussion had started about the methodology of Stapel and others in studies with a similar design. That’s fine, I thought, something good can come from this mess.
A couple of months ago, however, I stumbled upon a blog in which the result of a study on the effect of buying organic foods was discussed: ‘Does organic food turn people into jerks?’. When I read about this research I just thought: ‘this is the same shit all over again, how can it be that someone gets such a study published after the discussion about Stapel’s methods?’ The study uses a similar design and the same method of ‘priming’, which I think is completely flawed. By that I mean that the researchers assume things about the mental state of ‘primed’ individuals which they cannot convincingly relate to the actions or choices these participants make afterwards.
The researcher, Kendall Eskine (assistant professor, Loyola University, New Orleans) divided 60 people into three groups. One group was shown pictures of clearly labelled organic food, like in the first picture. The second group was shown comfort foods, e.g. ice cream and cookies. The people in the last group had to function as controls and were shown ‘neutral food’ (that is clearly non-organic, non-comfort). After viewing the pictures and being ‘primed’, each person was then asked to answer some questions on moral issues. And surprise, surprise: “We found that the organic people judged much harder compared to the control or comfort food groups,” says Eskine. “On a scale of 1 to 7, the organic people were like 5.5 while the controls were about a 5 and the comfort food people were like a 4.89.”
As in the ‘meat study’, you can’t be sure that the mental state of the test persons is about ‘organic food’, ‘comfort food’ or ‘neutral food’. On the contrary, I think it’s very unlikely. The choices offered differ on too many variables, and a person can be ‘primed’ on any of them. It would be a little better to limit these variables and present more similar pictures like:
But even then, I think it is rather silly to do this research with such small groups and scoring with Likert scales. For a moment I thought about the possibility of this being a brilliant parody of the study by Stapel, Vonk and Zeelenberg. Something like the Sokal affair. But looking at the themes of other articles by the same author, this is probably not the case. Just more bad science.
Stapel wrote a book about his fraud and the period in which it came to light. The book has been translated into English and can be downloaded from the website of Nick Brown: Faking Science: A True Story of Academic Fraud