
The Dunning–Kruger effect is misunderstood by many

‘Too stupid to see how stupid they are’ – The ‘Dunning-Kruger effect’ is used all over the place to rub in people’s ignorance. Does the effect exist – and if it does, how big is it really?

You come across them quite often: people who proclaim firm opinions on a subject they clearly know little about and who vigorously oppose everything that is scientifically known. Nonsensical anti-vaccination propaganda, warnings about the health risks of mobile phones, or pseudo-explanations built on misunderstood quantum mechanics. Irritating as hell sometimes. How tempting is it, then, to expose their incompetence by providing real, reliable knowledge? And, to conclude, to deal them a final blow by adding that this is yet another ‘typical case of Dunning-Kruger’? Add a picture from the internet and you’re done. It is clear that your conversation partner is on ‘Mount Stupid’!

The ‘Dunning-Kruger effect’ has not been around very long – about 20 years – but it is already a well-established concept. On the Internet, you come across it regularly on platforms where there are lively discussions. It has become a standard weapon in the discussion toolbox.

Experiments

The Dunning-Kruger effect was born in 1999, when Justin Kruger and David Dunning, both affiliated with Cornell University, published their article ‘Unskilled and unaware of it: how difficulties in recognizing one’s own incompetence lead to inflated self-assessments’. Strictly speaking, we should call it the Kruger-Dunning effect, because Kruger was the first author of the article and must have done most of the work as a PhD student with Dunning.

In their article, they describe four experiments conducted among psychology students. The students were given tests that measured their sense of humour, knowledge of grammar and logical reasoning. (For humour, they had to rate thirty jokes, and they scored higher the closer their ratings came to those of a panel of seven comedians; for logical reasoning, they had to answer questions like ‘which house does the electrician live in?’) Each time, the students were asked how well they thought they had performed the tasks, and their expectations were compared to their actual scores.

Figure 1. Students whose actual score (green) falls in the lowest quarter overestimate their performance (expected score, red), while the highest-scoring students underestimate theirs.

Over and under

It has long been known that people tend to overestimate their competence. Hence the lame joke that most people think they score better than average. However, Kruger and Dunning’s experiments revealed something more: the people with the highest scores tended to underestimate themselves, while the less competent subjects overestimated themselves by a considerably larger margin (figure 1).

As an aside, it will immediately come to your attention that these graphs bear little resemblance to the pictures that people use to beat each other up on the Internet. There is no ‘Mount Stupid’, nor a ‘Valley of Despair’.

How exactly are these graphs made? Kruger and Dunning first divided the participants into four groups based on their actual test scores and averaged those scores. These four numbers lie on the green line: the quarter with the lowest scores on the left, the quarter of students with the highest scores on the right. For each group, they then calculated the average of the expected score given by the students. These points are connected with the red lines.
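
For readers who want to try this kind of summary on their own numbers, the procedure just described can be sketched in a few lines of Python. This is only a sketch of the steps as described here, not Kruger and Dunning’s own analysis: the function name `quartile_summary` and the eight example students are invented for illustration.

```python
from statistics import mean


def quartile_summary(actual, expected):
    """Return four (mean actual, mean expected) pairs, one per quarter.

    Participants are ordered by their actual score and split into four
    equally sized groups; both scores are then averaged within each group.
    """
    if len(actual) != len(expected):
        raise ValueError("every participant needs both scores")
    if len(actual) < 4:
        raise ValueError("need at least four participants")

    pairs = sorted(zip(actual, expected), key=lambda pair: pair[0])
    n = len(pairs)
    summary = []
    for q in range(4):
        group = pairs[q * n // 4:(q + 1) * n // 4]
        summary.append((mean([a for a, _ in group]),
                        mean([e for _, e in group])))
    return summary


# Hypothetical data: eight students, scores expressed as percentiles.
actual_scores = [10, 20, 35, 45, 55, 65, 80, 90]
expected_scores = [60, 50, 55, 65, 60, 70, 70, 80]

for green, red in quartile_summary(actual_scores, expected_scores):
    print(f"actual (green): {green}   expected (red): {red}")
```

The first number of each pair is a point on the green line, the second the matching point on the red line.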

If there were no self-overestimation or self-underestimation, and everyone on average predicted their test score correctly, all the points would lie neatly on the diagonal.

Lifted leg

This is indeed the case for the points on the green line; the small deviations are due to the low numbers of participants in the surveys (the graphs above are based on 65, 45 and 84 students respectively).

But there is a good statistical reason why the red points do not lie neatly along the diagonal. This has to do with the fact that subjects who know that they score well or very well have little room left at the top of the scale on which they have to estimate their relative position. And for the worst performers, who know themselves quite well, there is less room to underestimate than to overestimate – if you suspect you will get a zero, it is difficult to give yourself an even lower mark. The form one might therefore expect is a horizontal cross, with the diagonal green line crossed by a somewhat more horizontal red line.

The pure Dunning-Kruger effect is visible only if the left leg of the cross has been ‘swung up’ a little further relative to that starting position.
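
That a flattened red line can arise without any special blindness among the weakest performers is easy to check with a small simulation. The sketch below assumes, purely for illustration, that everyone’s self-estimate equals their true percentile plus unbiased random noise, and that estimates have to stay on the 0 to 100 scale. The noise level of 30 points, the sample size and the function name are assumptions made for this example, not values taken from Kruger and Dunning’s data.

```python
import random
from statistics import mean


def simulate_bounded_estimates(n=10_000, noise_sd=30.0, seed=1):
    """Quartile means of (actual, estimated) percentile under noise plus clipping."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        actual = rng.uniform(0, 100)             # true percentile
        guess = actual + rng.gauss(0, noise_sd)  # unbiased but noisy self-estimate
        estimate = min(100.0, max(0.0, guess))   # answers cannot leave the scale
        people.append((actual, estimate))

    people.sort()                                # order by actual percentile
    size = n // 4
    rows = []
    for q in range(4):
        group = people[q * size:(q + 1) * size]
        rows.append((mean([a for a, _ in group]),
                     mean([e for _, e in group])))
    return rows


for actual_mean, estimate_mean in simulate_bounded_estimates():
    print(f"actual {actual_mean:5.1f}   estimated {estimate_mean:5.1f}")
```

Under these assumptions the lowest quarter ends up with an average estimate several points above its actual average, and the highest quarter several points below, even though nobody in the simulation is systematically deluded. Only a red left leg that rises further than this baseline would point to something extra.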

Scene from Monty Python’s sketch ‘Upper-class Twit of the Year Race’ (1970). The final obstacle: all they have to do now to win the title is shoot themselves … (photo | Alamy)

What causes it?

Kruger and Dunning built their article around a number of predictions. The first was obviously the occurrence of the effect itself. They also proposed that the effect is explained mainly by a lack of ‘metacognitive’ skills (knowledge about knowledge). Even if you give the participants from the lowest quarter feedback about their actual score, or show them results from other participants, the effect persists. They continue to place themselves, on average, too high in the rankings, while more competent individuals are better able to assess their actual position after such feedback.

According to Kruger and Dunning, the lack of knowledge about the specific subject causes the ‘incompetent individuals’ to make mistakes, but the same lack of knowledge also causes them to be unable to properly assess how others are doing and how they score in relation to them: ‘the double burden of incompetence’.

Objections

The article has been criticised. The division into four groups is said to be somewhat unnecessary and artificial, or the incompetent individuals might not really believe that they score above average, or it is thought to be mostly noise. Follow-up studies by Dunning and others (Kruger has gone in other directions) have countered much of that criticism.

What is certain is that the original graphs compress the underlying data heavily, which can give a misleading picture. In the journal Numeracy, Edward Nuhfer, a geologist who later became more active in educational science, published two articles in which he tries to show, based on a large number of measurements, what happens ‘behind the scenes’ of Kruger and Dunning’s graphs. Good instruments for investigating the relationship between self-assessment and actual performance in a particular area are not readily available. Nuhfer and his co-authors use the Science Literacy Concept Inventory (SLCI), a questionnaire they developed that tests understanding of key concepts in science (‘Science can test certain kinds of hypotheses using controlled experiments’), together with an accompanying instrument, the Knowledge Survey of the SLCI (KSSLCI for short), which measures self-assessed competence per question (‘I know this subject well enough to be tested on it’). This set-up provides a more reliable self-assessment than Kruger and Dunning’s ranking.

Figure 2. Scores on test questions of the SLCI compared with the estimation of own competence (KSSLCI).

If you plot the scores against each other, i.e. knowledge against self-assessed knowledge, Figure 2 emerges. What is striking is the large spread in self-assessment among subjects who achieved equally high test scores: people who got eighty per cent of the questions right, for example, rated themselves anywhere between ten and one hundred per cent.

Figure 3. Scores of Figure 2 in quartiles à la Dunning and Kruger.

If you then make a summary of this à la Kruger and Dunning, you see the familiar cross reappearing (figure 3).

The absence of a raised red leg suggests that the Dunning-Kruger effect does not occur here, or hardly at all. Nor do we see a general overestimation of the self, in which everyone thinks they score above average. According to Nuhfer, his research shows that this does not occur at all as soon as your self-assessment instrument is well designed.

Five per cent

Nuhfer’s data set ranges from first-year students to professors. Many first-year students achieve a very high score while their self-assessment seems to be in line with the knowledge one would expect at their educational level. In the case of the real experts, the professors, self-assessment and actual score are much closer to each other; there is less variation. The highest quarter based on score, therefore, contains the real experts but also a number of high-scoring beginners, whether by luck or otherwise. The average self-underestimation that you see in the Dunning-Kruger graphs for the highest quartile can also be explained by this. Based on his data, Nuhfer estimates that the group that scores poorly and has a far too optimistic perception, the real ‘unskilled and unaware of it’, accounts for only about five per cent of the total.

Chess players and examinees

Like much psychological research, Kruger and Dunning’s experiments were conducted among psychology students, so it is somewhat questionable whether the results found are generalisable. In a 2011 review article, Dunning lists studies in real-world settings, but not all of them are convincing.

Young Joon Park and Luís Santos-Pinto, for example, investigated whether the effect occurs in poker players and chess players when you ask them how high they will finish in a tournament. In poker, they found only a general self-overestimation, but in chess, the predictions of the weaker players about their final ranking deviated significantly more from reality than those of the strong players. This seems to be a strong example of the Dunning-Kruger effect, because chess players have an objective measure of their strength in the so-called Elo rating. But there is something fishy about this part of the research. The problem lies with the average players: there are far more of them than there are players competing for the highest places. Average players are therefore much more likely to be significantly wrong in their estimate than the strongest players, for whom an estimation error or the form of the day has much less effect on their final position. [see here for a more extensive review]
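
The crowding argument can be made concrete with a back-of-the-envelope calculation. The sketch below assumes that the ratings in a tournament field are roughly normally distributed; the mean of 1500, the spread of 200 points, the field of 100 players and the function name `places_within` are hypothetical values chosen for illustration and are not taken from Park and Santos-Pinto’s data.

```python
from statistics import NormalDist

# Hypothetical rating distribution of the tournament field.
FIELD = NormalDist(mu=1500, sigma=200)


def places_within(rating, margin, field_size, ratings=FIELD):
    """Expected number of competitors rated within +/- margin of `rating`."""
    share = ratings.cdf(rating + margin) - ratings.cdf(rating - margin)
    return share * field_size


average_player = places_within(1500, 50, 100)
strong_player = places_within(1900, 50, 100)

print(f"rivals within 50 points of an average player: {average_player:.1f}")
print(f"rivals within 50 points of a strong player:   {strong_player:.1f}")
```

With these made-up numbers, roughly twenty of the hundred players sit within fifty rating points of an average player, against roughly three for a player rated 1900. The same small misjudgement of one’s own strength, or a slightly off day, therefore shifts an average player many more places in the final ranking than a top player.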

A better example cited by Dunning is a 2009 study into the self-assessment of people who were attempting to pass their driving test. Among Finnish test subjects, the researchers found a slight tendency to overestimate, but among a – much smaller – group of Dutch people, the effect was very strong: those who failed the test overestimated themselves more on average than those who passed. Half of those who failed the test overestimated their own vehicle control, compared to a third of those who passed.

The fact that Dunning and Kruger were awarded an Ig Nobel Prize (a tongue-in-cheek prize for research that sounds ridiculous but is meant seriously) in 2000 for their article will have contributed to the awareness of the effect. There is also a certain irony in the idea that incompetent people not only make a mess of things but do not seem to realise it themselves. In 2017, at the presentation of the Ig Nobels, the effect was even sung about in The Incompetence Opera, which, of course, also incorporated the Peter principle.

The Dunning-Kruger song from The Incompetence Opera. (YouTube)

Not so strong

There is enough reason to believe that a Dunning-Kruger effect exists. However, it is not nearly as strong as many seem to believe and an extreme mismatch between competence and self-image occurs in only a small group of people. Moreover, it is not so easy to demonstrate ‘outside the laboratory’, it can only be reliably measured in large groups, and it requires a subtle instrument for self-assessment.

In my opinion, you can only speak of a real Dunning-Kruger effect if participants basically share the same concept of truth. The participants who failed their driving test may not fully agree with how their driving skills were assessed, but they are unlikely to think that driving through a red light is allowed. With sense of humour, it is a bit more complicated.

In practice, it is all too easy to label things as a Dunning-Kruger effect. Disagreeing with established science may stem from a lack of understanding of the subject matter, but it can also have an ideological basis. If anti-vaxxers in a study claim to know more about vaccinations than recognised experts, this does not seem to me to be a good example of the effect. In that case, something like motivated reasoning comes into play. If they claim to know better than virologists, it is more likely because they measure expertise on a different scale in the first place, and because they see experts as deliberate liars in the service of Big Pharma and therefore do not accept them as independent experts. But I could be completely wrong – claiming to have a deep understanding of the Dunning-Kruger effect is, of course, asking for trouble.

References

Kruger J, Dunning D. Unskilled and unaware of it: how difficulties in recognizing one’s own incompetence lead to inflated self-assessments. Journal of Personality and Social Psychology 1999;77:1121–1134, PMID 10626367.

Dunning D. Chapter five – the Dunning–Kruger effect: on being ignorant of one’s own ignorance. Advances in Experimental Social Psychology 2011;44:247–296.

Nuhfer E, Cogan C, Fleisher S, Gaze E, Wirth K. Random number simulations reveal how random noise affects the measurements and graphical portrayals of self-assessed competency. Numeracy 2016;9:1.

Nuhfer E, Fleisher S, Cogan C, Wirth K, Gaze E. How random noise and a graphical convention subverted behavioral scientists’ explanations of self-assessment data: numeracy underlies better alternatives. Numeracy 2017;10:1.

Park YJ, Santos-Pinto L. Overconfidence in tournaments: evidence from the field. Theory and Decision 2010;69:143–166.

Motta M, Callaghan T, Sylvester S. Knowing less but presuming more: Dunning-Kruger effects and the endorsement of anti-vaccine policy attitudes. Social Science & Medicine 2018;211:274–281, PMID 29966822.


Translated from the Dutch original ‘Te dom om te zien hoe dom ze zijn’ which was published in Skepter 33.1 (2020)
