I’m part of a small group charged with finding a fun, creative way of teaching a class of 30 American 15-16 year olds the concept that when trying to work out if a treatment is effective you must be sure that the groups between which the treatment is being compared are similar: that “like is being compared with like.” We are a public health doctor with an interest in health services research, a doctor interested in promoting evidence based medicine, an education teacher who was once a researcher, a university dean, an academic who uses games to achieve beneficial ends, and me. We have two hours, and we don’t find it easy.
We never managed a fully worked out answer, but we came close with a programme that had five stages and might take a week to teach.
The first stage would be to present the students with a case, perhaps a video, in which strong claims are being made on the basis of a comparison of groups that were clearly dissimilar. We thought that outrage might motivate the students. Unfortunately, the only example we could think of from the real world was the observation, drawn from large numbers of women, that hormone replacement therapy seemed to stop women having heart attacks and strokes. Randomised trials eventually showed the opposite. The earlier observation had misled because the women who took hormone replacement therapy outside of the trials exercised more, had healthier diets, were less likely to smoke, and were richer. We would try to get the students to think this through for themselves.
It didn’t seem to occur to us at the time—but certainly did afterwards—that this was a poor example for adolescents.
In the next stage we would divide the class in some way and set them a task where we knew that there would be a difference in the outcome. I favoured dividing them by gender and running a hopping race, thinking that overall the males would do better than the females. The ex-teacher said that doing this with 15-16 year olds would result in chaos. Other suggestions were arm wrestling and having to sing a high note.
We also considered other ways of dividing the class and recognised that dividing students could create difficulties if we were to divide by sensitive issues like height, intelligence, or number of Facebook friends. We could have gone for handedness and asked everybody to draw a face with their right hand, but that would produce very unequal groups. If we divided by the first letter of their name we couldn’t be sure that the outcome would be different.
Our third stage would be to ask the students for other ways of dividing the groups. One possibility might be to create two groups that were each half male and half female. Ideally, these would have different outcomes in a hopping race (or whatever outcome measure we chose), illustrating how dividing people by something you recognise (gender) wouldn’t solve the problem that the groups might differ in other ways that were recognisable (height, weight) and in ways that were unknown.
We’d then ask the students how they might achieve equal groups, and we hoped that somebody might suggest tossing a coin—random allocation. Ideally we would do this and the two groups would then perform the same—or at least a lot closer—in the hopping race.
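For readers who like to see the idea in numbers, here is a minimal sketch of what we hoped the hopping race would show. Everything in it is an assumption made for illustration: the class of 30, the invented hopping times, the built-in two second advantage for the boys, and the helper names `make_class`, `split_by_gender`, and `split_at_random`. Shuffling the class and cutting it in half stands in for tossing a coin, simply so that the two groups come out the same size.

```python
import random
from statistics import mean


def make_class(rng, n=30):
    """Build a hypothetical class; the hopping times are invented, not real data."""
    students = []
    for i in range(n):
        gender = "M" if i % 2 == 0 else "F"
        # Assumption for illustration: boys finish about two seconds faster.
        typical_time = 10.0 if gender == "M" else 12.0
        students.append({"gender": gender, "time": typical_time + rng.gauss(0, 1)})
    return students


def gap(group_a, group_b):
    """Absolute difference between the two groups' mean hopping times."""
    return abs(mean(s["time"] for s in group_a) - mean(s["time"] for s in group_b))


def split_by_gender(students):
    boys = [s for s in students if s["gender"] == "M"]
    girls = [s for s in students if s["gender"] == "F"]
    return boys, girls


def split_at_random(students, rng):
    """Random allocation: shuffle a copy of the class and cut it in half."""
    shuffled = students[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
students = make_class(rng)

gender_gap = gap(*split_by_gender(students))

# Repeat the random allocation many times and average the resulting gaps.
random_gaps = [gap(*split_at_random(students, rng)) for _ in range(1000)]
average_random_gap = mean(random_gaps)

print(f"Gap when divided by gender: {gender_gap:.2f} seconds")
print(f"Average gap when allocated at random: {average_random_gap:.2f} seconds")
```

Any single random split can still leave the groups a little apart, which is why the sketch averages over many splits; in a classroom you would get only one toss of the coin per student and would have to live with whatever it gave you.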
We played around with the idea that we might inject into this “a treatment,” and we debated whether it might be something like giving one group red Smarties, but we thought that it might be better to have a “real” treatment that would affect the outcome. I suggested a “negative treatment,” heavy boots, but we decided that it would complicate things to include a treatment.
The final stage would be to ask the class to dupe another class by showing with real data that red sweets made people better at maths. They might, for example, identify the level of ability of their class (or the other class), and then search for some difference between the best and the worst, perhaps in the first letter of their first or second names, and then give the red sweets to the best students, identifying them by something other than ability.
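The trick can be sketched in a few lines. This is only an illustration under stated assumptions: the maths scores and initials are invented, the red sweets have no effect at all, and the rule used here (give sweets to every student whose initial happens to belong to an above-average group) is one hypothetical way of “identifying them by something other than ability.”

```python
import random
from collections import defaultdict
from statistics import mean

rng = random.Random(7)  # fixed seed so the sketch is repeatable
LETTERS = "ABCDEFGH"    # a short alphabet keeps several students per initial

# Hypothetical class: scores are fixed before any sweets are handed out.
students = [
    {"initial": rng.choice(LETTERS), "score": rng.gauss(60, 12)}
    for _ in range(30)
]

class_mean = mean(s["score"] for s in students)

# Look for a harmless-seeming trait that separates the best from the worst:
# which initials belong to students who score above the class average?
scores_by_initial = defaultdict(list)
for s in students:
    scores_by_initial[s["initial"]].append(s["score"])

lucky_initials = {
    letter
    for letter, scores in scores_by_initial.items()
    if mean(scores) > class_mean
}

# "Treat" by initial, never mentioning ability.
sweets = [s["score"] for s in students if s["initial"] in lucky_initials]
no_sweets = [s["score"] for s in students if s["initial"] not in lucky_initials]

print(f"Mean score with red sweets:    {mean(sweets):.1f}")
print(f"Mean score without red sweets: {mean(no_sweets):.1f}")
```

The sweets group comes out ahead by construction, because the groups were chosen to be unlike each other before anybody ate anything.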
This final stage was important because it would mean that they were having to use what they had learnt. The ex-teacher illustrated the importance by what I imagine to be a classic teacher story. “I taught my dog to speak,” says the teacher. “But she can’t speak,” says a friend. The teacher replies, “I taught her, I didn’t say she could speak.”
This final exercise could also be used for assessment. There was worry in the group that we might be teaching bad scientific practices, but our initial enthusiasm for the idea came from thinking that adolescents would be attracted by something naughty.
This is a process that happens in the real world of devising teaching programmes (and games), and it’s not easy. In the real world you do the best you can and then keep testing on students and revising—until you arrive at something that works.