A Basic Lesson in Pseudoscientific Thinking
As a science teacher, I make it a major focus of my courses to help students design carefully controlled experiments that give them answers they can trust with a reasonable level of confidence. Regardless of how hard they work at controlled design, the data they generate are often messy (it is biology, of course), and making sense of the outcomes requires more than the typical descriptive statistics of mean, median, and standard deviation. (As a result, I coauthored this free statistics guide for biology teachers, but that’s an entirely different topic.) Yet, even with good training in scientific methodology, my students still struggle to recognize the difference between a good scientific study and a pseudoscientific one: a study that pretends to be good science but isn’t.
A study and its resulting claim can qualify as pseudoscience in many ways. In its simplest form, pseudoscience is an experiment that fails to control variables, other than the manipulated variable (the independent variable), that could influence the response variable (the dependent variable). Thus, a claim made about the result that does not consider the potential effects of the uncontrolled variables is a pseudoscientific claim.
This is an easy mistake to make.
For example, currently in my IB/AP Biology course, students are running experiments that respond to the prompt: Test the effect of an abiotic or biotic factor on seed germination or plant growth. A student of mine hypothesized that glucose is an essential nutrient in seed germination and predicted that increasing concentrations of glucose would increase germination rate. Had she run the experiment with her glucose treatments alongside a treatment of distilled water, she would likely have found the opposite to be true and she may have concluded that glucose does not promote germination but instead suppresses it, regardless of whether the presence of glucose during germination has any effect at all. This would be a pseudoscientific claim—the student failed to control for the fact that while increasing the concentration of a nutrient, she was also increasing the solute concentration of the solution in which her seeds were soaking. Her seeds were being exposed, not just to more glucose, but also to increasingly hypertonic solutions. Her design requires an additional treatment to rule out or exclude the solute concentration variable. She could, for example, run the same test on germination but use a non-nutritive solute like sodium chloride as the variable of increasing solute concentration and increasing hypertonicity.
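The control design described above can be sketched as a treatment table. This is a minimal sketch, not the student's actual protocol: the glucose concentrations are hypothetical, the function names are invented for illustration, and it assumes ideal behavior in which each dissolved glucose molecule counts as one particle and each NaCl unit as two (Na+ and Cl-).

```python
# Idealized particle counts per dissolved formula unit (an assumption;
# real solutions deviate somewhat from these values).
PARTICLES = {"glucose": 1, "NaCl": 2}


def osmolarity_mOsm(solute, conc_mM):
    """Ideal osmolarity in mOsm/L for a single solute at conc_mM."""
    return conc_mM * PARTICLES[solute]


def build_treatments(glucose_levels_mM):
    """Pair each glucose treatment with a NaCl control of equal osmolarity.

    Returns a list of (label, solute, concentration_mM, osmolarity_mOsm) tuples,
    starting with a distilled water baseline.
    """
    treatments = [("distilled water", None, 0.0, 0.0)]
    for g in glucose_levels_mM:
        target = osmolarity_mOsm("glucose", g)
        nacl_mM = target / PARTICLES["NaCl"]
        treatments.append((f"glucose {g} mM", "glucose", float(g), target))
        treatments.append(
            (f"NaCl control for glucose {g} mM", "NaCl", nacl_mM,
             osmolarity_mOsm("NaCl", nacl_mM))
        )
    return treatments


design = build_treatments([50, 100, 200])  # hypothetical concentrations
for label, solute, conc_mM, osm in design:
    print(f"{label:32s} {conc_mM:6.1f} mM  {osm:6.1f} mOsm/L")
```

If germination falls equally in each matched glucose and NaCl pair, solute concentration alone could explain the result. A difference within matched pairs would point to something specific to glucose.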
Knowing to add the solute concentration variable to her seed germination experiment requires the student to employ exclusion reasoning. Exclusion reasoning for most students, and people in general, is a major obstacle and failing to include it is a logical flaw. In a 1993 paper titled, “Science as argument: Implications for teaching and learning scientific thinking,” Deanna Kuhn (Kuhn 1993) explains this problem:
“Exclusion is essential to effective scientific reasoning because it allows one to eliminate factors from consideration. Exclusion (inferring the absence of a causal effect) poses more of a challenge than inclusion (inferring the presence of a causal relationship) for several reasons. First, and most fundamentally, is the domination of affirmation over negation—the presence of something is more salient than its absence and, for this reason, both scientific and lay theories pertain more often to the presence than the absence of causal relations. Second, the belief that a factor is irrelevant often leads subjects to ignore it in their investigations. In so doing, they forego the possibility of encountering disconfirming evidence and, hence, ever revising this belief.”
Forgoing and even purposefully avoiding “the possibility of encountering disconfirming evidence” is doing pseudoscience.
The Future Leaders We Are Training Must Be Critical Thinkers
Many of the students in my IB/AP Biology course are interested in careers in the health sciences, and many of those students are interested in medical science. Pseudoscientific claims are common in studies involving humans and their health and wellbeing. Recently vox.com published an article titled, “The one chart you need to understand any health study,” and provided this helpful graphic:
All of the study types in the graphic above are at risk of making pseudoscientific claims by leaving out necessary components of scientific methodology, but that risk increases as the strength of conclusions decreases. Pseudoscientific claims are also nearly guaranteed if a study, regardless of where it lands on the list above, fails at exclusionary thinking.
Toward a Cure of Pseudoscientific Thinking by way of Acupuncture and CAM
In a course on the science of biology, studying acupuncture in particular, and complementary and alternative medicine (CAM) in general, as evidence-based medical science has the potential to cure students of pseudoscientific thinking because both fail at exclusionary thinking.
The evidence is clear that acupuncture outperforms no treatment in many cases (Witt et al. 2006). No contest, really. The anecdotal stories told by people who have experienced acupuncture are salient and persuasive and, if generalizable, can solve your depression, cure your back pain, help you get pregnant, and improve your golf game. But anecdotes, while compelling stories, are not admissible in science.
The evidence is also clear that the effectiveness of acupuncture as a real, science-based medical treatment ends at anecdotes and the acupuncture versus no treatment “studies.” Appealing to stories and only comparing a treatment to no treatment are pseudoscience in the arena of research on humans, especially pain research.
So, what happens when we study acupuncture using the methodology of real and rigorous science?
We have done this.
When acupuncture is compared with sham acupuncture, where patients think they are getting real acupuncture but aren’t (they’re getting poked and prodded at non-acupuncture sites, with or without needle penetration), there is no statistically detectable difference between the two (Madsen et al. 2009). Indeed, in one study, even when a toothpick was used instead of an acupuncture needle, there was no difference in perceived benefit (Cherkin et al. 2009). Whenever a sham (the fictitious, artificial treatment) in a medical experiment produces a benefit indistinguishable from that of the intended treatment, the benefit is attributed to the placebo effect.
The placebo effect is a real, measurable biological phenomenon. It has been shown to lower respiration rate (Benedetti et al. 1998), lower blood pressure (Pollo et al. 2003), and even improve motor performance in Parkinson’s patients (Goetz et al. 2000). Claims and studies that fail to acknowledge or control for the placebo effect are pseudoscience.
However, the placebo effect also confounds medical science, especially when scientists try to make sense of the results of trials of drugs being developed for pain mitigation. For example, researchers have found that anywhere from 27% to as high as 56% of subjects in pain studies responded to placebo treatment when compared to no-treatment controls (Price et al. 2008). The ability to predict which individuals are more or less likely to respond to placebo would be a critical tool for pain studies, but has been impossible… until recently.
In September of last year, the journal Science reported that specific genetic markers are being discovered in patients that respond more strongly than others to placebo treatments like sugar pills (Hall and Kaptchuk 2013, Servick 2014). If the potential for placebo can be reduced in a drug trial, smaller patient sample sizes can be used and drug trial costs will go down.
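The link between placebo response and sample size can be illustrated with the standard normal-approximation formula for comparing two proportions. This is only a sketch under stated assumptions: the 70% drug response rate is hypothetical, the function name is invented for illustration, and the 27% and 56% placebo rates are borrowed from the range quoted above purely as example inputs.

```python
import math
from statistics import NormalDist


def n_per_arm(p_drug, p_placebo, alpha=0.05, power=0.80):
    """Approximate patients needed per arm to detect p_drug vs p_placebo.

    Uses the normal approximation for a two-sided test comparing two
    independent proportions with equal group sizes.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_drug * (1 - p_drug) + p_placebo * (1 - p_placebo)
    effect = p_drug - p_placebo
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical drug that helps 70% of patients, tested against
# a low and a high placebo response rate.
for p_placebo in (0.27, 0.56):
    print(p_placebo, n_per_arm(0.70, p_placebo))
```

Under these assumptions, the trial facing the higher placebo response needs roughly ten times as many patients per arm, which is why screening out strong placebo responders could lower trial costs.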
But this discovery doesn’t fix the pseudoscientific problem for acupuncture. Nor does it fix another problem with acupuncture: the nocebo effect.
The nocebo effect is when a treatment actually does harm rather than good, and alarmingly, there is also no difference in the nocebo effect of real acupuncture compared to sham (Koog et al. 2014). Reports of increased pain or patients dropping out of an acupuncture trial because of unbearable discomfort are common, as are reports of injury from acupuncture treatment. Less common is death, yet there have been five confirmed cases in the scientific literature (Ernst et al. 2011).
One response to the placebo dilemma by the CAM community is the argument that “acupuncture cannot be studied using randomized controlled trials,” because CAM, including acupuncture, treats systems that are just too complex for randomized and controlled trials to be an appropriate test of their effectiveness (Langevin et al. 2011). If this is true, then so far acupuncture cannot claim to be medical science at all because it is impossible to do any fair empirical tests of CAM’s hypotheses. The best bet for acupuncture, and CAM in general, to become a legitimate, science-based approach is to climb the ladder of strength for health science studies.
Cosmologist Carl Sagan made famous the saying, “Extraordinary claims require extraordinary evidence,” and acupuncture continues to make extraordinary claims with little, if any, evidence. Indeed, we are told that acupuncture can cure your bursitis, ulcers, laryngitis, leukopenia, shingles, hives, infertility, and tendonitis. But, if these claims aren’t extraordinary enough, acupuncture may also be an alternative to anesthesia during surgery.
I did a Web of Science search for studies that show how acupuncture can in fact be an alternative to anesthesia during surgery. I used the search terms “acupuncture” “surgery” “anesthesia” and “alternative” and got 33 returns. Twelve of the 33 were about using acupuncture to reduce post-operative vomiting, and they are highly cited by other papers, but those other papers are mostly touting the efficacy of using acupressure bracelets to reduce vomiting after anesthesia. The best, published argument I found for using acupuncture in surgery is the post-operative vomiting approach, and the National Institutes of Health (NIH) has a consensus statement that agrees. One paper on the issue of post-operative vomiting mitigation is referred to by the NIH and has been cited 139 times in the peer-reviewed scientific literature since 1999 (Lee and Done 1999).
The closest thing I could find to using acupuncture with real surgery is a case study that describes two men who had acupuncture before surgery for varicocele (an enlargement of the veins within the scrotum) and one who wanted to be circumcised at age 40, but it has never been cited in its 10-year history (Minardi et al. 2004). I suppose, in the arena of case studies, the fact that I fall asleep every time I get a tattoo is equally convincing evidence that an acupuncture-like treatment may mimic anesthesia.
The take-home message for me and for my students is that acupuncture does not reliably perform better than a sham treatment, and the nocebo effect in both acupuncture and sham is real and concerning. Another take-home message is that there is no accepted, physiologically testable explanation for how acupuncture works, if indeed it works at all.
If we are to accept acupuncture as medical science and not pseudoscience, then 1) it must perform reliably in double-blind, placebo-controlled experiments that exclude all other possible explanations, and 2) its mechanism of action must be transparent, quantitative, and based in known human physiology. The burden of proof is on acupuncture.
However, this entire argument against the claims that acupuncture works as a medical science does not falsify the claim that acupuncture has the potential to do something.
Acupuncture and Some CAM Practices Are Better Than Nothing
Could it be that acupuncture and many other CAM treatments are better than nothing? Certainly, if we look at acupuncture like we look at massage, then hands down, literal hands-on approaches do reap huge benefits for the patient in measurable qualitative physiological and psychological ways that traditional medicine cannot. There is also the psychological variable of what a patient hopes and expects the outcome of a therapeutic intervention to be. Indeed, the National Center for Complementary and Integrative Health (NCCIH) at the NIH does warn that “current evidence suggests that many factors—like expectation and belief—that are unrelated to acupuncture needling may play important roles in the beneficial effects of acupuncture on pain.”
CAM promoters and practitioners often cite the NIH NCCIH as evidence for CAM as a real alternative to Western medicine. I have perused the NIH NCCIH site and am fascinated by the Research Spotlights page. It is really well done, and what jumps out at me is how tentative each headline is. For example on the Research Spotlights for 2012 page is the headline, “Meditation or Exercise May Help Acute Respiratory Infections, Study Finds,” (Barrett et al. 2012).
But there are big challenges for studies like this.
How Making Claims in Science Works (Warning: Statistics ahead, continue at your own risk)
The Barrett et al. (2012) study mentioned above lacks a sham group, for example one whose members think they are exercising or meditating but are not. It may simply be a reduction in stress that is behind the measured effect, not the exercise or the meditation. But creating the group to test this hypothesis may be impossible. Again, to come to the strongest conclusions, studies must attempt to exclude all other possible explanations for the observed result. The Research Spotlight summary of the study acknowledges this dilemma and ends with this statement:
“While not all of the observed benefits were statistically significant, the researchers noted that the magnitude of the observed reductions in illness was clinically significant. They also found that compared to the control group, there were 48 percent fewer days of work missed due to acute respiratory infections in the exercise group, and 76 percent fewer in the meditation group. Researchers stated that these findings are especially noteworthy because apart from hand-washing, no acute respiratory infection prevention strategies have previously been proven. The researchers concluded that future studies are needed to confirm these findings.”
Unfortunately, it’s not likely that this exact study will be replicated enough times, or even once, to test their additional hypotheses.
Another very recent study that has received a flurry of positive comments on social media looked at the effect of mindfulness meditation and supportive-expressive therapy on telomere length in breast cancer survivors (Carlson et al. 2014). It is a very sexy study, especially in the new age of epigenetics. However, we have known about the effect that stress has on telomere length for at least a decade (Price et al. 2013). It is a provocative study, but one must read the published article to get the whole story.
The authors summarize the results as follows:
“Using analyses of covariance on a per-protocol sample, there were no differences noted between the MBCR and SET groups with regard to the telomere/single-copy gene ratio, but a trend effect was observed between the combined intervention group and controls (F [1,84], 3.82; P = 0.054; effect size = 0.043); Telomere Length in the intervention group was maintained whereas it was found to decrease for control participants. There were no associations noted between changes in Telomere Length and changes in mood or stress scores over time.”
That’s a lot to digest.
The p-value from their analysis of covariance (ANCOVA) is 0.054. It means that if there were truly no difference between the groups, a “trend” at least this large would still turn up by chance slightly more often than 1 time in 20. That falls short of the conventional 0.05 threshold for statistical significance, so we cannot reject the null statistical hypothesis (H0) that there is no trend. The effect size of 0.043 is tiny and a handful of test subjects (literally four or five individuals) could be entirely responsible for the possible trend. In fairness, though, by concluding there is no real trend, we take the risk of making what is called a Type II error in inferential statistical hypothesis testing: failing to reject the null hypothesis when it is false. There may indeed be something going on with meditation and genetics. But the authors do admit:
“Although the current study is strengthened by randomization and the inclusion of only distressed survivors, it does have several limitations. Chief among these is missing data, which precluded the feasibility of conducting intent-to-treat analyses. The control condition was also therefore small (18 individuals), because twice as many women were randomized to the active intervention groups as to the control condition. Hence, the study would require approximately twice as many participants in each group to detect a change in the T/S ratio of 0.5.”
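The statistics quoted from Carlson et al. (2014) can be checked from the reported F statistic alone. The sketch below rests on two assumptions that the quoted text does not confirm: that the bracketed numbers are the degrees of freedom (1 and 84), and that the reported effect size is a partial eta squared. The function names are invented for illustration. With one numerator degree of freedom, F equals the square of a t statistic, so the p-value is the two-sided tail area of a t distribution, computed here by Simpson's rule using only the standard library.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def f_test_p_value_df1(f_stat, df2, steps=2000):
    """P(F >= f_stat) for an F(1, df2) statistic; steps must be even."""
    t = math.sqrt(f_stat)
    h = t / steps
    # Simpson's rule for the integral of the t density from 0 to t.
    total = t_pdf(0.0, df2) + t_pdf(t, df2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df2)
    central_area = 2 * (h / 3) * total  # P(-t <= T <= t)
    return 1 - central_area


def partial_eta_squared(f_stat, df1, df2):
    """Share of variance attributed to the effect, derived from F."""
    return f_stat * df1 / (f_stat * df1 + df2)


p = f_test_p_value_df1(3.82, 84)
eta = partial_eta_squared(3.82, 1, 84)
print(f"p = {p:.3f}, partial eta squared = {eta:.3f}")
```

Both values should land close to the published 0.054 and 0.043, which suggests the assumptions are reasonable. An effect of that size means the intervention accounts for only about 4% of the variation in the outcome.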
Indeed, we must be tentative with results like these, but most people read and pass along headlines without skepticism. They do not, in fact cannot, dig into the details of the study due to limitations in being able to read technical language and understand statistical results. All we can conclude from this study is that, taken as a whole, the results neither confirm nor refute the utility of meditation.
However, it is undeniable that meditation is incredibly healthful and better than nothing. And, unlike acupuncture, it’s free!
In summary, we can cure our students (our future voters) of pseudoscience and pseudoscientific thinking by exposing them to the claims of practices like acupuncture that masquerade as medical science and by helping them identify and unpack the pseudoscientific assertions of these practices.
Curing our students of pseudoscience may be our most important role as science educators.
Peer-reviewed Literature Cited
Barrett, B., Hayney, M. S., Muller, D., Rakel, D., Ward, A., Obasi, C. N., … & Coe, C. L. (2012). Meditation or exercise for preventing acute respiratory infection: a randomized controlled trial. The Annals of Family Medicine, 10(4), 337-346.
Benedetti, F., Amanzio, M., Baldi, S., Casadio, C., Cavallo, A., Mancuso, M., … & Maggi, G. (1998). The specific effects of prior opioid exposure on placebo analgesia and placebo respiratory depression. Pain, 75(2), 313-319.
Carlson, L. E., Beattie, T. L., Giese‐Davis, J., Faris, P., Tamagawa, R., Fick, L. J., … & Speca, M. (2014). Mindfulness-based cancer recovery and supportive-expressive therapy maintain telomere length relative to controls in distressed breast cancer survivors. Cancer.
Cherkin, D. C., Sherman, K. J., Avins, A. L., Erro, J. H., Ichikawa, L., Barlow, W. E., … & Deyo, R. A. (2009). A randomized trial comparing acupuncture, simulated acupuncture, and usual care for chronic low back pain. Archives of Internal Medicine, 169(9), 858-866.
Ernst, E., Lee, M. S., & Choi, T. Y. (2011). Acupuncture: does it alleviate pain and are there serious risks? A review of reviews. Pain, 152(4), 755-764.
Goetz, C. G., Leurgans, S., Raman, R., & Stebbins, G. T. (2000). Objective changes in motor function during placebo treatment in PD. Neurology, 54(3), 710-710.
Hall, K. T., & Kaptchuk, T. J. (2013). Genetic biomarkers of placebo response: what could it mean for future trial design?. Clinical Investigation, 3(4), 311-313.
Koog, Y. H., Lee, J. S., & Wi, H. (2014). Clinically meaningful nocebo effect occurs in acupuncture treatment: a systematic review. Journal of Clinical Epidemiology.
Kuhn, D. (1993). Science as argument: Implications for teaching and learning scientific thinking. Science Education, 77(3), 319-337.
Langevin, H. M., Wayne, P. M., MacPherson, H., Schnyer, R., Milley, R. M., Napadow, V., … & Hammerschlag, R. (2011). Paradoxes in acupuncture research: strategies for moving forward. Evidence-Based Complementary and Alternative Medicine, 2011.
Lee, A., & Done, M. L. (1999). The use of nonpharmacologic techniques to prevent postoperative nausea and vomiting: a meta-analysis. Anesthesia & Analgesia, 88(6), 1362-1369.
Madsen, M. V., Gøtzsche, P. C., & Hróbjartsson, A. (2009). Acupuncture treatment for pain: systematic review of randomised clinical trials with acupuncture, placebo acupuncture, and no acupuncture groups. BMJ, 338.
Minardi, D., Ricci, L., & Muzzonigro, G. (2004). Acupunctural reflexotherapy as anaesthesia in day-surgery cases. Our experience in left internal vein ligature for symptomatic varicocele and in circumcision. Arch Ital Urol Androl, 76(4), 173-4.
Pollo, A., Vighetti, S., Rainero, I., & Benedetti, F. (2003). Placebo analgesia and the heart. Pain, 102(1), 125-133.
Price, D. D., Finniss, D. G., & Benedetti, F. (2008). A comprehensive review of the placebo effect: recent advances and current thought. Annu. Rev. Psychol., 59, 565-590.
Price, L. H., Kao, H. T., Burgers, D. E., Carpenter, L. L., & Tyrka, A. R. (2013). Telomeres and early-life stress: an overview. Biological Psychiatry, 73(1), 15-23.
Servick, K. (2014). Outsmarting the placebo effect. Science, 345(6203), 1446-1447.
Witt, C. M., Jena, S., Selim, D., Brinkhaus, B., Reinhold, T., Wruck, K., … & Willich, S. N. (2006). Pragmatic randomized trial evaluating the clinical and economic effectiveness of acupuncture for chronic low back pain. American Journal of Epidemiology, 164(5), 487-496.