Reading the Mindreading Studies
Current Debates About fMRI Research Methods Bear on Policy Questions
Over the last decade, Functional Magnetic Resonance Imaging, or fMRI, has become an indispensable tool for cognitive neuroscientists who explore the brain processes underlying human behavior. Some particularly fertile areas of fMRI research have been in social cognition, personality, and emotion; work in these disciplines has garnered millions of dollars in federal grant support and has generated wide interest from scientists, policymakers, and the public.
Part of the appeal of this research is that it often involves investigation of how the brain responds to familiar emotional stimuli and social conditions. Yet questions persist about what fMRI can really tell us about how the brain works, and the research has implications for a variety of issues ranging from brain-reading devices to the ethical use of brain enhancement technologies.
Importantly, fMRI is no longer just a medical or academic tool used to diagnose disease or learn about basic cognitive functions in the brain. It is now widely viewed, and many would say mistakenly, as a potential way to solve problems in court and in the interrogation room—by helping discern what individuals are thinking. How we balance the inherent technological drawbacks of fMRI research and the ethical minefields of its application with the potential for profound discoveries about how the mind works promises to be a point of great contention in the future.
fMRI research focused on the relationship between mental states and behavior abounds. A recent study demonstrated that lonely or socially isolated individuals, when shown images of people in pleasant settings, had much lower activation in a reward center of the brain, the ventral striatum, than non-lonely people. And Science recently published an article showing the activation of brain reward centers in subjects witnessing the misfortune of an envied competitor and activation in punishment centers when they saw an envied competitor with a valuable object. The study argued that brain regions responding to feelings of envy and schadenfreude are also those that respond to, respectively, physical pain (envy hurts) and reward/pleasure (schadenfreude feels good).
The most common form of fMRI measures the “blood oxygen level dependent,” or BOLD, signal in the brain, which results from the differing flow of oxygenated and deoxygenated blood through the brain. The MRI scanner can distinguish brain areas receiving oxygenated blood from those receiving deoxygenated blood because the blood protein hemoglobin has different magnetic properties when it carries oxygen than when it does not. Why is it important to know which brain areas are receiving more oxygenated blood than others? Because in a process called the hemodynamic response, blood supplies oxygen to active, “thinking” neurons at a greater rate than to inactive neurons. Using complex statistical methods, researchers can evaluate which areas of the brain are consistently receiving more oxygenated blood (a high BOLD signal), thereby revealing which areas of the brain are “active” during the specific thoughts or sensory experiences induced by researchers.
But lately some have challenged the validity of fMRI as a tool for drawing these connections between thoughts/experiences and brain activation. Massachusetts Institute of Technology graduate student Ed Vul recently published a paper called “Puzzlingly High Correlations in fMRI Studies of Emotion, Personality, and Social Cognition.” In it, he argues that a large proportion of fMRI studies in fact utilize spurious and biased statistical processes, resulting in impossibly high correlations between assessments of individual emotional or personality differences and associated brain region activation.
How to Read the Numbers
fMRI is a decidedly indirect measure of brain activity, as it does not measure “thinking” processes or even neural changes directly, but merely oxygenated blood flow. Scientists have even discovered that blood flow through astrocytes, glial cells that are thought to play a largely supportive role in the brain, is the main source of the fMRI signal, rather than blood flow to neurons. In other words, the BOLD signal may not be the unquestionably valid representation of cognitive processes that researchers sometimes claim it is.
The dominant methodology utilized in fMRI research involves scouring the brain for statistically significant correlations between the BOLD signal and a specific emotion, behavior, or disease state—envy, loneliness, “right-handedness,” schizophrenia, etc. Researchers divide the brain into tiny three-dimensional pixels called “voxels” and examine them for significant increases in oxygenated blood flow correlated with the stimuli. Those areas that show significant correlations are plotted onto a structural image of the brain as a functional map of the brain-behavior association.
In his paper, Vul highlights a number of issues with fMRI studies, most prominently the existence of misleadingly high, or “voodoo,” correlations between brain signal and individual behavioral differences, like personality and emotion. He claims that these spuriously high correlations are the result of a non-independence error. He explained this error to science writer Jonah Lehrer in Scientific American:
When researchers want to determine which parts of the brain are correlated with certain aspects of behavior, they must somehow choose a subset of thousands of voxels [to study]. One tempting strategy is to choose voxels that show a high correlation with this behavior. So far this strategy is fine.
The problem arises when researchers then go on to provide their readers with a quantitative measure of the correlation magnitude measured just within the voxels they have pre-selected for having a high correlation. This two-step procedure is circular: it chooses voxels that have a high correlation, and then estimates a high average correlation. This practice inflates the correlation measurement because it selects those voxels that have benefited from chance, as well as any real underlying correlation, pushing up the numbers.
In other words, Vul argues that when researchers select voxels that exhibit a high correlation between oxygenated blood flow and response to stimuli, they are choosing both voxels whose high correlation is due to chance and voxels that really do exhibit the correlation. The non-independence error results when researchers then use these high-correlation voxels to estimate a high average correlation across the whole brain. That is, the analysis (Does the brain have a high average correlation with a specific individual difference?) is not independent of the initial selection criteria (Which voxels exhibit a high correlation with a specific individual difference?).
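The circularity Vul describes can be illustrated with a small simulation. The sketch below is not taken from Vul's paper or from any of the studies he criticizes; it is a minimal toy model under stated assumptions: every "voxel" is pure random noise with no true relationship to behavior, and the voxel count, group size, and selection threshold (5,000 voxels, 16 subjects, r > 0.6) are hypothetical values chosen only for illustration. Voxels are selected in one group of subjects, and the correlation is then measured both in that same group (the circular estimate) and in a second, independent group.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate(n_voxels=5000, n_subjects=16, threshold=0.6, seed=0):
    """Return (circular_mean_r, independent_mean_r, n_selected).

    All values are hypothetical: voxel signals are pure noise, so the
    true brain-behavior correlation is zero everywhere.
    """
    rng = random.Random(seed)

    # Behavioral scores (e.g. a personality measure) for two independent
    # groups of subjects: one used for selection, one held out.
    behavior_a = [rng.gauss(0, 1) for _ in range(n_subjects)]
    behavior_b = [rng.gauss(0, 1) for _ in range(n_subjects)]

    circular = []     # r measured in the same data used for selection
    independent = []  # r measured for the same voxels in held-out data

    for _ in range(n_voxels):
        signal_a = [rng.gauss(0, 1) for _ in range(n_subjects)]
        signal_b = [rng.gauss(0, 1) for _ in range(n_subjects)]

        r_a = pearson(behavior_a, signal_a)
        if r_a > threshold:  # step 1: pick voxels with a high correlation
            circular.append(r_a)  # step 2 (circular): report that same r
            independent.append(pearson(behavior_b, signal_b))

    n_selected = len(circular)
    if n_selected == 0:
        return None, None, 0
    return (sum(circular) / n_selected,
            sum(independent) / n_selected,
            n_selected)


if __name__ == "__main__":
    circular_r, independent_r, n_selected = simulate()
    print("voxels selected:", n_selected)
    print("mean r, same data used for selection:", circular_r)
    print("mean r, independent data:", independent_r)
```

Because the selected voxels all passed the threshold, the circular estimate is necessarily above 0.6 even though nothing real is there, while the estimate from independent subjects should hover near zero. Real analyses are more involved than this toy model, and the sketch says nothing about how often the criticized studies actually made this error, which is exactly what the two sides dispute.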
The paper’s criticism of fMRI studies and their “voodoo correlations” went over well with many in traditional media as well as in the blogosphere. Several writers claimed that Vul’s work finally revealed that the whole field of fMRI was based on false foundations. Sharon Begley of Newsweek wrote that “psychiatrists and social psychologists [are] enamored [by] fMRI and other brain imaging toys . . . like so many researchers in the social sciences, they have physics envy, and think that the illusory precision and solidity of neuroimaging can give their field some rigor.”
In response to Vul’s claims, brain researchers Matthew D. Lieberman and Elliot T. Berkman from the University of California, Los Angeles, and Tor D. Wager from Columbia University published a detailed reply. They wrote that the non-independence error that Vul claims is the cause of spuriously high correlations in a number of studies actually does not occur. In a subsequent interview with Lehrer, Professor Lieberman responded to Vul’s charge of non-independence, explaining that fMRI researchers are not interested in how the whole brain correlates with a measure of individual difference. Instead they are interested in which specific areas, or “voxels,” in the brain show a significant difference in blood flow in response to stimuli:
[Vul suggests] that we might be interested in whether a psychology or a sociology course is harder and assess this [question] by comparing the grades of students who took both courses. In a comparison of all students, we find no difference in scores. But what if we began by selecting only students who scored higher in psychology than sociology and then statistically compared those? If we used the results of that analysis to draw a general inference about the two courses, this [strategy] would be a non-independence error, because the selection of the sample to test is not independent of the criterion being tested. This [practice] would massively bias the results.
Although Vul is absolutely right that this would be a major error, he’s not describing what we actually do [in social fMRI]. Vul’s example assumes that the question that we are interested in is how the entire brain correlates with a personality measure or responds differently to two tasks. Staying with the grades examples, what social neuroscientists are really doing, however, is something closer to asking, “Across all colleges in the country, are there colleges where psychology grades are higher than sociology grades?” In other words, the question is not what the average difference is across all schools, but rather which schools show a difference. There is nothing inappropriate about asking this question or about describing the results found in those schools where a significant effect emerges.
With whole-brain analyses in fMRI, we’re doing the same thing. We are interested in where significant effects are occurring in the brain and when we find them we describe the results in terms of means, correlations, and so on. We are not cherry-picking regions and then claiming these represent the effects for the whole brain.
Vul responded to this reply with a rebuttal of his own, maintaining that his criticism of the non-independence error still applies, so the debate continues. But we should blindly accept neither Vul’s critiques nor Lieberman, Berkman, and Wager’s responses: fMRI is neither a perfect technology nor a fundamentally flawed one.
fMRI in Court
The utility of this sort of brain research has policy implications because the results of this work might end up in court. Skeptics of the validity of fMRI have expressed their worries about the recent news that, for the first time, defense attorneys submitted results from an fMRI lie-detection test as evidence in a trial, although the lawyers withdrew the evidence in late March. No Lie MRI, a private company, scanned the defendant in the juvenile sex-abuse case and claimed that its test revealed that the abuse did not in fact happen because the defendant’s claim of innocence did not show neural patterns consistent with a lie. No Lie MRI uses fMRI to measure changes in blood flow to the ventrolateral area of the prefrontal cortex, a section of the brain in which several studies have identified activity during lying.
Studies on fMRI lie detection have identified lying with accuracies of 76 percent to over 90 percent. However, many people are suspicious of the reliability of this new technology, and are apprehensive about using it in court. Ed Vul said in a comment in Wired: “I don’t think [fMRI lie detection] can be either reliable or practical. It is very easy to corrupt fMRI data. The biggest difficulty is that it’s very easy to make fMRI data unusable by moving a little, holding your breath, or even thinking about a bunch of random stuff. So far as I can tell, there are many more reliable ways to corrupt data from an MRI machine than a classic polygraph machine.”
Hank Greely, Director of the Center for Law and the Biosciences at Stanford University, has also expressed skepticism about admitting such a young and poorly understood technology. He told Wired that “having studied all the published papers on fMRI-based lie detection, I personally wouldn’t put any weight on it in any individual case. We just don’t know enough about its accuracy in realistic situations.”
Concerns about the use of an untested technology like fMRI lie detection to determine the fate of individuals in our legal system are understandable and appropriate. Before we even consider using fMRI lie detection in the courts, randomized studies with hundreds of participants must demonstrate that the techniques are unequivocally reliable. And this has not happened…yet.
However, with respect to Vul’s critiques of the validity of brain imaging in social neuroscience, the mainstream press and the bloggers should not necessarily trash a well-established and useful tool in both medical research and clinical medicine. fMRI studies, though by no means perfect, have provided remarkable and valuable insight into the important and often nebulous connections between the brain and mind, revealing the extent to which our emotional and social lives emerge from specific biological processes.
As brain science continues to advance, it is vital for researchers and the public alike to step back and rigorously examine the basic techniques and assumptions of fMRI, whether the technology is used in social neuroscience, lie detection, or elsewhere. We don’t want to base our science policy, medical judgment, or court decisions on data that is not fully understood, or perhaps even fundamentally flawed. Nevertheless, the proper response to the Vul study and the development of fMRI lie detection technologies is not to throw up our hands in despair, but to respond with reasoned and thoughtful rebuttals like those outlined earlier, firmly committing ourselves to the improvement of imaging techniques, data interpretation, and experimental design through the continued support of neuroscience research.
Justin Masterman is an intern with the Progressive Bioethics Initiative at the Center for American Progress.