Feb 25, 2020

The Stanford Marshmallow Experiment Was Wrong: Here’s Why and How Open Science Can Help

Funders and researchers should embrace open science principles in their pursuit of scientific breakthroughs.

By Andrew Serazin

This article was written in collaboration with Dawid Potgieter.

One of the best-known social science experiments is the “Stanford marshmallow experiment.” Psychologists Walter Mischel and Ebbe Ebbesen conducted a simple experiment that supposedly measured self-control in children and showed how delayed gratification predicted later success in life. They first published their results in the Journal of Personality and Social Psychology in 1970, and the marshmallow experiment went on to be enormously influential in the psychology community and embraced by the public. The only problem is that the experiment didn’t actually prove what it claimed.

The marshmallow experiment was simple: The researchers would give a child a marshmallow and then tell them that if they waited 15 minutes to eat it they would get a second one. Subsequent research over the next several decades indicated that the children who delayed gratification fared better later in life, with higher test scores and lower BMI, for instance. These findings permeated the popular press, the original publication from 1970 was cited more than 250 times in scientific journals over the years, and the researchers won awards for their work.

Yet it turns out that a growing body of follow-up research didn’t actually support the original findings. In 2018, a new study showed a much smaller correlation between delayed gratification in children and success later in life. Where the initial marshmallow experiment had been conducted with just 90 children, all pupils at the Stanford University preschool, the new study tried to reproduce the same results with a much larger pool of 900 children drawn from a range of socioeconomic backgrounds. Ultimately, the researchers failed to replicate the results of the famous marshmallow experiment; rather, their results indicate that socioeconomic background was the determining factor behind both delayed gratification and later success in life.
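
To see why the jump from 90 to 900 children matters, it helps to look at how much uncertainty surrounds a correlation measured in a sample of each size. The sketch below uses the standard Fisher z-transformation to approximate a 95% confidence interval. The correlation of 0.3 is a made-up value chosen purely for illustration, not a figure from either study, and the function name is hypothetical.

```python
import math

def correlation_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation r
    observed in a sample of size n, using the Fisher z-transformation."""
    z = math.atanh(r)                 # transform r to an approximately normal scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the transformed scale
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper

# Illustrative only: r = 0.3 is an assumed value, not taken from either study.
for n in (90, 900):
    lower, upper = correlation_ci(0.3, n)
    print(f"n={n}: interval from {lower:.2f} to {upper:.2f}")
```

With the smaller sample, the plausible range for the true correlation is several times wider, which is one reason a striking result from a small group may shrink when the study is repeated at scale.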

How is it possible that it took decades for researchers to try to reproduce one of the highest-impact pieces of social science of the last 50 years? Mischel and Ebbesen had long argued that larger sample sizes might yield different results, yet no one reproduced the experiment. Preventing similar quagmires in the future could vastly improve the quality of science and the likelihood of breakthroughs. Certainly Mischel and Ebbesen weren’t being duplicitous in their conclusions, nor were the editors of the journal out to lunch when they published the research. Rather, the Stanford marshmallow experiment is a prime example of what some now call a replication crisis plaguing some scientific disciplines.

At Templeton World Charity Foundation (TWCF), we are committed to interdisciplinary research on what it means to be human. We support scientists in fields such as psychology, social science and neuroscience who seek to unravel some of the biggest mysteries about human nature to help people and societies flourish. By necessity, these researchers sometimes study abstract concepts and complex processes. The experimental data can also sometimes support multiple interpretations, so it’s easy to make mistakes and difficult to spot them.

Historically, the status quo often did not require scientists to share their data, or required sharing only data that had already been sorted and analyzed by researchers. This makes it harder for other researchers to reproduce experiments and to understand the implicit biases or hidden assumptions that may be baked into findings.

In the case of the marshmallow experiment, for instance, different researchers could have drawn radically different conclusions from the same data. Moreover, the success of research is often judged by its impact (the number of citations in other publications, for example) rather than by the overall quality of the data. While citation count is an understandable proxy for quality, it can also become a distraction in a hyper-competitive environment. Such metrics can pressure scientists to generate exciting results and can encourage behaviors such as HARKing (hypothesizing after the results of a study are known, rather than before) or cherry-picking data.
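
A little arithmetic shows why cherry-picking is so tempting and so misleading. The sketch below assumes a researcher checks several independent outcomes, none of which has any real effect, and uses the conventional 5% significance threshold; the function name and the outcome counts are hypothetical, chosen only for illustration.

```python
def chance_of_false_positive(num_outcomes, alpha=0.05):
    """Probability that at least one of num_outcomes independent tests
    comes out 'significant' by chance alone when no real effect exists."""
    return 1 - (1 - alpha) ** num_outcomes

# Illustrative outcome counts; independence between tests is an assumption.
for k in (1, 5, 20):
    print(f"{k} outcomes tested: {chance_of_false_positive(k):.0%} chance of a spurious finding")
```

Under these assumptions, testing 20 outcomes yields roughly a 64% chance of at least one spurious "finding," which is why stating in advance which outcome will be examined matters so much.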

The replication challenges we have seen undermine trust in science and hamper breakthrough discoveries. Funding agencies can be a big part of the solution to this problem. TWCF — along with other major organizations in the Open Research Funders Group — is putting the promotion of open science principles at the core of its grant-making.

Open science principles aim to encourage clarity in scientific research and lay the groundwork for scientific findings that can be reproduced and shared openly. By following and promoting these principles, we hope that it will be easier for anyone with an internet connection to access the latest scientific discoveries and to know whether the data can be trusted. These guiding principles are relatively simple:

  1. Register the details of your experiment in advance. This includes disclosure of key methods that you will use to collect and interpret your data. All else being equal, this will allow people to have greater confidence in your findings.
  2. Publish all results, both positive and negative. People need to know what doesn’t work just as much as they need to know what works.
  3. Share your findings with the world. If you want people to benefit from your research then allow them to see it by making it freely available.

In our experience, almost all scientists appreciate these principles, and most non-scientists are surprised when they learn that this isn’t standard practice already. Nonetheless, it can sometimes be very difficult to follow these principles, and in a hyper-competitive environment they often take a back seat. To really follow these principles, researchers have to go the extra mile, and funders need to provide better incentives for doing so.

Our own funding priorities include big questions at the core of human nature. What aspects of our intelligence are distinctive? How can we use technology to aid moral decision making? How can we fully harness the powers of the mind? In research on questions like these, poor data collection methods can be easily hidden if other researchers do not have access to raw data and methodologies. Studies that return unexpected results can be tweaked, positive results can be published while ambiguous findings are dropped, and correlations and coincidences can easily be passed off as hard findings. By promoting open science principles as a funding institution, we hope to make all of these acrobatics impossible so that our grantees can become a powerful driver of scientific breakthroughs, at a much quicker pace than business as usual.

One initiative where we have been trialling best practices in open science is called Accelerating Research on Consciousness. This is a $20 million commitment to using adversarial collaboration and open science best practices to design and conduct experiments that would not otherwise be possible. Our ultimate goal for this initiative is to reduce the number of scientific theories of consciousness through empirical disproof.

By backing experiments and scientists that practice these principles, TWCF aims to accelerate scientific inquiry and improve the quality of science simultaneously by rewarding scientists for sharing data, reproducing experiments and not being shy about publishing inconclusive, or even negative, findings. Open science principles can help create greater credibility in the field and accelerate scientific progress.