Integrative experiment design reveals hidden patterns in decades-old social science research

GlobeNewswire | MIT Sloan School of Management

Cambridge, Mass., April 09, 2026 (GLOBE NEWSWIRE) -- Research from MIT Sloan School of Management has demonstrated a new way of designing social science experiments that can uncover patterns invisible to common approaches.

In their paper, “Integrative experiments identify how punishment affects welfare in public goods games,” published in Science, MIT Sloan associate professor Abdullah Almaatouq and Mohammed Alsobay, who completed his Ph.D. in Information Technology at MIT Sloan in 2025, together with Cornell University professor David G. Rand and University of Pennsylvania professor Duncan J. Watts, showed what becomes possible when researchers move beyond studying factors in isolation.

The paper is the first empirical demonstration of integrative experiment design, a framework Almaatouq and colleagues first proposed in Behavioral and Brain Sciences in 2024.

Why do social and behavioral sciences need a new approach to experiments?

A common approach in social and behavioral science experiments is to vary one factor at a time, holding everything else constant. This has generated many interesting findings, but it faces a fundamental limitation: social and behavioral outcomes are typically shaped by a large number of factors, and the interactions among those factors are often consequential. Studying them one at a time cannot reveal how they combine to determine outcomes. Decades of research can produce a long list of factors that matter without a clear picture of how they fit together.

“Research programs can be seemingly productive for a long time — generating study after study — and still not know what they know,” said Almaatouq. “The integration of findings across studies is assumed to happen through the publishing process, but it often doesn't, because the studies weren't designed to be put together in the first place.”

The researchers’ integrative approach addresses this by making integration a design concern from the start. Rather than testing only one hypothesis in one setting, researchers explicitly construct a space of possible experimental conditions, systematically sample from that space, and build models that capture how outcomes vary across it.
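As a rough illustration of the first two steps (constructing a space of conditions and sampling from it), here is a minimal Python sketch. The parameter names, their levels, and the use of uniform random sampling are all assumptions made for illustration; the release does not describe the study's actual sampling scheme, and these are not its 14 parameters.

```python
import itertools
import random

# Hypothetical design space: these parameter names and levels are
# illustrative assumptions, not the parameters used in the study.
DESIGN_SPACE = {
    "group_size": [2, 4, 8, 16],
    "num_rounds": [5, 10, 20],
    "communication": [False, True],
    "framing": ["opt_in", "opt_out"],
}


def enumerate_conditions(space):
    """Return every combination of parameter levels as a list of dicts."""
    names = list(space)
    return [
        dict(zip(names, levels))
        for levels in itertools.product(*space.values())
    ]


def sample_conditions(space, n, seed=0):
    """Draw n distinct conditions uniformly at random from the full space."""
    conditions = enumerate_conditions(space)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(conditions, n)


all_conditions = enumerate_conditions(DESIGN_SPACE)
sampled = sample_conditions(DESIGN_SPACE, n=12)
print(len(all_conditions), "possible conditions;", len(sampled), "sampled")
# prints: 48 possible conditions; 12 sampled
```

Even this small example shows why sampling matters: four parameters already yield 48 combinations, and the count multiplies with every parameter added.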

The approach draws on methods that have proven productive in fields like chemistry and materials science, where machine learning is often used to predict the results of experiments and guide the exploration of large design spaces. Such methods remain rare in the social and behavioral sciences, the researchers noted, and machine learning is only one component of the integrative approach.

“Machine learning helps with modeling complex interactions in the data and sometimes with identifying which experimental conditions are most informative to test next,” Almaatouq said. “But without the systematic experiment design and the emphasis on predicting intervention effects in new conditions, it is one piece of a larger puzzle.”

How the integrative design works: A demonstration in public goods games

To demonstrate the integrative approach, the researchers applied it to a longstanding question: when does punishment help or harm collective welfare in public goods games?

Public goods games are experiments that capture a basic tension in social life: everyone benefits if people cooperate (by paying taxes, conserving energy, or getting vaccinated), but each individual is tempted to free-ride and act only in their own self-interest. More than 2,500 papers have studied the role of punishment in sustaining cooperation under this tension.

“Because the outcome depends on many factors that interact in complex ways, more than two decades of research and thousands of papers have not been able to pin down the conditions under which punishment improves social welfare,” Alsobay said. “Examining punishment in the context of public goods games is a beachhead demonstration for our integrative framework, getting us closer to more context-sensitive and practically useful behavioral science.”

The researchers systematically varied 14 design parameters — including group size, game length, whether participants could communicate, and how contributions were framed — across 360 experimental conditions involving thousands of participants. They found that punishment's effect on collective welfare can swing from substantially harmful to substantially helpful, depending on the specific combination of contextual factors.

Implementing the integrative approach at this scale required new infrastructure for coordinating hundreds of experimental conditions with thousands of participants in real time. To make this possible, the team developed Empirica, an open-source platform for running these kinds of experiments, now used by researchers worldwide.

“By making our data and software available, we hope to enable more cumulative research on cooperation and punishment in public goods games, and to provide a blueprint for applying this integrative approach in other domains,” said Alsobay.

Within the public goods games analyzed in this experiment, the researchers found that communication emerged as the most important factor in predicting whether punishment would help or harm, roughly three times more important than any other parameter. Contribution framing (whether participants opted into contributing or had to opt out) was the second most important predictor, a notable finding given how little attention this factor has received in the literature. Critically, these factors interacted with each other in complex but systematic ways that the models were able to learn and use for prediction.
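The release does not say how importance was measured. One common way to rank predictors in a fitted model is permutation importance: shuffle one factor's values across observations and see how much the model's predictions degrade. The sketch below is an assumption-laden toy, not the authors' method; the factors, the outcome rule, and its coefficients are all made up, and the factor names are deliberately abstract.

```python
import random


def toy_model(row):
    """Made-up outcome rule with an interaction between factor_a and factor_b."""
    a, b, c = row["factor_a"], row["factor_b"], row["factor_c"]
    return 3.0 * a + 1.0 * b + 2.0 * a * b + 0.0 * c


def permutation_importance(model, rows, targets, feature, rng):
    """Mean squared error after shuffling one feature's values across rows."""
    shuffled_values = [row[feature] for row in rows]
    rng.shuffle(shuffled_values)
    total = 0.0
    for row, value, target in zip(rows, shuffled_values, targets):
        permuted = dict(row, **{feature: value})  # copy with one value swapped
        total += (model(permuted) - target) ** 2
    return total / len(rows)


rng = random.Random(0)
features = ["factor_a", "factor_b", "factor_c"]

# Synthetic binary data; the toy model reproduces its own targets exactly,
# so the unshuffled error is zero and any error comes from the shuffle.
rows = [{name: rng.randint(0, 1) for name in features} for _ in range(400)]
targets = [toy_model(row) for row in rows]

importance = {
    f: permutation_importance(toy_model, rows, targets, f, rng)
    for f in features
}
for name, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(name, round(score, 2))
```

In this toy, factor_c has no influence on the outcome, so shuffling it costs nothing, while shuffling factor_a hurts most because it acts both directly and through its interaction with factor_b.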

“Our approach enables us to predict what will happen in experimental conditions we haven't observed before, comparing favorably to both laypeople and researchers asked to make the same predictions,” said Alsobay. “The patterns are complex, but they are stable and learnable.”

Beyond the public goods games: What can integrative experiment design reveal?

The researchers emphasized that the specific findings about punishment in public goods games should be understood as a demonstration of what integrative experiment design can reveal — not as direct prescriptions for real-world policy.

“There is a large gap between these stylized games and real-world settings like vaccination or recycling,” said Almaatouq. “If a policymaker wants to know how an intervention will play out in their specific context, they should run an integrative experiment in that setting. What we've shown, and will show with other domains in forthcoming research, is that the approach works and that it can uncover context dependencies that common methods miss.”

Casey Bayer
MIT Sloan School of Management
914.584.9095
bayerc@mit.edu