When artificial intelligence makes decisions for people during social interactions, their human partners become less trusting, less fair, and less cooperative, ultimately leading to worse outcomes for everyone involved. But when people are not sure whether an automated program is pulling the strings, they behave much as usual, and quietly rely on the technology themselves. These findings were recently published in the journal PNAS Nexus.
Artificial intelligence programs known as large language models can generate human-like text and answer complex questions. These tools are increasingly integrated into everyday life, helping people draft emails, resolve conflicts, and make choices with social consequences. The general-purpose nature of these text-based models makes them fundamentally different from older algorithms designed for narrow tasks like playing chess or sorting data.
As these conversational algorithms take a more active role in mediating communication, questions arise about how people will respond to a machine acting on behalf of another human being. Online interactions are increasingly text-based, and algorithms are likely to play a larger role in forming relationships. Understanding how people respond to this change is a major goal for behavioral scientists.
Past research has investigated how people interact with specialized systems designed to optimize very specific tasks. However, few studies have examined how artificial intelligence in everyday conversations affects cooperation between real people. Fabian Dvorak, a behavioral researcher at the University of Konstanz in Germany, led a team of economists to investigate this phenomenon.
The research team wanted to observe human behavior when algorithms intervene to make choices that directly affect other people’s incomes. To study this, the researchers used a series of classic economic games. These are standardized scenarios used by economists and psychologists to measure social behaviors such as reciprocity, cooperation, and altruism.
In these exercises, participants make choices that determine how real money is divided between themselves and an anonymous partner. Over 3,000 participants were recruited through an online platform to play five different two-player economic games. One setup was the ultimatum game, in which one person proposes how to split a sum of money and the partner must accept or reject the offer.
If the partner rejects the offer in the ultimatum game, neither side receives any money. This pressures the proposer to offer a fair amount and avoid a complete loss. Another scenario was the trust game, in which money sent to a partner doubles in value. The receiving partner then decides how much of that doubled sum to return to the original sender, testing the receiver's trustworthiness and the sender's willingness to trust.
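For readers who prefer to see the mechanics spelled out, the payoff rules of these two games can be sketched in a few lines of code. The stake sizes and function names below are illustrative placeholders based on the description above, not the exact parameters used in the study.

```python
# Illustrative payoff rules for the ultimatum and trust games described above.
# Stake sizes are placeholders: the article only says that a rejected ultimatum
# offer leaves both players with nothing, and that money sent in the trust game
# doubles in value.

def ultimatum_payoffs(stake, offer, accepted):
    """Return (proposer, responder) payoffs; a rejection wipes out the stake."""
    if not accepted:
        return 0, 0
    return stake - offer, offer

def trust_payoffs(endowment, sent, returned):
    """Return (sender, receiver) payoffs when transfers double in transit."""
    doubled = 2 * sent  # value created by the act of trusting
    return endowment - sent + returned, doubled - returned

# Example: a fair split is accepted; a trusting sender gets half the doubled amount back
print(ultimatum_payoffs(stake=10, offer=5, accepted=True))  # (5, 5)
print(trust_payoffs(endowment=10, sent=10, returned=10))    # (10, 10)
```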
The researchers also included a prisoner's dilemma, an exercise in which both players can cooperate for mutual benefit or betray each other for selfish gain. Another scenario, the stag hunt game, requires players to choose between hunting a large stag together for a high reward or catching a rabbit alone for a guaranteed smaller reward. If one person goes for the stag while the other goes for a rabbit, the stag hunter gets nothing.
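The logic of the stag hunt can likewise be made concrete with a small payoff table. The numbers below are hypothetical and only preserve the ordering described above: hunting the stag together pays best, the rabbit is a safe fallback, and a lone stag hunter walks away with nothing.

```python
# Hypothetical stag hunt payoffs that respect the ordering described above:
# hunting the stag together beats the safe rabbit, and a lone stag hunter gets nothing.
STAG_HUNT = {
    ("stag", "stag"): (10, 10),      # both cooperate on the big prize
    ("stag", "rabbit"): (0, 4),      # the stag hunter is left empty-handed
    ("rabbit", "stag"): (4, 0),
    ("rabbit", "rabbit"): (4, 4),    # both settle for the safe option
}

def stag_hunt_payoff(choice_a, choice_b):
    """Look up the (player A, player B) payoff for a pair of choices."""
    return STAG_HUNT[(choice_a, choice_b)]

# A player who expects the partner, or a delegated AI, to defect picks the safe rabbit
print(stag_hunt_payoff("rabbit", "stag"))  # (4, 0)
```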
Finally, the study included a coordination game. In this game, players select options from a list, such as planets in the solar system, and a larger payout is earned only if both players independently choose the exact same option without prior communication.
In each match, one participant had the option, or was required, to defer the decision to their own instance of ChatGPT. Both participants then received real monetary payments based on the final selections. The researchers varied whether the human partners knew about the algorithm's involvement in the game.
In some situations, the artificial intelligence took over randomly and openly. In other setups, participants could choose to hand their decisions over to the machine. The team then tested situations in which this handoff was either completely transparent or completely hidden from the human partner.
The team also experimented with personalizing the algorithm to a user's specific personality traits. Participants completed a short questionnaire about their preferences, including whether they leaned toward introversion or extroversion and whether they valued truthfulness over conformity. The researchers fed these answers into the initial instructions given to ChatGPT, shaping its subsequent decisions.
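The exact wording of those instructions is not reproduced here, but the general approach of folding questionnaire answers into the model's opening instructions can be sketched as follows. The trait labels, prompt text, and function name are assumptions for illustration, not the researchers' actual materials.

```python
# Illustrative sketch only: the trait labels, prompt wording, and function name
# are assumptions, not the instructions actually used in the study.

def build_personalized_instructions(traits):
    """Fold a participant's questionnaire answers into the model's opening instructions."""
    description = ", ".join(f"{name}: {value}" for name, value in traits.items())
    return (
        "You are making decisions in an economic game on behalf of a participant "
        f"with the following self-reported preferences ({description}). "
        "Choose as this person plausibly would."
    )

# Example usage with made-up questionnaire answers
print(build_personalized_instructions(
    {"disposition": "introverted", "values": "truthfulness over conformity"}
))
```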
The result was a strong negative reaction to the known use of artificial intelligence. Across all five games, participants were less cooperative and showed lower levels of trust when they learned that their partner's decisions were made by ChatGPT. This early hesitation resulted in financial losses for both players.
Participants expected algorithms to be selfish and less fair than human players. As a result, they went on the defensive early in the game. They offered less money in split scenarios and refused to cooperate in team exercises, assuming the algorithm would betray them. In the stag hunt game, participants assumed the machine would not cooperate and were more likely to settle for the safe option of hunting the rabbit.
The irony is that the algorithm itself tended to make cooperative and generous choices. ChatGPT typically chose options that maximized the total payout for both players. The economic losses in these transparent scenarios were caused by human players actively withholding trust.
The coordination game was the only exception. The artificial intelligence rarely chose the most popular or expected answers, leading to coordination failures. This was the only instance in which the machine's own unpredictable choices directly lowered the payout.
When the involvement of artificial intelligence was kept secret, a completely different pattern emerged. When participants were unsure whether they were interacting with a human or a machine, they behaved just as they would toward a human partner. Under these hidden conditions, their economic returns did not decline.
Despite these negative social reactions, people still preferred to use the technology for their own benefit. When given the opportunity to let the machine make a decision, participants did so approximately 40% of the time. Notably, they were more likely to hand over control when the other player would not know a machine was involved.
Making decisions in these games requires cognitive effort and carries the risk of feeling guilty about selfish choices. Delegating to a program relieves the burden of crafting justifications and can keep human players from feeling personally responsible for the outcome.
Tailoring the algorithm to behave like a particular user did not undo the economic damage. Although the decision makers felt that the customized algorithm represented them somewhat better, their human partners still responded negatively. Regardless of its programmed personality, simply knowing that a machine was involved was enough to trigger defensive behavior.
In a secondary test, the team asked independent evaluators to judge which decisions were made by humans and which by the algorithm. When looking only at the final numerical selections, raters were unable to distinguish between the two; the algorithm's choices blended into the range of normal human behavior.
However, evaluators were able to spot the artificial intelligence when they were allowed to read short written justifications for the decisions. The algorithm tended to use technical language focused on potential financial consequences, while human players typically explained their choices in simpler, more casual terms.
The researchers noted that their study had several limitations. Participants were primarily from a specific online platform in the UK, so the findings may not apply to all demographic groups. This study only investigated isolated, one-time interactions between strangers.
In the real world, interacting with the same automated program repeatedly over long periods could lead people to adapt their behavior. Other ways of personalizing a program may also produce different social responses. The current study relied on direct text prompts to shape the behavior of the language model, which is just one way to customize the tool.
The study raises questions about new technology regulations designed to increase transparency. Laws like the European Union's Artificial Intelligence Act require companies to disclose when content is machine-generated. Informing people that algorithms are involved can unintentionally undermine trust and reduce economic cooperation.
Future research will need to explore ways to build public trust in these systems, rather than simply announcing their existence. Until people feel comfortable interacting with machines, mandatory transparency may lead to defensive behavior and less social cohesion.
The study, “Side Effects of the Use of Large Language Models in Social Interaction,” was authored by Fabian Dvorak, Regina Stumpf, Sebastian Fehrler, and Urs Fischbacher.

