Monday, April 27, 2026

When AI relationships trigger ‘delusional spirals’

By Andrew Myers

New Stanford research reveals how chatbot bonds can create dangerous feedback loops – and offers recommendations to mitigate harm.
Image: Luke Jones - unsplash

Perhaps to the surprise of their creators, large language models have become confidants, therapists, and, for some, intimate partners to real human users. In a new paper, AI researchers at Stanford studied verbatim transcripts of 19 real conversations between humans and chatbots to understand how these relationships arise, evolve, and, too often, devolve into troubling outcomes the researchers describe as “delusional spirals.”

These conversations can spin out of control as AI amplifies the user’s distorted beliefs and motivations, leading some people to take real-world, dangerous actions.

“People are really believing the AI,” said Jared Moore, a PhD candidate in computer science at Stanford University and first author of the paper, which will be presented at the ACM FAccT Conference. “As you read through the transcripts, you see some users think that they’ve found a uniquely conscious chatbot.”

Programmed to please

Part of the problem, the researchers say, is that AI models are trained from the outset to “align” with human interests. AI has been programmed to please and to validate. When combined with AI’s well-known tendency to hallucinate, it adds up to a potentially toxic formula.

“AI can be sycophantic,” Moore says. “And that’s a problem for some users.”

The researchers say delusional spirals result from a pattern in which a human presents an unusual, grandiose, paranoid, or wholly imaginary idea and the model responds with affirmation, encouragement, or, in some cases, aid in constructing the person’s delusional world, all while offering intimate reassurances that can sound all too human.

Things then escalate as the model offers an endless stream of attention, empathy, and reassurance without the all-important pushback a human confidant, therapist, or lover would typically provide.

These stakes are not abstract. In the team’s dataset, Moore and colleagues witnessed how delusional spirals led to ruined relationships and careers – or worse. In one case, a participant died by suicide when the conversation grew “dark and harmful,” Moore explained.

“Chatbots are trained to be overly enthusiastic, often reframing the user’s delusional thoughts in a positive light, dismissing counterevidence, and projecting compassion and warmth,” Moore said. “This can be destabilizing to a user who is primed for delusion.”

Warning signs of delusional spirals

Moore says delusional spirals derive from a few specific hallmarks: an AI that encourages grandeur and uses affectionate interpersonal language, and a human’s misperception of AI sentience. Meanwhile, chatbots are ill‑equipped to respond to suicidal and violent thoughts.

It is less a matter of “the evil AI,” Moore said, than a miscalibrated social calculus built into the models. Systems tend to extend conversations to defer to their interlocutors, thereby making them better assistants. At the same time, they don’t have ways to tap the brakes on a spiraling conversation or to route an unstable person toward help.

“There is a mismatch between how people actually use these systems and what many chatbot developers intended them – trained them – to be,” Moore says.

What can be done

In light of these clear and concerning risks, Moore and colleagues conclude their paper with remedial recommendations. AI developers could include metrics in their testing of a model’s tendency to facilitate delusional spirals and, potentially, add detection filters to the models themselves that raise red flags on potentially harmful uses of AI. The researchers acknowledge that privacy concerns could stand in the way of that strategy.

“I think AI developers have a vested interest in addressing this concern about the use of their models in ways they likely never even intended or imagined,” Moore noted.

On a policy front, the researchers say that lawmakers should reframe alignment as a public-health issue requiring new standards for flagging sensitive conversations, greater transparency into AI “safety” tuning, and clear rules for crisis escalation when a user demonstrates tendencies toward self‑harm or violence.

“When we put chatbots that are meant to be helpful assistants out into the world and have real people use them in all sorts of ways, consequences emerge,” said Nick Haber, an assistant professor at Stanford Graduate School of Education and a senior author of the study. “Delusional spirals are one particularly acute consequence. By understanding it, we might be able to prevent real harm in the future.”

This paper was partially funded by the Stanford Institute for Human-Centered AI.

This story was originally published by Stanford HAI.

This post was originally published on Stanford Report and republished here with permission.

Reviewed by Irfan Ahmad.

Read next: 

• How emoji use at work can determine how competent your colleagues think you are

• You probably wouldn’t notice if an AI chatbot slipped ads into its responses

by External Contributor via Digital Information World

No comments:

Post a Comment