On Reddit forums, many users discussing mental health have enthused about their interactions with ChatGPT—OpenAI’s artificial intelligence chatbot, which conducts humanlike conversations by predicting the likely next word in a sentence. “ChatGPT is better than my therapist,” one user wrote, adding that the program listened and responded as the person talked about their struggles with managing their thoughts. “In a very scary way, I feel HEARD by ChatGPT.” Other users have talked about asking ChatGPT to act as a therapist because they cannot afford a real one.

The excitement is understandable, particularly considering the shortage of mental health professionals in the U.S. and worldwide. People seeking psychological help often face long waiting lists, and insurance doesn’t always cover therapy and other mental health care. Advanced chatbots such as ChatGPT and Google’s Bard could help in administering therapy, even if they can’t ultimately replace therapists. “There’s no place in medicine that [chatbots] will be so effective as in mental health,” says Thomas Insel, former director of the National Institute of Mental Health and co-founder of Vanna Health, a start-up company that connects people with serious mental illnesses to care providers. In the field of mental health, “we don’t have procedures: we have chat; we have communication.”

But many experts worry about whether tech companies will respect vulnerable users’ privacy, program appropriate safeguards to ensure that AIs don’t provide incorrect or harmful information, or prioritize treatment aimed at affluent, healthy people at the expense of those with severe mental illnesses. “I appreciate the algorithms have improved, but ultimately I don’t think they are going to address the messier social realities that people are in when they’re seeking help,” says Julia Brown, an anthropologist at the University of California, San Francisco.

A Therapist’s Assistant

The concept of “robot therapists” has been around since at least 1990, when computer programs began offering psychological interventions that walk users through scripted procedures such as cognitive-behavioral therapy. More recently, popular apps such as those offered by Woebot Health and Wysa have adopted more advanced AI algorithms that can converse with users about their concerns. Both companies say their apps have had more than a million downloads. And chatbots are already being used to screen patients by administering standard questionnaires. Many mental health providers at the U.K.’s National Health Service use a chatbot from a company called Limbic to diagnose certain mental illnesses.

New programs such as ChatGPT, however, are much better than previous AIs at interpreting the meaning of a human’s question and responding in a realistic manner. Trained on immense amounts of text from across the Internet, these large language model (LLM) chatbots can adopt different personas, ask a user questions and draw accurate conclusions from the information the user gives them.
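What “adopting a persona” looks like in practice is simpler than it may sound: a developer wraps the model in a fixed instruction that frames every exchange. The sketch below assumes the OpenAI Python client, a placeholder model name and an invented prompt; it illustrates the general pattern only and does not reflect how any company mentioned in this article builds its products.

    # A minimal sketch of persona prompting with a general-purpose LLM.
    # Assumes the OpenAI Python client and an OPENAI_API_KEY environment variable;
    # the persona text and model name are placeholders, not any company's product.
    from openai import OpenAI

    client = OpenAI()

    PERSONA = (
        "You are a supportive listener. Ask open-ended questions, reflect back "
        "what the user says, and never give medical advice."
    )

    def respond(user_message: str) -> str:
        # The system message fixes the persona; the user message carries the conversation.
        completion = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": PERSONA},
                {"role": "user", "content": user_message},
            ],
        )
        return completion.choices[0].message.content

    print(respond("I can't stop worrying about work."))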

As an assistant for human providers, Insel says, LLM chatbots could greatly improve mental health services, particularly among marginalized, severely ill people. The dire shortage of mental health professionals—particularly those willing to work with imprisoned people and those experiencing homelessness—is exacerbated by the amount of time providers need to spend on paperwork, Insel says. Programs such as ChatGPT could easily summarize patients’ sessions, write necessary reports, and allow therapists and psychiatrists to spend more time treating people. “We could enlarge our workforce by 40 percent by off-loading documentation and reporting to machines,” he says.

But using ChatGPT as a therapist is a more complex matter. While some people may balk at the idea of spilling their secrets to a machine, LLMs can sometimes give better responses than many human users, says Tim Althoff, a computer scientist at the University of Washington. His group has studied how crisis counselors express empathy in text messages and trained LLM programs to give writers feedback based on strategies used by those who are the most effective at getting people out of crisis.

“There’s a lot more [to therapy] than putting this into ChatGPT and seeing what happens,” Althoff says. His group has been working with the nonprofit Mental Health America to develop a tool based on the algorithm that powers ChatGPT. Users type in their negative thoughts, and the program suggests ways they can reframe those specific thoughts into something positive. More than 50,000 people have used the tool so far, and Althoff says users are more than seven times more likely to complete it than they are to finish a similar tool that gives canned responses.

Empathetic chatbots could also be helpful for peer support groups such as TalkLife and Koko, in which people without specialized training send other users helpful, uplifting messages. In a study published in Nature Machine Intelligence in January, Althoff and his colleagues had peer supporters craft messages with the help of an empathetic chatbot and found that nearly half the recipients preferred the texts written with the chatbot’s help over those written solely by humans and rated them as 20 percent more empathetic.

But having a human in the loop is still important. In an experiment that Koko co-founder Rob Morris described on Twitter, the company’s leaders found that users could often tell if responses came from a bot, and they disliked those responses once they knew the messages were AI-generated. (The experiment provoked a backlash online, but Morris says the app contained a note informing users that messages were partly written with AI.) It appears that “even though we’re sacrificing efficiency and quality, we prefer the messiness of human interactions that existed before,” Morris says.

Researchers and companies developing mental health chatbots insist that they are not trying to replace human therapists but rather to supplement them. After all, people can talk with a chatbot whenever they want, not just when they can get an appointment, says Woebot Health’s chief program officer Joseph Gallagher. That can speed the therapy process, and people can come to trust the bot. The bond, or therapeutic alliance, between a therapist and a client is thought to account for a large percentage of therapy’s effectiveness.

In a study of 36,000 users, researchers at Woebot Health, which does not use ChatGPT, found that users develop a trusting bond with the company’s chatbot within four days, as measured by a standard questionnaire for therapeutic alliance, compared with the weeks it typically takes with a human therapist. “We hear from people, ‘There’s no way I could have told this to a human,’” Gallagher says. “It lessens the stakes and decreases vulnerability.”

Risks of Outsourcing Care

But some experts worry the trust could backfire, especially if the chatbots aren’t accurate. A phenomenon called automation bias suggests that people are more likely to trust advice from a machine than from a human, even if it’s wrong. “Even if it’s beautiful nonsense, people tend more to accept it,” says Evi-Anne van Dis, a clinical psychology researcher at Amsterdam UMC in the Netherlands.*

And chatbots are still limited in the quality of advice they can give. They may not pick up on information that a human would clock as indicative of a problem, such as a severely underweight person asking how to lose weight. Van Dis is concerned that AI programs will be biased against certain groups of people if the medical literature they were trained on, likely from wealthy, Western countries, contains biases. They may miss cultural differences in the way mental illness is expressed or draw wrong conclusions from how a user writes in a second language.

The greatest concern is that chatbots could hurt users by suggesting that a person discontinue treatment, for instance, or even by advocating self-harm. In recent weeks the National Eating Disorders Association (NEDA) has come under fire for shutting down its helpline, previously staffed by humans, in favor of a chatbot called Tessa, which was not based on generative AI but instead gave scripted advice to users. According to social media posts by some users, Tessa sometimes gave weight-loss tips, which can be triggering to people with eating disorders. NEDA suspended the chatbot on May 30 and said in a statement that it is reviewing what happened. 

“In their current form, they’re not appropriate for clinical settings, where trust and accuracy are paramount,” says Ross Harper, chief executive officer of Limbic, regarding AI chatbots that have not been adapted for medical purposes. He worries that mental health app developers who don’t modify the underlying algorithms to include good scientific and medical practices will inadvertently develop something harmful. “It could set the whole field back,” Harper says.

Chaitali Sinha, head of clinical development and research at Wysa, says that her industry is in a sort of limbo while governments figure out how to regulate AI programs like ChatGPT. “If you can’t regulate it, you can’t use it in clinical settings,” she says. Van Dis adds that the public knows little about how tech companies collect and use the information users feed into chatbots—raising concerns about potential confidentiality violations—or about how the chatbots were trained in the first place.

Limbic, which is testing a ChatGPT-based therapy app, is trying to address this by adding a separate program that limits ChatGPT’s responses to evidence-based therapy. Harper says that health regulators can evaluate and regulate this and similar “layer” programs as medical products, even if laws regarding the underlying AI program are still pending.
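Limbic has not published how that layer works, but the general idea of constraining a generative model can be illustrated with a simple filter: candidate replies that touch on flagged topics are discarded in favor of clinician-approved text. The phrase list, fallback wording and function below are invented for illustration; this is a sketch of the concept, not Limbic’s implementation.

    # Illustrative sketch of a safety "layer" wrapped around a generative chatbot.
    # The flagged phrases and fallback text are hypothetical, not Limbic's rules.
    FLAGGED_PHRASES = ("lose weight", "stop taking your medication", "skip therapy")

    FALLBACK = (
        "I'm not able to advise on that. It may help to raise it with your "
        "clinician, who knows your situation."
    )

    def filter_response(candidate: str) -> str:
        """Pass the model's reply through only if it avoids flagged phrases."""
        lowered = candidate.lower()
        if any(phrase in lowered for phrase in FLAGGED_PHRASES):
            return FALLBACK
        return candidate

    # Example: a generated reply containing weight-loss advice gets replaced.
    print(filter_response("You could lose weight by cutting 500 calories a day."))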

Wysa is currently applying to the U.S. Food and Drug Administration for its cognitive-behavioral-therapy-delivering chatbot to be approved as a medical device, which Sinha says could happen within a year. Wysa uses an AI that is not ChatGPT, but Sinha says the company may consider generative AIs once regulations become clearer.

Brown worries that without regulations in place, emotionally vulnerable users will be left to determine whether a chatbot is reliable, accurate and helpful. She is also concerned that for-profit chatbots will be primarily developed for the “worried well”—people who can afford therapy and app subscriptions—rather than isolated individuals who might be most at risk but don’t know how to seek help.

Ultimately, Insel says, the question is whether some therapy is better than none. “Therapy is best when there’s a deep connection, but that’s often not what happens for many people, and it’s hard to get high-quality care,” he says. It would be nearly impossible to train enough therapists to meet the demand, and partnerships between professionals and carefully developed chatbots could ease the burden immensely. “Getting an army of people empowered with these tools is the way out of this,” Insel says.

*Editor’s Note (5/26/23): This sentence was edited after posting to correct Evi-Anne van Dis’s current affiliation.