Researchers at Graz University of Technology are using virtual reality and large language models to support social skills training for people with autism spectrum disorder. The system aims to make treatment options more widely available.
The number of people affected by autism spectrum disorder (ASD) is increasing worldwide; research indicates that around 1 in 44 children is diagnosed with the condition. A central symptom is so-called “social blindness”: difficulty recognizing the emotions of others and responding appropriately in social situations. Treatment usually relies on one-on-one or small-group support, which is expensive and only available to a limited extent. Researchers at the Institute for Human-Centered Computing at Graz University of Technology (TU Graz) are therefore using computer-game technology to develop an effective supplement that is inexpensive and readily available. Early research suggests that this approach can help people with ASD navigate daily life with more confidence.
Everyday situations without social consequences
The specially developed virtual environment Simville combines virtual reality, large language models (LLMs), speech recognition, and speech generation to make social training location-independent and therefore more accessible to affected people. In this computer world, users train in realistic everyday situations, such as conversations with colleagues at work or meeting people at a cafe. Because this takes place in a controlled environment, users can act freely without fear of social repercussions. The training scenarios prepare them for similar interactions in everyday life.
“Our system is not intended to replace conventional treatments, but to complement and enhance them in a meaningful way,” says Christian Poglitsch of the Institute for Human-Centered Computing at Graz University of Technology, who developed the system as part of his doctoral thesis. An immersive yet playful approach is central to Simville: immediate feedback after completing tasks, storytelling, and varied scenes motivate participants to practice regularly. In addition, the number of stimuli acting on the user can be controlled, so beginners can start with few stimuli, increase the number over the course of training, and reduce it again if they become overwhelmed.
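The graded stimulus control described above can be sketched as a simple controller. This is an illustrative sketch only, not code from Simville; the class name, step sizes, and level range are assumptions.

```python
# Illustrative sketch of graded stimulus control: the number of
# simultaneous stimuli (background noise, bystanders, etc.) grows
# as training progresses and drops again if the user is overwhelmed.
# All names and numeric choices here are hypothetical.

class StimulusController:
    def __init__(self, level: int = 1, max_level: int = 10):
        self.level = level          # current number/intensity of stimuli
        self.max_level = max_level  # upper bound for advanced users

    def on_session_completed(self) -> int:
        """Raise the stimulus level by one after a successful session."""
        self.level = min(self.level + 1, self.max_level)
        return self.level

    def on_overwhelmed(self) -> int:
        """Back off quickly (by two levels) when the user is overwhelmed."""
        self.level = max(self.level - 2, 1)
        return self.level
```

The asymmetric step sizes (slow increase, fast decrease) mirror the idea that training should ramp up gradually but back off promptly when a user is overloaded.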
Language models convey emotions
By integrating an LLM with speech recognition and speech generation, users can converse naturally with the avatars in the game world. What is said is converted to text by a speech recognition system, the large language model generates a context-appropriate response, and the avatar delivers that response as spoken language. The team used Google’s Gemini 12B model to generate the responses.
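The conversation loop described above (speech in, LLM response, speech out) can be sketched as follows. This is a minimal illustration of the pipeline, not Simville code; the three stage functions are placeholders standing in for real speech-recognition, LLM, and text-to-speech components.

```python
# Hypothetical sketch of a single conversation turn: transcribe the
# user's speech, ask an LLM for a contextual reply, and synthesize
# the reply as audio for the avatar. All names are illustrative.

def transcribe(audio: bytes) -> str:
    """Placeholder for a speech recognition system."""
    return "Hello, how was your weekend?"

def generate_reply(utterance: str, history: list[str]) -> str:
    """Placeholder for an LLM call that sees the conversation history."""
    return f"It was lovely, thanks for asking! You mentioned: {utterance}"

def synthesize(text: str) -> bytes:
    """Placeholder for text-to-speech output."""
    return text.encode("utf-8")

def conversation_turn(audio_in: bytes, history: list[str]) -> bytes:
    text = transcribe(audio_in)          # speech -> text
    history.append(text)                 # keep context for the LLM
    reply = generate_reply(text, history)
    history.append(reply)
    return synthesize(reply)             # text -> avatar speech
```

Keeping a running history list is one simple way to give the model the conversational context the article mentions; a production system would also manage turn-taking and latency.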
“What is interesting is that the models can also convey certain emotions. Depending on what is said, the right undertone can be heard.”
Christian Poglitsch, Institute for Human-Centered Computing, Graz University of Technology
Participants feel more confident
Early research shows that training with Simville has positive effects: in a study of 25 participants, many felt considerably more confident in social situations after just a few sessions. Simville is currently part of the international ETAP project led by Furtwangen University, in which the simulation is combined with extensive sensor technology to increase or decrease the intensity of the experience based on the user’s responses. In addition, the Game Lab Graz at Graz University of Technology wants to make Simville available as a demonstrator so that affected people can train on their own.

