    Health Magazine
    Mental Health

    Modern AI is often judged to be more human-like than actual humans in Turing Test experiments.

    By healthadmin | May 21, 2026 | 8 min read


    Recent research published in Proceedings of the National Academy of Sciences provides evidence that certain modern artificial intelligence systems can pass the standard Turing test. When instructed to adopt specific human personas, these computer programs tricked human judges into thinking they were real people more than half the time. The authors describe this as the first empirical evidence that modern systems can pass this key scientific benchmark, a result that raises deep questions about the future of online communication.

    To fully understand this research, it helps to know a little about large language models (LLMs). These are highly complex computer programs trained on vast amounts of text collected from the internet. They power the popular AI chatbots that many people use today to compose emails, brainstorm ideas, and write software.

    Large language models learn statistical patterns in human language in order to predict the next word in a sequence. This allows them to generate remarkably natural-sounding text in response to user questions.
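
    The idea of predicting the next word from statistical patterns can be illustrated with a toy example. The sketch below is not how a large language model is actually built (real systems use neural networks trained on enormous datasets); it only counts which word most often follows another in a tiny made-up sentence. The function names and the sample text are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        follow_counts[current][following] += 1
    return follow_counts


def predict_next(follow_counts, word):
    """Return the most frequently observed next word, or None if the word was never seen."""
    candidates = follow_counts.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical training text, invented for this illustration.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))
```

    In this toy text, "the" is followed by "cat" twice and "mat" once, so the sketch predicts "cat". A real model makes the same kind of guess, but conditions on far more context than a single preceding word.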

    The researchers who conducted the study, Cameron R. Jones and Benjamin K. Bergen, wanted to see how well these modern models could handle a classic evaluation known as the Turing test. This thought experiment, originally proposed by British mathematician Alan Turing in 1950, provides a way to assess whether a machine can imitate human conversation so well that it is indistinguishable from a real human.

    In the standard three-way version of the test, a human judge converses with two hidden participants at exactly the same time using a text chat interface. One of these hidden participants is a real human and the other is a computer program. If the human judge cannot reliably guess which participant is the machine, the computer is said to have passed the test.
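
    One simple way to make "cannot reliably guess" concrete is to compare how often a system is picked as the human against the 50% expected from random guessing. The sketch below uses an exact two-sided binomial test for that comparison. It is an illustrative assumption, not necessarily the statistical analysis the authors used, and the counts in the example are hypothetical rather than taken from the study.

```python
from math import comb


def win_rate(times_judged_human, total_games):
    """Fraction of games in which the system was picked as the human."""
    return times_judged_human / total_games


def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value against a chance rate p.

    Sums the probability of every outcome that is no more likely
    than the observed one under the chance hypothesis.
    """
    def pmf(k):
        return comb(trials, k) * p**k * (1 - p) ** (trials - k)

    observed = pmf(successes)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    total = sum(pmf(k) for k in range(trials + 1) if pmf(k) <= observed * tolerance)
    return min(1.0, total)


# Hypothetical counts for illustration only (not the study's data).
picked_as_human, games = 73, 100
print(win_rate(picked_as_human, games))
print(binomial_two_sided_p(picked_as_human, games))
```

    A p-value near 1 means the judges' choices look like coin flips. A very small p-value means the rate differs from chance, either because judges reliably spotted the machine or because they reliably mistook it for the human.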

    Jones and Bergen began this study because standard assessments of artificial intelligence tend to be static and narrow. High scores on common benchmark tests may simply reflect memorization of training data rather than adaptability to real-life conversations. The traditional Turing test provides a flexible, interactive environment in which human judges can ask open-ended questions and probe the machine for perceived weaknesses.

    “The Turing test started as a way to ask whether machines could match human intelligence,” said Bergen, a cognitive science professor at the University of California, San Diego. “But we now know that AI can answer many questions faster and more accurately than humans, so the real problem is not one of raw brainpower.”

    “Seeing that a machine can pass a test, and how a machine passes a test, forces you to rethink what it’s measuring,” Bergen says. “Increasingly, we are measuring our humanity.”

    To explore these questions, the scientists set up a series of highly controlled online experiments involving nearly 500 participants in total. They evaluated several different artificial intelligence systems, including GPT-4.5, LLaMa-3.1-405B, and GPT-4o. The scientists also included an older and much simpler chatbot program from the 1960s called ELIZA as a baseline.

    In the first study, researchers recruited 126 undergraduate students from the University of California, San Diego. The second study recruited 158 participants through an online survey platform called Prolific. Human participants were randomly assigned to play the role of judge or human interlocutor.

    To conduct the study, the researchers built an online interface designed to feel like a familiar messaging app. “For the interrogator, there’s a split screen on the computer and he’s questioning both witnesses,” said Jones, who conducted the research while earning his PhD in cognitive science at the University of California, San Diego, and is now an assistant professor of psychology at Stony Brook University. “They know that one of the witnesses is human and one of them is AI.”

    Each judge had exactly five minutes to chat with both participants. When the time was up, the judge had to decide which conversation partner was the real person, rate their trustworthiness, and explain why.

    The researchers tested two different sets of instructions, known as prompts, on the artificial intelligence models. The first prompt simply told the model that it was participating in a Turing test. The second was a persona prompt, which gave the model a detailed character to play and asked it to act like a young introvert familiar with internet culture and modern slang.

    Persona prompts made a big difference in the results. With the persona prompt, GPT-4.5 was identified as human 73% of the time, meaning judges picked it as the human far more often than they picked the actual human participants. When the LLaMa-3.1-405B model was given the same prompt, it was identified as human 56% of the time.

    “What we found is that given the right prompts, advanced LLMs can display the same tone, directness, humor, and fallibility as humans,” Jones says. “We know that LLMs can easily generate knowledge on almost any topic, but in this test we also showed that LLMs can convincingly display social behavioral traits. This has major implications for the way we think about AI.”

    However, without specific persona prompts, the models performed much worse. Without those instructions, GPT-4.5’s win rate dropped to 36%, and LLaMa-3.1-405B’s win rate dropped to 38%. The baseline systems performed poorly overall, with ELIZA selected as human only 23% of the time and GPT-4o selected 21% of the time.

    “They have the ability to look human-like, but they probably have less of an ability to understand what it takes to look human-like,” Bergen said.

    The researchers then conducted a replication study to see if longer conversations helped human judges accurately identify machines. They recruited 205 new participants from Prolific and extended the chat time limit to 15 minutes. This time they tested GPT-5 alongside the LLaMa-3.1-405B model and ELIZA.

    Even with the extra time, human judges still had trouble identifying machines that used persona prompts. The GPT-5 model was identified as human 59% of the time, and LLaMa-3.1-405B was selected 56% of the time. Human judges easily identified the simple ELIZA program, judging it to be human only 17% of the time.

    The researchers analyzed the specific conversational strategies that human judges used to determine who was genuine. Most of the judges tried to make small talk and asked about daily activities, personal opinions, and emotional experiences. Judges tended to assume that participants were human if they made minor spelling errors, appeared to lack knowledge about a particular topic, or responded directly without sounding overly formal.

    “These traits are not the kind of intelligence Turing envisioned for solving mathematical and logical problems,” Bergen said.

    Additionally, the scientists found that university students performed slightly better than online participants. One possible explanation is that the students shared more common ground, such as details of their local campus, which may have helped them probe each other more effectively.

    The authors caution against misinterpreting the meaning of the results. Just because a machine passes the Turing test doesn’t mean it has true human intelligence or consciousness. Rather, it suggests that this machine is very good at conforming to human expectations about how other people chat online.

    This study also has clear limitations. The high success rate of the large language models depended entirely on the specific persona prompts provided by the researchers. Without these detailed instructions, the models were unable to consistently fool judges, showing that they still require human guidance to behave in a convincingly human way.

    Future research could investigate how different types of judges perform on this classic test. Scientists might test whether computer science experts are better at spotting artificial intelligence than the average person. Researchers might also consider whether everyday humans can be trained to recognize machine-generated text over long periods of time.

    This finding has real-world implications for online trust. “It’s relatively easy to make these models indistinguishable from humans,” Jones says. “We need to be more vigilant. When interacting with strangers online, people should be less confident that they are talking to a human being and not an LLM.”

    “For the model, the Turing test is a game of lying,” Jones said. “One of the implications of that is that the model seems to be very good at it.”

    Not being able to tell whether you’re interacting with a human or a bot can have serious implications for everyday people. “There are a lot of people who want to use bots to persuade people to share their Social Security numbers, vote for their party, or buy their products,” Bergen said.

    The study, “Large language models pass a standard three-party Turing test,” was authored by Cameron R. Jones and Benjamin K. Bergen.


