    Research reveals limitations of large language models in medical diagnosis

    By healthadmin, March 17, 2026

    Artificial intelligence (AI) is rapidly transforming healthcare. AI systems can now detect diabetic eye disease from retinal photographs and analyze CT images for signs of early-stage lung cancer or stroke.

    In hospitals across the country and around the world, specialized algorithms are quietly assisting doctors, prioritizing urgent scans and flagging subtle abnormalities that might otherwise go unnoticed. These task-specific AI tools are often trained on millions of accurately labeled medical images and are increasingly being integrated into real-world clinical settings.

    At the same time, another form of AI, the large language model (LLM), is gaining public attention. Widely accessible systems such as ChatGPT and Claude can analyze both text and images. In theory, these capabilities should suit them to medical tasks, but can general-purpose AI platforms be trusted with medical diagnosis?

    A new study led by Milan Toma, Ph.D., associate professor at the New York Institute of Technology College of Osteopathic Medicine (NYITCOM), suggests otherwise. As reported in the journal Algorithms, Toma and co-authors, including NYITCOM senior development security operations (DevSecOps) engineer Mihir Matalia and medical student Sungjoon Hon, tested the reliability of five of the world's most advanced multimodal LLMs: GPT-5, Gemini 3 Pro, Llama 4 Maverick, Grok 4, and Claude Opus 4.5 Extended.

    The researchers gave each AI model the same CT brain scan, which showed obvious intracranial pathology. The models were then asked to analyze the images as a radiologist would: identify the imaging technique used, the location of the lesion in the brain, the primary diagnosis, key features, and potential alternative diagnoses. Overall, the findings reveal a fundamental diagnostic error rate of 20% across the AI models, along with concerning variability in interpretation and assessment.

    Initially, the models yielded promising results, with all five correctly identifying the images as CT brain scans. Four of the models also detected the key finding: an ischemic stroke in the territory of the left middle cerebral artery. However, one model made the fundamental mistake of misclassifying the stroke as a hemorrhage on the opposite side of the brain. In clinical practice, this error could seriously harm a patient, because ischemic and hemorrhagic strokes require different treatments.
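
The 20% figure follows from the counts reported here: of five models, four reached the correct primary diagnosis and one did not. A minimal sketch of that arithmetic follows. The model labels are hypothetical placeholders, since the article does not say which model made the error.

```python
# Placeholder outcomes: True means the primary diagnosis was correct.
# The labels are hypothetical; the article does not name the model that erred.
outcomes = {
    "model_a": True,
    "model_b": True,
    "model_c": True,
    "model_d": True,
    "model_e": False,
}


def error_rate(results: dict[str, bool]) -> float:
    """Return the fraction of models whose primary diagnosis was wrong."""
    if not results:
        raise ValueError("no results to evaluate")
    wrong = sum(1 for correct in results.values() if not correct)
    return wrong / len(results)


rate = error_rate(outcomes)
print(f"{rate:.0%}")  # 1 wrong out of 5 models
```

With only five models and a single scan, one misdiagnosis shifts the rate by 20 percentage points, so the figure is best read as a count (one in five) rather than a precise estimate.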

    Even among the four AI models that reached the correct diagnosis, the explanations differed widely. Some gave different estimates of when the stroke first occurred. Others disagreed on the differential diagnoses, on which additional brain areas were affected, or on the presence of calcifications.

    Next, the researchers introduced a novel twist: they asked each AI model to score the diagnostic descriptions produced by the other models. This cross-evaluation revealed further discrepancies, with some models grading more harshly than others. One model even believed that the finding indicated a chronic brain abnormality rather than an acute stroke, and therefore systematically deducted points from the other models' responses.

    In recent years, Toma has published more than 30 peer-reviewed studies on AI in medical diagnostics and healthcare and two books on the subject.

    “Our research highlights an important distinction in the AI landscape. Most successful medical AI tools are task-specific algorithms, trained on large datasets of labeled medical images and validated for very specific diagnostic tasks. Large language models, however, are not optimized for diagnostics; they are built for language and conversation. As a result, they produce explanations that sound authoritative even when their underlying interpretations are wrong or contradictory.”

    Milan Toma, Ph.D., Associate Professor, New York Institute of Technology College of Osteopathic Medicine (NYITCOM)

    Toma and his co-authors conclude that the future of healthcare AI is likely to combine specialized diagnostic systems with language models. However, while LLMs may be useful for clinical documentation, summarizing reports, or communicating with patients, oversight by a medical professional remains non-negotiable for all diagnostic interpretations.

    Source:

    New York Institute of Technology

    Journal reference:

    Hon, S., et al. (2026). Chat is not diagnosis: Diagnostic variability and fundamental errors in multimodal LLM interpretation in radiology. Algorithms. DOI: 10.3390/a19030170. https://www.mdpi.com/1999-4893/19/3/170


