According to a new study published March 24 in Radiology, a journal of the Radiological Society of North America (RSNA), both radiologists and multimodal large language models (LLMs) have difficulty distinguishing real X-rays from “deepfake” images generated by artificial intelligence (AI). The findings raise concerns about the risks posed by synthetic medical images and highlight the need for better tools and training to protect the integrity of medical imaging and help medical professionals recognize deepfakes.
A “deepfake” is a video, image, or audio file that looks real but was created or altered using AI.
“Our study demonstrated that these deepfake radiographs were realistic enough to fool radiologists, the most highly trained medical imaging experts, even when they were aware that AI-generated images existed,” said study lead author Michael Tordjman, MD, a postdoctoral fellow at the Icahn School of Medicine at Mount Sinai in New York. “This increases the risk of fraud in lawsuits if, for example, a fabricated fracture cannot be distinguished from a real one. It also poses significant cybersecurity risks if hackers gain access to a hospital’s network and inject synthetic images to manipulate patient diagnoses or cause widespread clinical disruption by undermining the fundamental trustworthiness of digital medical records.”
Study details and image tests
The study involved 17 radiologists from 12 institutions in six countries (United States, France, Germany, Turkey, United Kingdom, and United Arab Emirates). Their experience ranged from trainees to specialists with up to 40 years in practice. A total of 264 X-ray images were examined, evenly split between real scans and AI-generated images.
Participants reviewed two separate, non-overlapping image sets. One mixed real images with ChatGPT-generated X-rays of different parts of the body. The second focused on chest X-rays, half real and half created with RoentGen, an open-source generative AI diffusion model developed by researchers at Stanford Medicine.
Detection accuracy for radiologists and AI
When radiologists were not told that the set contained fake images and were simply asked to evaluate technical quality, they flagged only 41% of the AI-generated X-rays. Once informed that synthetic images were present, their average accuracy in distinguishing real from fake rose to 75%.
Individual performance varied widely: radiologists correctly identified between 58% and 92% of the ChatGPT-generated images. The AI systems showed similar limitations. Four multimodal LLMs (GPT-4o and GPT-5 from OpenAI, Gemini 2.5 Pro from Google, and Llama 4 Maverick from Meta) achieved accuracy rates between 57% and 85%. Even GPT-4o, the same model used to generate the deepfake images, outperformed the others but could not detect all of them.
For the RoentGen-generated chest X-rays, radiologists achieved 62% to 78% accuracy, while the AI models ranged from 52% to 89%.
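For readers curious how such an evaluation might be wired up in practice, the sketch below asks a single multimodal LLM to classify one radiograph as real or AI-generated. The prompt wording, model choice, and file name are illustrative assumptions; the study’s exact protocol is not reproduced here.

```python
# Hypothetical sketch: asking a multimodal LLM whether a radiograph is
# real or AI-generated. Assumes OPENAI_API_KEY is set and a local test
# image "xray.png" exists; the prompt is an illustration, not the study's.
import base64
from openai import OpenAI

client = OpenAI()

with open("xray.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Is this radiograph a genuine X-ray or an AI-generated "
                     "image? Answer with exactly one word: REAL or FAKE."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)  # e.g. "REAL" or "FAKE"
```

Scoring a whole test set is then just a loop over labeled images, counting how often the one-word verdict matches the ground truth.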
Experience does not guarantee detection
The study found no association between radiologists’ years of experience and their ability to identify fake X-rays. However, musculoskeletal radiologists performed significantly better than other subspecialists.
Visual clues in deepfake X-rays
The researchers identified several telltale patterns that can appear in synthetic images.
“Deepfake medical images often look too perfect,” said Dr. Tordjman. “The bones are overly smooth, the spine is unnaturally straight, the lungs are overly symmetrical, the pattern of blood vessels is overly uniform, and fractures appear unusually clean and consistent but are often confined to one side of the bone.”
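As a toy illustration of one of these cues, unusually high left-right symmetry, the sketch below computes a mirror-correlation score for a chest X-ray with NumPy. The threshold and file name are arbitrary assumptions, and this heuristic is not a detector validated by the study.

```python
# Toy heuristic: real chest X-rays are rarely perfectly mirror-symmetric,
# so an unusually high left-right correlation can be a weak red flag.
import numpy as np
from PIL import Image

def lr_symmetry_score(path: str) -> float:
    """Correlation between the left half and the mirrored right half."""
    img = np.asarray(Image.open(path).convert("L"), dtype=np.float64)
    half = img.shape[1] // 2
    left = img[:, :half]
    right_mirrored = img[:, -half:][:, ::-1]  # flip the right half
    return float(np.corrcoef(left.ravel(), right_mirrored.ravel())[0, 1])

score = lr_symmetry_score("chest_xray.png")  # hypothetical file
if score > 0.95:  # arbitrary cutoff for illustration only
    print(f"Suspiciously symmetric (r = {score:.3f}); inspect further.")
```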
Risks and safeguards for medical imaging
The results underscore the serious risks if deepfake X-rays are misused. Fabricated images could be introduced into lawsuits or inserted into hospital systems to alter diagnoses or disrupt treatment.
To mitigate these threats, the researchers recommend stronger digital protections, including invisible watermarks embedded directly in the image and cryptographic signatures tied to the technician at the time of image capture, both of which would help verify authenticity.
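As an illustration of the signature idea, the sketch below signs an image’s raw bytes with an Ed25519 key at capture time and verifies them later, here using Python’s `cryptography` package. Key management, the file name, and the tie to a specific technician are assumptions simplified away for brevity.

```python
# Hypothetical sketch: sign an X-ray's bytes at acquisition, verify later.
# In practice the private key would live in secure hardware on the imaging
# device and be associated with the technician performing the capture.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

def sign_image(image_bytes: bytes) -> bytes:
    """Produce a signature over the raw image bytes at capture time."""
    return private_key.sign(image_bytes)

def verify_image(image_bytes: bytes, signature: bytes) -> bool:
    """Return True only if the image is unchanged since signing."""
    try:
        public_key.verify(signature, image_bytes)
        return True
    except InvalidSignature:
        return False

image = open("chest_xray.dcm", "rb").read()  # hypothetical capture
sig = sign_image(image)
assert verify_image(image, sig)                # authentic original
assert not verify_image(image + b"\x00", sig)  # any alteration fails
```

Because the signature covers every byte, a single-pixel alteration, or a wholesale synthetic replacement, would fail verification.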
The future of AI in medical imaging
“We’re potentially only seeing the tip of the iceberg,” says Dr. Tordjman. “The next logical step in this evolution is AI generation of synthetic 3D images such as CT and MRI. Establishing educational datasets and detection tools is critical now.”
To aid education and awareness, the researchers have released a curated deepfake dataset along with interactive quizzes for training purposes.

