Researchers develop open-source framework for health AI research

A research team led by Columbia University has developed an open-source framework designed to streamline and accelerate artificial intelligence research using medical data, addressing long-standing challenges in data standardization, reproducibility, and interinstitutional collaboration.

This framework, called MEDS, introduces both a standardized data format and a growing ecosystem of interoperable tools aimed at supporting the development and evaluation of machine learning models using clinical data.

A study describing this framework was published in NEJM AI.

Researchers say the framework could help alleviate technical barriers that currently slow health AI research and make it difficult for scientists to reproduce research findings or compare models across studies and institutions.

“MEDS is an easy way to make all different sources of electronic health record (EHR) data look the same to your code, regardless of which hospital, clinic, or EHR software system the data comes from. MEDS allows the sharing of code that can be used to train models in different clinical settings, without having to share sensitive patient data, and often without even having to take the more difficult step of fully ‘harmonizing’ the data into a consistent clinical vocabulary. This infrastructure allows researchers to spend less time rebuilding pipelines and more time answering clinically meaningful questions.”

Dr. Matthew McDermott, Assistant Professor of Biomedical Informatics and Research Leader, Columbia University

Standardizing health data for clinical AI research

Electronic health record data is often stored in facility-specific formats that require extensive pre-processing before being used for AI development. According to the study authors, these inconsistencies can result in significant duplication of effort, limit collaboration, and impede reproducibility.

MEDS addresses these issues by providing a lightweight, extensible standard for representing longitudinal clinical data in machine learning workflows. The framework also includes open-source tools that support data transformation, preprocessing, benchmarking, and model development.
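
To make the idea concrete, the following Python sketch shows how two sites with different export layouts might each be mapped into one shared, long-format event table so that the same analysis function runs on both. It is an illustration under stated assumptions, not the MEDS tooling itself: the site field names, the sample values, the helper functions, and the output column names (`subject_id`, `time`, `code`, `numeric_value`) are all chosen for this example, and the MEDS documentation should be consulted for the actual schema. Each site keeps its own local codes, reflecting the point that full vocabulary harmonization is often not required.

```python
from datetime import datetime

# Hypothetical site-specific exports. Field names and values are invented
# for illustration and do not come from the article or from MEDS.
site_a_rows = [
    {"mrn": "A1", "drawn_at": "2024-01-05T08:30:00", "test": "CREAT", "result": 1.4},
    {"mrn": "A1", "drawn_at": "2024-02-01T10:00:00", "test": "CREAT", "result": 1.6},
    {"mrn": "A2", "drawn_at": "2024-01-20T07:15:00", "test": "HGB", "result": 13.2},
]
site_b_rows = [
    {"patient": 7, "ts": "05/01/2024 09:00", "item": "creatinine_serum", "val": "1.1"},
    {"patient": 7, "ts": "03/01/2024 09:00", "item": "creatinine_serum", "val": "0.9"},
]


def from_site_a(row):
    """Map one site A row to the shared event layout (assumed column names)."""
    return {
        "subject_id": str(row["mrn"]),
        "time": datetime.fromisoformat(row["drawn_at"]),
        "code": row["test"],  # local code kept as-is, not harmonized
        "numeric_value": float(row["result"]),
    }


def from_site_b(row):
    """Map one site B row to the same layout; dates here are day/month/year."""
    return {
        "subject_id": str(row["patient"]),
        "time": datetime.strptime(row["ts"], "%d/%m/%Y %H:%M"),
        "code": row["item"],  # local code kept as-is, not harmonized
        "numeric_value": float(row["val"]),
    }


def latest_value(events, code):
    """Shared analysis code: most recent numeric value per subject for a code."""
    latest = {}
    for event in sorted(events, key=lambda e: e["time"]):
        if event["code"] == code:
            latest[event["subject_id"]] = event["numeric_value"]
    return latest


events_a = [from_site_a(r) for r in site_a_rows]
events_b = [from_site_b(r) for r in site_b_rows]

# The same function runs at both sites; only the local code name differs.
print(latest_value(events_a, "CREAT"))
print(latest_value(events_b, "creatinine_serum"))
```

In this sketch only the two small mapping functions are site-specific. Everything downstream of them can be shared as code, while the patient-level rows stay at the originating institution.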

The authors emphasize that MEDS is specifically designed for AI and machine learning applications and complements, rather than replaces, existing clinical data standards.

This framework aims to support a wide range of use cases in biomedical AI research, including predictive modeling, representation learning, multimodal modeling, and large-scale benchmark studies. The ecosystem is open source, allowing researchers in academia, healthcare, and industry to contribute tools and extensions.

“Great success in AI has always been driven by the ability of communities to come together and collaborate in a decentralized, open-source fashion around tools, model parts, and ultimately an ecosystem that allows us to build bigger models that scale to large datasets,” McDermott said. “These impressive results in MEDS simply reflect the benefits that the community can gain by sharing tools and abstracting common parts of pipelines into shared libraries that can be used with everyone’s data.”

This study also highlights the importance of reproducibility and transparency in medical AI development, as machine learning models increasingly move towards clinical deployment.

Researchers say they hope MEDS will foster broader collaboration across institutions and accelerate innovation in clinical AI, while promoting more transparent and reproducible science. MEDS has already been adopted by 21 institutions in 12 countries.

Source:

Columbia University Irving Medical Center

Journal reference:

McDermott, M. B. A., et al. (2026). MEDS — Emerging data standards and ecosystem for health AI research. NEJM AI. DOI: 10.1056/AIra2501253. https://ai.nejm.org/doi/10.1056/AIra2501253
