Researchers have built an AI system that predicts your risk of developing more than 1,000 diseases up to 20 years before symptoms appear, according to a study published in Nature this week.
The model, called Delphi-2M, achieved 76% accuracy for near-term health predictions and maintained 70% accuracy even when forecasting a decade into the future.
It outperformed existing single-disease risk calculators while simultaneously assessing risks across the entire spectrum of human illness.
“The progression of human disease across age is characterized by periods of health, episodes of acute illness and also chronic debilitation, often manifesting as clusters of co-morbidity,” the researchers wrote. “Few algorithms are capable of predicting the full spectrum of human disease, which recognizes more than 1,000 diagnoses at the top level of the International Classification of Diseases, Tenth Revision (ICD-10) coding system.”
The system learned these patterns from 402,799 UK Biobank participants, then proved its mettle on 1.9 million Danish health records without any additional training.
Before you get excited about having your own medical predictor: can you try Delphi-2M yourself? Not exactly.
The trained model and its weights are locked behind UK Biobank’s controlled access procedures—meaning researchers only. The codebase for training your own version is on GitHub under an MIT license, so you could technically build your own model, but you’d need access to massive medical datasets to make it work.
For now, this remains a research tool, not a consumer app.
The technology works by treating medical histories as sequences—much like ChatGPT processes text.
Each diagnosis, recorded with the age it first occurred, becomes a token. The model reads this medical “language” and predicts what comes next.
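To make the "diagnosis as token" idea concrete, here is a minimal Python sketch. It is not the Delphi-2M code: the `Event` class, the `tokenize` function, and the three-entry vocabulary are hypothetical names chosen for illustration, and the ICD-10 codes (I10 for hypertension, E11 for type 2 diabetes, C25 for pancreatic cancer) are only there to give the example something to encode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    age_years: float  # age at which the diagnosis was first recorded
    code: str         # diagnosis label, here an ICD-10 code


def tokenize(history, vocab):
    """Sort events by age and map each diagnosis code to an integer token id."""
    ordered = sorted(history, key=lambda e: e.age_years)
    ages = [e.age_years for e in ordered]
    token_ids = [vocab[e.code] for e in ordered]
    return ages, token_ids


# Hypothetical vocabulary; a real one would cover more than 1,000 codes.
vocab = {"I10": 0, "E11": 1, "C25": 2}

# Records can arrive out of order; the tokenizer sorts them by age.
history = [Event(58.5, "E11"), Event(52.0, "I10")]
ages, token_ids = tokenize(history, vocab)
print(ages, token_ids)  # [52.0, 58.5] [0, 1]
```

The model then sees two parallel sequences, one of token ids and one of ages, in place of the words and positions a text model would see.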
With enough data and training, the model can predict the next token (in this case, the next illness) as well as the estimated time before that token appears, that is, how long until a person gets sick if the most likely sequence of events plays out.
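One way to get both a "what" and a "when" out of a single set of model outputs is to treat each candidate diagnosis as having its own event rate, then let the rates compete. The sketch below assumes that setup; the article does not spell out how Delphi-2M models time, so read this as an illustration of the general idea rather than the authors' method. The function names and the per-year rates are invented for the example.

```python
import math
import random


def next_event_distribution(log_rates):
    """Turn per-diagnosis log-rates into next-event probabilities and a mean wait.

    Assumption: each diagnosis occurs at a constant rate (events per year), so the
    wait until the first event of any kind is exponential with the summed rate.
    """
    rates = {code: math.exp(v) for code, v in log_rates.items()}
    total_rate = sum(rates.values())
    probs = {code: r / total_rate for code, r in rates.items()}
    expected_wait_years = 1.0 / total_rate
    return probs, expected_wait_years


def sample_next(log_rates, rng):
    """Draw one (diagnosis, waiting time in years) pair from the competing rates."""
    probs, expected_wait_years = next_event_distribution(log_rates)
    wait = rng.expovariate(1.0 / expected_wait_years)
    codes = list(probs)
    code = rng.choices(codes, weights=[probs[c] for c in codes], k=1)[0]
    return code, wait


# Made-up yearly rates standing in for model outputs.
log_rates = {"I10": math.log(0.10), "E11": math.log(0.05), "C25": math.log(0.001)}

probs, expected_wait_years = next_event_distribution(log_rates)
most_likely = max(probs, key=probs.get)
code, wait = sample_next(log_rates, random.Random(0))
print(most_likely, round(expected_wait_years, 2))
```

With these made-up rates the most likely next diagnosis is I10, and the expected wait until any next event is roughly six and a half years. Sampling repeatedly and appending each result to the history would produce a simulated future health trajectory.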
For a 60-year-old with diabetes and high blood pressure, Delphi-2M might forecast a 19-fold increased risk of pancreatic cancer. Add a pancreatic cancer diagnosis to that history, and the model calculates mortality risk jumping nearly ten thousandfold.
The transformer architecture behind Delphi-2M represents each person’s health journey as a timeline of diagnostic codes, lifestyle factors like smoking and BMI, and demographic data. “No event” padding tokens fill the gaps between medical visits, teaching the model that the simple passage of time changes baseline risk.
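The padding idea can be sketched in a few lines of Python. The spacing rule used here, one "no event" token whenever a gap would otherwise exceed five years, is an assumption made for illustration, as are the function name and the `NO_EVENT` and `SMOKER` labels; the article does not say how Delphi-2M places its padding tokens.

```python
def add_no_event_padding(ages, tokens, max_gap_years=5.0, pad_token="NO_EVENT"):
    """Insert padding tokens so that no gap between neighbours exceeds max_gap_years."""
    if not ages:
        return [], []
    out_ages, out_tokens = [ages[0]], [tokens[0]]
    for age, tok in zip(ages[1:], tokens[1:]):
        # Walk forward from the previous real event in fixed steps,
        # emitting a padding token at each step that falls before the next event.
        t = out_ages[-1] + max_gap_years
        while t < age:
            out_ages.append(t)
            out_tokens.append(pad_token)
            t += max_gap_years
        out_ages.append(age)
        out_tokens.append(tok)
    return out_ages, out_tokens


ages = [40.0, 52.0, 58.5]
tokens = ["SMOKER", "I10", "E11"]
padded_ages, padded_tokens = add_no_event_padding(ages, tokens)
print(padded_ages)    # [40.0, 45.0, 50.0, 52.0, 57.0, 58.5]
print(padded_tokens)  # ['SMOKER', 'NO_EVENT', 'NO_EVENT', 'I10', 'NO_EVENT', 'E11']
```

Because the padded sequence contains explicit "nothing happened" steps, a model trained on it gets examples where the only thing that changed between two positions is age.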