A CUTTING-EDGE approach that integrates electronic health records (EHR) with genome-wide association study (GWAS) data significantly improves the ability to predict the progression of autoimmune diseases from their preclinical stages.
Autoimmune diseases such as rheumatoid arthritis and systemic lupus erythematosus often show subtle signs before diagnosis, but predicting which individuals will go on to develop full-blown disease remains a challenge. Most biobanks hold data on relatively few cases, making it difficult to build reliable polygenic risk scores (PRS) from genetic information alone.
To address this, researchers from Penn State and collaborators developed a novel method called the Genetic Progression Score (GPS). This model leverages genetic insights from both EHR-linked biobanks and larger case-control GWAS datasets. Using penalised regression, GPS integrates existing PRS as priors, modifying them only when doing so improves prediction accuracy.
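The idea of treating an existing PRS as a prior can be illustrated with a toy example. The sketch below is not the published GPS model: it swaps in a simple ridge-style penalty that pulls the fitted weights towards a prior weight vector, and every name, the simulated data, and the continuous outcome are assumptions made for illustration only. A large penalty leaves the prior weights essentially unchanged, while a small penalty lets the biobank data override them.

```python
import numpy as np

def fit_shrink_to_prior(X, y, beta_prior, lam):
    """Minimise ||y - X b||^2 + lam * ||b - beta_prior||^2 in closed form.

    Hypothetical helper: a ridge-style stand-in for a penalised regression
    that uses prior PRS weights as its shrinkage target.
    """
    n_variants = X.shape[1]
    lhs = X.T @ X + lam * np.eye(n_variants)
    rhs = X.T @ y + lam * beta_prior
    return np.linalg.solve(lhs, rhs)

# Simulated stand-ins for biobank genotypes and a progression outcome.
rng = np.random.default_rng(0)
n_people, n_variants = 200, 10
X = rng.normal(size=(n_people, n_variants))
beta_true = rng.normal(size=n_variants)

# Prior weights, e.g. from a case-control GWAS: close to the truth but noisy.
beta_prior = beta_true + rng.normal(scale=0.3, size=n_variants)
y = X @ beta_true + rng.normal(size=n_people)

# Weak penalty: the biobank data dominate (approaches ordinary least squares).
beta_weak = fit_shrink_to_prior(X, y, beta_prior, lam=1e-6)
# Strong penalty: the fitted weights stay at the prior PRS weights.
beta_strong = fit_shrink_to_prior(X, y, beta_prior, lam=1e6)
```

In practice the penalty strength would be chosen by cross-validation, so that the prior is modified only as far as the held-out prediction accuracy supports.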
Simulation studies showed that GPS outperforms traditional methods, particularly when biobank sample sizes are small or when the genetic correlation between progression and diagnosis is low. In a real-world application, the team used GPS to model the risk of progression from preclinical rheumatoid arthritis and lupus using data from the BioVU biobank, and validated the results in the All of Us cohort.
GPS consistently demonstrated the highest predictive accuracy of the methods compared and the strongest correlation with actual disease progression rates, offering a promising tool for early identification and prevention strategies in autoimmune disease management.
Aleksandra Zurowska, EMJ
Reference
Wang C et al. Integrating electronic health records and GWAS summary statistics to predict the progression of autoimmune diseases from preclinical stages. Nat Commun. 2025;DOI: 10.1038/s41467-024-55636-6.