Improving prediction models of amyotrophic lateral sclerosis (ALS) using polygenic, pre-existing conditions, and survey-based risk scores in the UK Biobank
Weijia Jin Jonathan Boss Bhramar Mukherjee
Background and objectives Amyotrophic lateral sclerosis (ALS) causes profound impairments in neurological function, and a cure for this devastating disease remains elusive. This study aimed to identify predisposing genetic, phenotypic, and exposure-related factors for ALS using multi-modal data and to assess their joint predictive potential.
Methods Using data from the UK Biobank, we analyzed an unrelated set of 292 ALS cases and 408,831 controls of European descent. Two polygenic risk scores (PRSs) were constructed: “GWAS Hits PRS” and “PRS-CS,” reflecting oligogenic and polygenic ALS risk profiles, respectively. Time-restricted phenome-wide association studies (PheWAS) were performed to identify pre-existing conditions associated with increased ALS risk, and these conditions were integrated into phenotypic risk scores (PheRS). A poly-exposure score (“PXS”) captured the influence of environmental exposures measured through survey questionnaires. We evaluated the performance of these scores for predicting ALS incidence and stratifying risk, adjusting for baseline demographic covariates.
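As a rough illustration of how a polygenic risk score of this kind is commonly computed, the sketch below forms a weighted sum of allele dosages per individual. The function names, toy dosages, and weights are hypothetical and are not taken from the study; scores such as “GWAS Hits PRS” and “PRS-CS” would differ mainly in which variants are included and how the per-variant weights are estimated.

```python
import numpy as np

def polygenic_risk_score(dosages, weights):
    """Return one score per individual as a weighted sum of allele dosages.

    dosages: array of shape (n_individuals, n_variants), values in [0, 2]
    weights: array of shape (n_variants,), per-variant effect sizes
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if dosages.ndim != 2 or dosages.shape[1] != weights.shape[0]:
        raise ValueError("dosages must be (n_individuals, n_variants) "
                         "and match the length of weights")
    return dosages @ weights

def standardize(scores):
    """Center and scale scores so they are comparable across score types."""
    scores = np.asarray(scores, dtype=float)
    return (scores - scores.mean()) / scores.std()

# Toy example (hypothetical values): 3 individuals, 3 variants.
toy_dosages = [[0, 1, 2],
               [2, 0, 1],
               [1, 1, 0]]
toy_weights = [0.1, -0.2, 0.3]
toy_scores = polygenic_risk_score(toy_dosages, toy_weights)
toy_z = standardize(toy_scores)
```

Standardizing each score before combining them is one simple way to put genetic, phenotypic, and exposure scores on a common scale; it is shown here as an assumption, not as the procedure used in the study.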
Results Both PRSs modestly predicted ALS diagnosis, with increased predictive power when combined (covariate-adjusted area under the receiver operating characteristic curve [AAUC]=0.584 [0.525, 0.639]). The PheRS incorporating diagnoses made 1 year before ALS onset (PheRS1) modestly discriminated cases from controls (AAUC=0.515 [0.472, 0.564]). The “PXS” did not significantly predict ALS. However, a model incorporating the PRSs and PheRS1 improved the prediction of ALS (AAUC=0.604 [0.547, 0.667]), outperforming a model combining all risk scores. Individuals in the top 10% of this combined risk score distribution had a fourfold higher ALS risk (95% CI [2.04, 7.73]) than those in the 40%–60% range.
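The risk-stratification comparison above (top 10% versus the 40%–60% range) can be sketched as a simple odds ratio between two quantile-defined groups. This is a minimal, unadjusted illustration: the function name, the continuity correction, and the toy data are assumptions, and the reported estimate additionally adjusts for baseline covariates.

```python
import numpy as np

def top_vs_middle_odds_ratio(score, is_case, top=0.90, middle=(0.40, 0.60)):
    """Unadjusted odds ratio of being a case: top tail vs. middle band of a score."""
    score = np.asarray(score, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)

    # Quantile cut points define the two comparison groups.
    q_top = np.quantile(score, top)
    q_lo, q_hi = np.quantile(score, middle)
    in_top = score >= q_top
    in_mid = (score >= q_lo) & (score < q_hi)

    # 2x2 table counts.
    a = np.sum(is_case & in_top)    # cases in top group
    b = np.sum(~is_case & in_top)   # controls in top group
    c = np.sum(is_case & in_mid)    # cases in middle group
    d = np.sum(~is_case & in_mid)   # controls in middle group

    # Adding 0.5 to each cell (Haldane-Anscombe correction) avoids division
    # by zero when a cell is empty, which is likely with a rare outcome.
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))

# Toy example (hypothetical data): 100 individuals with scores 0..99.
toy_score = np.arange(100)
toy_case = np.zeros(100, dtype=bool)
toy_case[[90, 91, 92, 93]] = True   # 4 cases among the 10 highest scores
toy_case[40] = True                 # 1 case among the 20 middle scores
toy_or = top_vs_middle_odds_ratio(toy_score, toy_case)
```

With only a few hundred cases among several hundred thousand controls, individual cells of such a table can be very small, which is one reason wide confidence intervals are expected for this kind of comparison.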
Discussion By leveraging UK Biobank data, our study uncovers pre-disposing ALS factors, highlighting the improved effectiveness of multi-factorial prediction models to identify individuals at highest risk for ALS.