BACKGROUND AND AIMS
The prevalence of type 2 diabetes mellitus (T2DM) is expected to increase rapidly in the coming decades, posing a major challenge to societies worldwide. The emerging era of precision medicine calls for the discovery of clinically valuable biomarkers for predicting T2DM, and biomarkers with a causal role may in addition suggest novel therapeutic targets. However, only fragmentary data are currently available on protein biomarkers for predicting incident T2DM.¹ The aim of the current study was to utilise deep serum proteomics data to identify biomarkers for prevalent and incident T2DM and to evaluate their predictive value over clinical traits. Furthermore, genetic information was integrated to evaluate the causal relationships between serum proteins and T2DM.
MATERIALS AND METHODS
Serum levels of 4,137 human proteins were measured with multiplex SOMAmer technology (SomaLogic, Inc., Boulder, Colorado, USA) in the population-based AGES cohort of 5,438 Icelanders as previously described,² of whom 654 had prevalent T2DM. Of the 2,940 individuals free of diabetes at baseline who participated in a 5-year follow-up visit, 112 developed T2DM within the study period. Protein associations with prevalent or incident T2DM were evaluated with logistic regression adjusting for age and sex, and were considered significant when the Bonferroni-corrected p-value was <0.05. LASSO-penalised logistic regression combined with bootstrap resampling was applied to prioritise a panel of proteins for predicting incident T2DM, and the resulting model was compared with a clinical model using variables from the Framingham Offspring Risk Score.³ The prediction model was evaluated in a validation sample consisting of 1,844 AGES participants who did not participate in the 5-year follow-up visit but among whom 46 incident T2DM cases were defined from linked prescription and medical records. A two-sample Mendelian randomisation (MR) analysis was performed to identify causal candidates for T2DM. Here, genetic instruments for protein levels identified in AGES were integrated with genome-wide association study summary statistics for T2DM from the DIAMANTE consortium,⁴ and a Benjamini-Hochberg FDR <0.05 in the MR analysis was considered significant.
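The two significance criteria named above (Bonferroni correction for the association tests, Benjamini-Hochberg FDR for the MR analysis) can be illustrated with a minimal Python sketch. The function names and the toy p-values are hypothetical, and the per-test threshold assumes one test per measured protein, which is not stated explicitly here.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag tests whose Bonferroni-corrected p-value (p * m, capped at 1) is below alpha."""
    m = len(p_values)
    return [min(p * m, 1.0) < alpha for p in p_values]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never decrease as the raw p-values increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five proteins, for illustration only.
toy_p = [0.001, 0.008, 0.02, 0.03, 0.60]
bonf_flags = bonferroni_significant(toy_p)
bh_q = benjamini_hochberg(toy_p)
bh_flags = [q < 0.05 for q in bh_q]

# Assumption: one association test per measured protein (4,137 tests).
per_test_threshold = 0.05 / 4137
```

In this toy example the Bonferroni rule flags two proteins and the Benjamini-Hochberg rule flags four, reflecting the less conservative nature of FDR control.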
RESULTS
The study identified 520 and 99 proteins associated with prevalent and incident T2DM, respectively, of which 83 were shared (Fisher’s p: 7.2×10⁻⁶³). Proteins associated with prevalent T2DM were enriched for extracellular matrix-receptor interaction, complement and coagulation cascades, metabolic processes, and liver-specific gene expression. By contrast, proteins associated with incident T2DM were mainly enriched for metabolism, lipid transport, and response to insulin, as well as gene expression in liver and adipose tissue, supporting the involvement of these pathways in the preclinical phase of the disease.
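The overlap statistic can be illustrated with a one-sided hypergeometric (Fisher's exact) test, sketched below in Python. The background of 4,137 proteins is an assumption made for illustration, as is the one-sided formulation, so the computed value need not reproduce the reported p-value exactly. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def overlap_p_value(universe, size_a, size_b, overlap):
    """Probability of an overlap at least this large when size_b items are
    drawn at random from a universe containing size_a marked items."""
    max_overlap = min(size_a, size_b)
    total = comb(universe, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return Fraction(tail, total)


# Counts taken from the text above.
n_prevalent, n_incident, n_both = 520, 99, 83

# Proteins associated with either outcome (inclusion-exclusion): 536.
n_either = n_prevalent + n_incident - n_both

# Assumption: all 4,137 measured proteins form the background set.
p_overlap = float(overlap_p_value(4137, n_prevalent, n_incident, n_both))
print(n_either, p_overlap)
```

Exact integer arithmetic is used so that the very small tail probability is not lost to floating-point underflow before the final conversion.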
Using LASSO analysis, a panel of 20 protein biomarkers was identified that together with clinical risk factors predicted incident T2DM in the validation sample with an AUC of 0.84 (95% confidence interval [CI]: 0.78–0.91), which was a significant (p=6.6×10⁻³) improvement over the clinical model alone (AUC=0.80, 95% CI: 0.72–0.88). Of the 536 proteins associated with either prevalent or incident T2DM, genetic instruments were identified for 246 proteins in AGES, of which 16 were supported (FDR<0.05) as having a causal effect on T2DM. Here, the strongest support for causality was observed for the proteins MMP12, HIBCH, and WFIKKN2.
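As an illustration of how a two-sample MR estimate can be formed from summary statistics, the Python sketch below computes a Wald ratio for a single instrument and a fixed-effect inverse-variance weighted (IVW) estimate across several instruments. The specific estimator used in the study is not stated here, so this is an assumption, and the function names and effect sizes are hypothetical.

```python
from math import sqrt


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Causal estimate from one instrument: variant-outcome effect divided by
    variant-exposure effect, with a first-order standard error."""
    estimate = beta_outcome / beta_exposure
    std_error = se_outcome / abs(beta_exposure)
    return estimate, std_error


def ivw_estimate(instruments):
    """Fixed-effect IVW estimate over (beta_exposure, beta_outcome, se_outcome)
    tuples, weighting each instrument by 1 / se_outcome**2."""
    numerator = sum(bx * by / se ** 2 for bx, by, se in instruments)
    denominator = sum(bx ** 2 / se ** 2 for bx, _, se in instruments)
    return numerator / denominator, sqrt(1.0 / denominator)


# Hypothetical summary statistics: effect of each variant on the protein level
# (exposure) and on T2DM log-odds (outcome), with the outcome standard error.
toy_instruments = [(0.5, 0.10, 0.02), (0.3, 0.06, 0.03)]

single_estimate, single_se = wald_ratio(*toy_instruments[0])
pooled_estimate, pooled_se = ivw_estimate(toy_instruments)
```

Both toy instruments imply the same ratio, so the pooled estimate coincides with the single-instrument estimate while its standard error is smaller.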
CONCLUSION
These results demonstrate a major shift in the serum proteome before and during the diabetic stage. The proteomic changes observed in the preclinical stage of the disease were mainly related to insulin sensitivity. A multivariate model with serum proteins adds significantly to the prediction of T2DM over traditional clinical risk factors, although our findings require replication in an independent cohort and further evaluation of any clinical utility. Finally, the MR analysis highlighted a number of proteins that may have a causal role in the development of the disease. These proteins could be of particular interest for follow-up studies as novel therapeutic targets.