Title | Predicting and elucidating the etiology of fatty liver disease: A machine learning modeling and validation study in the IMI DIRECT cohorts |
---|
Type | Journal article |
---|
Authors | Atabaki-Pasdar, N., Ohlsson, M., Viñuela, A., Frau, F., Pomares-Millan, H., Haid, M., Jones, A., Thomas, E.L., Koivula, R., Kurbasic, A., Mutie, P., Fitipaldi, H., Fernandez, J., Dawed, A., Giordano, G., Forgie, I., McDonald, T., Rutters, F., Cederberg, H., Chabanova, E., Dale, M., Masi, F., Thomas, C., Allin, K., Hansen, T., Heggie, A., Hong, M., Elders, P., Kennedy, G., Kokkola, T., Pedersen, H., Mahajan, A., McEvoy, D., Pattou, F., Raverdy, V., Häussler, R., Sharma, S., Thomsen, H., Vangipurapu, J., Vestergaard, H., ’t Hart, L., Adamski, J., Musholt, P., Brage, S., Brunak, S., Dermitzakis, E., Frost, G., Hansen, T., Laakso, M., Pedersen, O., Ridderstråle, M., Ruetten, H., Hattersley, A., Walker, M., Beulens, J., Mari, A., Schwenk, J., Gupta, R., McCarthy, M., Pearson, E., Bell, J.D., Pavo, I. and Franks, P. |
---|
Abstract | Background: Non-alcoholic fatty liver disease (NAFLD) is highly prevalent and causes serious health complications in type 2 diabetes (T2D) and beyond. Early diagnosis of NAFLD is important, as this can help prevent irreversible damage to the liver and ultimately hepatocellular carcinomas. We sought to expand etiological understanding and develop a diagnostic tool for NAFLD using machine learning. Methods and Findings: We utilized the baseline data from the IMI DIRECT, a multicenter prospective cohort study of 3029 European ancestry adults recently diagnosed with T2D (n=795) or at high risk of developing the disease (n=2234). Multiomic (genetic, transcriptomic, proteomic, and metabolomic) and clinical (liver enzymes and other serological biomarkers, anthropometry, measures of beta-cell function, insulin sensitivity, and lifestyle) data comprised the key input variables. The models were trained on MRI image-derived liver fat content (<5% or ≥5%) available for 1514 participants. We applied LASSO (least absolute shrinkage and selection operator) to select features from the different layers of omics data and Random Forest analysis to develop the models. The prediction models included clinical and omics variables separately or in combination. A model including all omics and clinical variables yielded a cross-validated receiver operator characteristic area under the curve (ROCAUC) of 0.84 (95% confidence interval (CI)=0.82, 0.86, p-value<0.001), which compared with a ROCAUC of 0.82 (95% CI=0.81, 0.83, p-value<0.001) for a model including nine clinically-accessible variables. The IMI DIRECT prediction models out-performed existing non-invasive NAFLD prediction tools. These analyses have been performed in adults of European ancestry residing in northern Europe and it is unknown how well these findings will translate to people of other ancestries and exposed to environmental risk factors that contrast those of the present cohort. Another key limitation of this study is that the prediction was done on a binary outcome (<5% or ≥5%) and not on the liver fat quantity. Conclusions: In this study, we have developed several models with different combinations of clinical and omics data and identified biological features that appear to be associated with liver fat accumulation. In general, the clinical variables showed better prediction ability than the complex omics. However, the combination of omics and clinical variables yielded the highest accuracy. We have incorporated the developed clinical models into a web interface (see: www.predictliverfat.org) and made it available to the community. |
---|
Article number | e1003149 |
---|
Journal | PLoS Medicine |
---|
Journal citation | 17 (6) |
---|
ISSN | 1549-1277 |
---|
Year | 2020 |
---|
Publisher | Public Library of Science |
---|
Publisher's version | License: CC0; File Access Level: Open (open metadata and files) |
---|
Digital Object Identifier (DOI) | https://doi.org/10.1371/journal.pmed.1003149 |
---|
PubMed ID | 32559194 |
---|
Publication dates |
---|
Published | 19 Jun 2020 |
---|
Editors | Heider, D. |
---|