Classifying subjects into risk categories is a common challenge in medical research. Machine learning (ML) methods are widely used for risk prediction and classification. The primary objective of such algorithms is to use several features to predict dichotomous responses (e.g., healthy/at risk). Like statistical inference modelling, ML modelling is subject to class imbalance: models are biased towards the majority class, which increases the false-negative rate. In this study, we built and evaluated thirty-six ML models to classify approximately 4300 female and 4100 male participants from the UK Biobank into three risk categories based on discretised visceral adipose tissue (VAT) measurements from magnetic resonance imaging. We also examined the effect of sampling techniques on the models when dealing with class imbalance. The sampling techniques had a considerable impact on classification and improved risk status prediction by increasing the information contained within each variable. Based on domain expert criteria, the three best classification models for visceral fat prediction in the female and male cohorts were identified. The area under the receiver operating characteristic curve of the models tested with external data ranged from 0.78 to 0.89 for females and from 0.75 to 0.86 for males. These encouraging results will guide further development of models to predict VAT values. Such models will help identify individuals with excess VAT volume who are at risk of developing metabolic disease, so that relevant lifestyle interventions can be appropriately targeted.
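
The effect of resampling on class imbalance can be illustrated with a minimal sketch. The text above does not name the sampling techniques that were evaluated, so the example assumes plain random oversampling purely for illustration. The function name `random_oversample`, the class labels ("low", "medium", "high"), and the toy feature values are all hypothetical and are not taken from the study.

```python
import random
from collections import Counter, defaultdict


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (illustrative technique only)."""
    rng = random.Random(seed)

    # Group the feature rows by their class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement to fill the gap to the target size.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical, imbalanced toy data with three risk categories.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [1.5], [1.6], [2.9]]
labels = ["low"] * 6 + ["medium"] * 2 + ["high"]

balanced_rows, balanced_labels = random_oversample(rows, labels)
class_counts = Counter(balanced_labels)
print(class_counts)
```

In practice, resampling of this kind is normally applied to the training split only, so that evaluation on held-out or external data still reflects the original class distribution.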