Interpretable semisupervised classifier for predicting cancer stages : WestminsterResearch

Chapter title	Interpretable semisupervised classifier for predicting cancer stages
Authors	Grau, I., Sengupta, D. and Nowe, A.
Editors	Kumar, P., Kumar, Y. and Tawhid, M.A.
Abstract	Machine learning techniques in medicine have been at the forefront of addressing challenges such as diagnosis, prognosis prediction, and precision medicine. In this field, data are sometimes abundant but come from different sources or lack assigned labels. Manually labeling these data to build a curated dataset for supervised classification can be costly. Semisupervised classification offers a wide range of methods for leveraging unlabeled data when learning prediction models. However, these classifiers are commonly deep or ensemble learning structures that often result in black boxes. The requirement of interpretable models for medical settings led us to propose the self-labeling gray box classifier, which outperforms other semisupervised classifiers on benchmarking datasets while providing interpretability. In this chapter, we illustrate the applications of the self-labeling gray box on the omics and clinical datasets from The Cancer Genome Atlas. We show that the self-labeling gray box is accurate in predicting cancer stages of rare cancers by leveraging the unlabeled instances from more common cancer types. We discuss insights, the features influencing prediction, and a global representation of the knowledge through decision trees or rule lists, which can aid clinicians and researchers.
Book title	Machine Learning, Big Data, and IoT for Medical Informatics
Page range	241-259
Year	2021
Publisher	Academic Press
Published	18 Jun 2021
ISBN	9780128217771
Digital Object Identifier (DOI)	https://doi.org/10.1016/b978-0-12-821777-1.00006-9
Web address (URL)	http://dx.doi.org/10.1016/b978-0-12-821777-1.00006-9
Journal	Machine Learning, Big Data, and IoT for Medical Informatics

Related outputs

Doctors’ and nurses’ eating practices during shift work: Findings from a qualitative study
Sum, K., Cheshire, A., Ridge, Damien T., Sengupta, D. and Deb, D.S. 2024. Doctors’ and nurses’ eating practices during shift work: Findings from a qualitative study. Proceedings of the Nutrition Society. 83 (OCE2), p. E204. https://doi.org/10.1017/s0029665124004282

The Cavendish Living lab - a multidisciplinary, vertically integrated project focused on sustainability
Basnett, P., Percy, L., Sengupta, D. and Smith, C.L. 2023. The Cavendish Living lab - a multidisciplinary, vertically integrated project focused on sustainability. Westminster Learning and Teaching Symposium 2023: Better Than the Real Thing? Exploring Education Futures at the University of Westminster. University of Westminster 04 Sep 2023

Antiviral Drug Target Identification and Ligand Discovery
Patel, Hershna and Sengupta, Dipankar 2023. Antiviral Drug Target Identification and Ligand Discovery. in: Gore, M. and Jagtap, U.B. (ed.) Computational Drug Discovery and Design Springer.

Smart Urban Metabolism: A Big-Data and Machine Learning Perspective
Ruchira Ghosh and Dipankar Sengupta 2023. Smart Urban Metabolism: A Big-Data and Machine Learning Perspective. in: Urban Metabolism and Climate Change Springer. pp. 325–344

Inhibiting CDK4/6 in pancreatic ductal adenocarcinoma via microRNA-21
Mortoglou, M., Miralles, F., Mould, R., Sengupta, D. and Uysal Onganer, P. 2023. Inhibiting CDK4/6 in pancreatic ductal adenocarcinoma via microRNA-21. European Journal of Cell Biology. 102 (2) 151318. https://doi.org/10.1016/j.ejcb.2023.151318

Artificial Intelligence in Precision Medicine: A Perspective in Biomarker and Drug Discovery
Sengupta, D. and Santoshi, S. 2021. Artificial Intelligence in Precision Medicine: A Perspective in Biomarker and Drug Discovery. in: Saxena, A. and Chandra, S. (ed.) Artificial Intelligence and Machine Learning in Healthcare Springer. pp. 71-88

An ensemble approach for evaluating the cognitive performance of human population at high altitude
Sengupta, D., Sharma, V.K., Hota, S.K., Srivastava, R.B. and Naik, P.K. 2021. An ensemble approach for evaluating the cognitive performance of human population at high altitude. in: Kumar, P., Kumar, Y. and Tawhid, M.A. (ed.) Machine Learning, Big Data, and IoT for Medical Informatics Academic Press. pp. 165-178

Machine learning in precision medicine
Sengupta, D. 2021. Machine learning in precision medicine. in: Kumar, P., Kumar, Y. and Tawhid, M.A. (ed.) Machine Learning, Big Data, and IoT for Medical Informatics Academic Press. pp. 405-419

An interpretable semi-supervised classifier using rough sets for amended self-labeling
Grau, I., Sengupta, D., Garcia Lorenzo, M.M. and Nowe, A. 2020. An interpretable semi-supervised classifier using rough sets for amended self-labeling. IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2020). Glasgow, UK 19 - 24 Jul 2020 IEEE. https://doi.org/10.1109/fuzz48607.2020.9177549

Training set edition using Rough Set Theory for Semi-supervised Classification
Grau, I., Nápoles, G., Sengupta, D., García Lorenzo, M.M. and Nowe, A. 2017. Training set edition using Rough Set Theory for Semi-supervised Classification. 2nd International Symposium on Fuzzy and Rough Sets. Villa Clara, Cuba 24 - 26 Oct 2017 Editorial Feijoó.

Genomic Variant Classifier Tool
Grau, I., Sengupta, D., Farid, D.M., Manderick, B., Nowe, A., Garcia Lorenzo, M.M., Daneels, D., Bonduelle, M., Croes, D. and Van Dooren, S. 2016. Genomic Variant Classifier Tool. SAI Intelligent Systems Conference 2016. London 21 Sep 2016 Springer. https://doi.org/10.1007/978-3-319-56994-9_32

GeVaCT - Genomic Variant Classifier Tool
Daneels, D., Grau, I., Sengupta, D., Bonduelle, M.L., Farid, D., Croes, D., Nowé, A. and Van Dooren, S. 2016. GeVaCT - Genomic Variant Classifier Tool. European Journal of Human Genetics. 24 (E-Supplement 1), p. 341.

Grey-Box Model: An ensemble approach for addressing semi-supervised classification problems
Sengupta, D., Grau, I., Garcia Lorenzo, M.M. and Nowe, A. 2016. Grey-Box Model: An ensemble approach for addressing semi-supervised classification problems. Benelearn 2016: Belgian-Dutch Conference on Machine Learning. Katholieke Universiteit Leuven, Campus Kortrijk (KULAK) 12 - 13 Sep 2016

GEVACT: Genomic Variant Classifier Tool
Daneels, D., Grau, I., Sengupta, D., Bonduelle, M., Farid, D.M., Croes, D., Nowé, A. and Van Dooren, S. 2015. GEVACT: Genomic Variant Classifier Tool. BeSHG & NVHG First Joint Meeting “Genetics & Society”. Leuven, Belgium 04 - 05 Feb 2016

CliniPhenome: Clinical and Phenotypic Annotation Database
Sengupta, D., Croes, D., Van Dooren, S., Bonduelle, M. and Nowe, A. 2015. CliniPhenome: Clinical and Phenotypic Annotation Database. BeMGI Annual Meeting 2015. Ghent, Belgium

GeVaCT: Genomic Variant Classifier Tool
Grau, I., Daneels, D., Van Dooren, S., Bonduelle, M., Farid, D.M., Croes, D., Nowé, A. and Sengupta, D. 2015. GeVaCT: Genomic Variant Classifier Tool. 10th Benelux Bioinformatics Conference. University of Antwerp, Belgium 07 - 08 Dec 2015

Benchmarking pre-processing and batch effect removal methods for Insilico DB: Genomics Big Data Infrastructure
De Clerck, Q., Nowe, A., Coletta, A. and Sengupta, D. 2014. Benchmarking pre-processing and batch effect removal methods for Insilico DB: Genomics Big Data Infrastructure. 9th Benelux Bioinformatics Conference (BBC 2014). Novotel-Kirchberg, Luxembourg 08 - 09 Dec 2014

Homology Modeling of Bacteriocins: From sequence alignments to structural models
Atri, P., Sengupta, D., Verma, S., Ali, S. and Dey, G. 2014. Homology Modeling of Bacteriocins: From sequence alignments to structural models. International Journal of Scientific & Engineering Research. 5 (5), pp. 123-126.

Association rule mining based study for identification of clinical parameters akin to occurrence of brain tumor
Sengupta, D., Sood, M., Vijayvargia, P., Hota, S. and Naik, P.K. 2013. Association rule mining based study for identification of clinical parameters akin to occurrence of brain tumor. Bioinformation. 9 (11), pp. 555-559. https://doi.org/10.6026/97320630009555

Design of dimensional model for clinical data storage and analysis
Sengupta, D., Arora, P., Pant, S. and Naik, P.K. 2013. Design of dimensional model for clinical data storage and analysis. Applied Medical Informatics. 32 (2), pp. 47-53.

SN algorithm: analysis of temporal clinical data for mining periodic patterns and impending augury
Sengupta, D. and Naik, P.K. 2013. SN algorithm: analysis of temporal clinical data for mining periodic patterns and impending augury. Journal of Clinical Bioinformatics. 3 24. https://doi.org/10.1186/2043-9113-3-24

TpPred: A Tool for Hierarchical Prediction of Transport Proteins Using Cluster of Neural Networks and Sequence Derived Features
Jain, S., Ranjan, P., Sengupta, D. and Naik, P.K. 2012. TpPred: A Tool for Hierarchical Prediction of Transport Proteins Using Cluster of Neural Networks and Sequence Derived Features. International Journal for Computational Biology. 1 (1), pp. 28-36.

Mode of interaction of calcium oxalate crystal with human phosphate cytidylyltransferase 1: a novel inhibitor purified from human renal stone matrix
Pathak, P., Naik, P.K., Sengupta, D., Singh, S.K. and Tandon, C. 2011. Mode of interaction of calcium oxalate crystal with human phosphate cytidylyltransferase 1: a novel inhibitor purified from human renal stone matrix. Journal of Biomedical Science and Engineering. 4 (9), pp. 591-598. https://doi.org/10.4236/jbise.2011.49075

Docking-MM-GB/SA and ADME screening of HIV-1 NNRTI inhibitor: Nevirapine and its analogues
Sengupta, D., Verma, D. and Naik, P.K. 2008. Docking-MM-GB/SA and ADME screening of HIV-1 NNRTI inhibitor: Nevirapine and its analogues. In Silico Biology. 8 (3-4), pp. 275-289.

Binding Modes, binding Affinities and ADME Screening of HIV-1 NNRTI Inhibitor: Efavirnez and its analogues.
Sengupta, D. 2007. Binding Modes, binding Affinities and ADME Screening of HIV-1 NNRTI Inhibitor: Efavirnez and its analogues. Online Journal of Bioinformatics. 8 (1), pp. 99-114.

In-silico TAT-PTD prediction for cell penetrating peptides
Tandon, C., Aggarwal, A., Goel, P., Sengupta, D. and Naik, P.K. 2007. In-silico TAT-PTD prediction for cell penetrating peptides. Online Journal of Bioinformatics. 8 (1), pp. 115-138.

Docking mode of delvardine and its analogues into the p66 domain of HIV-1 reverse transcriptase: Screening using molecular mechanics-generalized born/surface area and absorption, distribution, metabolism and excretion properties
Sengupta, D., Verma, D. and Naik, P.K. 2007. Docking mode of delvardine and its analogues into the p66 domain of HIV-1 reverse transcriptase: Screening using molecular mechanics-generalized born/surface area and absorption, distribution, metabolism and excretion properties. Journal of Biosciences. 32 (3), pp. 1307-1316. https://doi.org/10.1007/s12038-007-0140-y

Docking mode of delvardine and its analogues into the p66 domain of HIV-1 reverse transcriptase: screening using molecular mechanics-generalized born/surface area and absorption, distribution, metabolism and excretion properties
Sengupta, D., Verma, D. and Naik, P.K. 2007. Docking mode of delvardine and its analogues into the p66 domain of HIV-1 reverse transcriptase: screening using molecular mechanics-generalized born/surface area and absorption, distribution, metabolism and excretion properties. Journal of Biosciences. 32, pp. 1307-1316. https://doi.org/10.1007/s12038-007-0124-y

Clustering of HIV-I Subtype: Study of Molecular Diversity using Phylogenetic Analysis
Sengupta, D., Verma, D., Mishra, V.S. and Naik, P.K. 2006. Clustering of HIV-I Subtype: Study of Molecular Diversity using Phylogenetic Analysis. Bioinformatics Trends: A Journal of Bioinformatics and its Applications. 1 (1), pp. 1-12.

Permalink - https://westminsterresearch.westminster.ac.uk/item/vw226/interpretable-semisupervised-classifier-for-predicting-cancer-stages
