Explainable Artificial Intelligence based Ensemble Machine Learning for Ovarian Cancer Stratification using Electronic Health Records


Vivekanand Aelgani
Dhanalaxmi Vadlakonda


The purpose of this study is to show that ensemble machine learning algorithms outperform individual machine learning algorithms at predicting ovarian cancer on a biomarker dataset. The study also provides model explanations using explainable Artificial Intelligence methods. The data comprise 49 risk factors gathered and combined from 349 patients. We hypothesize that ensemble machine learning systems are superior to individual machine learning systems in predicting ovarian cancer. Five individual and five ensemble machine learning systems were trained using a 10-fold (K = 10) cross-validation protocol. The trained models were then used to classify patients as having benign ovarian tumors or ovarian cancer. The AUC and accuracy of the ensemble models increased by 19% and 16%, respectively, over the individual models. The MCC and Kappa scores of the ensemble models also increased over the individual models, by 29% and 33%, respectively. We therefore conclude that ensemble-based algorithms outperform individual machine learning algorithms in predicting ovarian carcinoma.
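The abstract reports accuracy, MCC (Matthews correlation coefficient), and Cohen's Kappa without defining them. As a minimal sketch, assuming binary labels where 1 stands for ovarian cancer and 0 for a benign tumor (an encoding chosen here for illustration, not taken from the article), the three scores can be computed from the confusion counts as follows. The function name `binary_metrics` and the toy labels are hypothetical and are not the study's data.

```python
import math


def binary_metrics(y_true, y_pred):
    """Return accuracy, MCC and Cohen's kappa for 0/1 labels.

    Assumption for illustration: 1 = ovarian cancer, 0 = benign tumor.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Confusion counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn

    # Accuracy: fraction of correct predictions.
    accuracy = (tp + tn) / n

    # MCC: correlation between true and predicted labels, in [-1, 1].
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {"accuracy": accuracy, "mcc": mcc, "kappa": kappa}


if __name__ == "__main__":
    # Toy labels only: 4 malignant and 4 benign cases, one error in each class.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

On the toy labels above the accuracy is 0.75 while MCC and Kappa are both 0.5. Unlike accuracy, MCC and Kappa discount agreement that would be expected by chance, which is one reason they are often reported alongside it.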


How to Cite
Aelgani, V., & Vadlakonda, D. (2023). Explainable Artificial Intelligence based Ensemble Machine Learning for Ovarian Cancer Stratification using Electronic Health Records. International Journal on Recent and Innovation Trends in Computing and Communication, 11(7), 78–84. https://doi.org/10.17762/ijritcc.v11i7.7832

