Tree Based Boosting Algorithm to Tackle the Overfitting in Healthcare Data

Blessa Binolin Pepsi M; Vidhya S; Ashwini A

doi:10.17762/ijritcc.v10i5.5552

Published: May 31, 2022

Abstract

Healthcare data refers to information about the health of an individual or a population, including health issues, reproductive outcomes, causes of mortality, and quality of life. A variety of such data is collected and used whenever people interact with healthcare systems. However, these data are noisy, and models trained on them are prone to over-fitting. Over-fitting is a modeling error that occurs when a function is fitted too closely to a limited set of data points. As a result, the model learns the noise in the training data along with the signal, to the point where its performance on new data degrades. The tree-based boosting approach copes well with data that is prone to over-fitting and is well suited to healthcare data. Improved Paloboost trims the gradient and updates the learning rate using Out-of-Bag errors computed on Out-of-Bag data, that is, the samples that are not part of the In-Bag data. Improved Paloboost is intended to protect against over-fitting on noisy healthcare data and to outperform all tree-based baseline models. Experimental results on healthcare datasets indicate that Improved Paloboost is better at avoiding over-fitting and is less sensitive.
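
To make the Out-of-Bag idea concrete, the following is a minimal sketch in Python and not the authors' Improved Paloboost. It assumes squared-error loss, one-split regression stumps instead of full trees, and a small hypothetical grid of candidate learning rates. At each stage a stump is fitted to the residuals of the In-Bag samples, and the learning rate for that stage is the candidate that gives the lowest error on the Out-of-Bag samples. A rate of 0.0 skips a stage that does not help on unseen samples. The gradient trimming step mentioned in the abstract is not reproduced here. All function and parameter names are illustrative.

```python
import random


def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals by least squares."""
    best = None
    for t in sorted(set(xs))[:-1]:  # drop the max so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical: predict the mean residual
        mean = sum(residuals) / len(residuals)
        return lambda x: mean
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def mse(preds, ys):
    """Mean squared error between predictions and targets."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def boost_with_oob_rate(xs, ys, n_stages=30, subsample=0.7,
                        rates=(0.0, 0.05, 0.1, 0.2, 0.5, 1.0), seed=0):
    """Boosting where each stage's learning rate is picked on Out-of-Bag data."""
    rng = random.Random(seed)
    n = len(xs)
    base = sum(ys) / n
    preds = [base] * n
    stages = []
    for _ in range(n_stages):
        idx = list(range(n))
        rng.shuffle(idx)
        k = int(subsample * n)
        in_bag, oob = idx[:k], idx[k:]  # OOB = samples not drawn into the bag

        # For squared loss the negative gradient is simply the residual.
        stump = fit_stump([xs[i] for i in in_bag],
                          [ys[i] - preds[i] for i in in_bag])

        def oob_error(rate):
            return mse([preds[i] + rate * stump(xs[i]) for i in oob],
                       [ys[i] for i in oob])

        # Keep the candidate rate with the lowest Out-of-Bag error.
        rate = min(rates, key=oob_error)
        preds = [p + rate * stump(x) for p, x in zip(preds, xs)]
        stages.append((rate, stump))

    def predict(x):
        return base + sum(r * s(x) for r, s in stages)

    return predict, [r for r, _ in stages]


# Synthetic noisy step function, used only to exercise the sketch.
data_rng = random.Random(1)
xs = [data_rng.uniform(0, 10) for _ in range(60)]
ys = [(1.0 if x > 5 else 0.0) + data_rng.gauss(0, 0.3) for x in xs]
predict, rates = boost_with_oob_rate(xs, ys)
```

Because the rate is chosen on samples the stump never saw, a stage that only fits noise in the In-Bag data tends to receive a small or zero rate. This is the intuition behind using Out-of-Bag errors to limit over-fitting.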

How to Cite

Pepsi M, B. B., S, V., & A, A. (2022). Tree Based Boosting Algorithm to Tackle the Overfitting in Healthcare Data. International Journal on Recent and Innovation Trends in Computing and Communication, 10(5), 41–47. https://doi.org/10.17762/ijritcc.v10i5.5552