Tree Based Boosting Algorithm to Tackle the Overfitting in Healthcare Data

Blessa Binolin Pepsi M
Vidhya S
Ashwini A

Abstract

Healthcare data is information about the health conditions, reproductive outcomes, causes of death, and quality of life of an individual or a population. A wide variety of such data is collected and used whenever people interact with healthcare systems. However, healthcare data is noisy, and models trained on it are prone to overfitting. Overfitting is a modeling error that occurs when a function is fitted too closely to a limited set of data points: the model learns the noise in the training data along with the signal, to the point where its performance on new data degrades. The tree-based boosting approach copes well with data that is prone to overfitting and is therefore well suited to healthcare data. Improved Paloboost trims the gradient and updates the learning rate using Out-of-Bag errors, that is, errors measured on Out-of-Bag data. Out-of-Bag data are the samples that are not present in the In-Bag data. Improved Paloboost is intended to protect against overfitting on noisy healthcare data and to outperform the tree-based baseline models. Experimental results on healthcare datasets indicate that Improved Paloboost is better at avoiding overfitting and is less sensitive.
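The In-Bag / Out-of-Bag idea described in the abstract can be illustrated with a small sketch. This is not the authors' Improved Paloboost: it is a simplified, assumption-laden example in plain Python in which each boosting round fits a one-split regression stump on the In-Bag residuals and then chooses a single learning rate for that round from the Out-of-Bag errors (squared loss, rate clipped to [0, max_rate]). A rate of 0 means the round's tree is effectively discarded, which stands in very loosely for trimming. All function names and parameters here (`split_in_bag_oob`, `fit_stump`, `boost`, `subsample`, `max_rate`) are hypothetical.

```python
import random

def split_in_bag_oob(n, subsample, rng):
    """Sample In-Bag indices without replacement; the rest are Out-of-Bag."""
    k = max(1, int(subsample * n))
    in_bag = sorted(rng.sample(range(n), k))
    in_set = set(in_bag)
    oob = [i for i in range(n) if i not in in_set]
    return in_bag, oob

def fit_stump(xs, rs):
    """Fit a one-split regression stump to residuals rs by least squares."""
    best = None
    for t in sorted(set(xs))[:-1]:  # thresholds that leave both sides non-empty
        left = [r for x, r in zip(xs, rs) if x <= t]
        right = [r for x, r in zip(xs, rs) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical: fall back to a constant
        m = sum(rs) / len(rs)
        return (xs[0], m, m)
    return best[1:]

def predict_stump(stump, x):
    t, left_value, right_value = stump
    return left_value if x <= t else right_value

def boost(x, y, n_rounds=20, subsample=0.7, max_rate=1.0, seed=0):
    """Boost stumps; each round's learning rate is chosen from OOB errors."""
    rng = random.Random(seed)
    n = len(y)
    pred = [sum(y) / n] * n  # start from the mean prediction
    rates = []
    for _ in range(n_rounds):
        in_bag, oob = split_in_bag_oob(n, subsample, rng)
        # Fit the stump on In-Bag residuals only.
        stump = fit_stump([x[i] for i in in_bag], [y[i] - pred[i] for i in in_bag])
        step = [predict_stump(stump, xi) for xi in x]
        # Rate that minimizes squared error on the OOB samples, then clipped.
        num = sum((y[i] - pred[i]) * step[i] for i in oob)
        den = sum(step[i] ** 2 for i in oob)
        rate = min(max_rate, max(0.0, num / den)) if den > 0 else 0.0
        pred = [p + rate * s for p, s in zip(pred, step)]
        rates.append(rate)
    return pred, rates

if __name__ == "__main__":
    data_rng = random.Random(42)
    x = [i / 10 for i in range(50)]
    # Noisy step function as a toy stand-in for noisy measurements.
    y = [(1.0 if xi > 2.5 else 0.0) + data_rng.gauss(0, 0.3) for xi in x]
    pred, rates = boost(x, y)
    print("per-round learning rates:", [round(r, 2) for r in rates])
```

Because the rate is chosen on samples the stump never saw, a stump that mostly fits In-Bag noise tends to receive a small or zero rate, which is the intuition behind using Out-of-Bag errors as a guard against overfitting.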

How to Cite
Pepsi M, B. B., V. S, and A. A. “Tree Based Boosting Algorithm to Tackle the Overfitting in Healthcare Data”. International Journal on Recent and Innovation Trends in Computing and Communication, vol. 10, no. 5, May 2022, pp. 41-47, doi:10.17762/ijritcc.v10i5.5552.