GRADIENT BOOSTING-BASED OBESITY RISK CLASSIFICATION ON MEDICAL DATA
DOI:
https://doi.org/10.5281/zenodo.20309818
Keywords:
gradient boosting, classification, obesity risk, medical data, machine learning
Abstract
Obesity is a growing health concern that can lead to various chronic diseases, so accurate identification of at-risk individuals is an important preventive measure. Machine learning techniques make it possible to use medical data to support decision-making in healthcare. This study applies the Gradient Boosting algorithm as a classification method to predict obesity risk from medical data. The dataset contains information on eating habits, physical activity, and individual characteristics. The research process comprises data preprocessing, data transformation and normalization, class mapping, and partitioning of the data into training and testing sets at a ratio of 70:30. The Gradient Boosting model is built from multiple decision trees with specific parameter settings and classifies obesity risk into two categories: obese and non-obese. Model performance is evaluated using accuracy, precision, recall, and F1-score. The experimental results show that the proposed model achieves an accuracy exceeding 90%, while the performance gap between training and testing data remains relatively small. This suggests that the model generalizes well and shows little sign of overfitting. The results indicate that Gradient Boosting on medical data is an effective approach for obesity risk classification, with the potential to support intelligent health information systems that assist medical practitioners in more precise obesity prevention and management.
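The data partitioning and evaluation steps named in the abstract can be illustrated with a minimal sketch in plain Python. It assumes binary labels where 1 means obese and 0 means non-obese. The function names `split_70_30` and `binary_metrics` and the example labels are hypothetical, chosen for illustration only; they are not the study's code or data.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle a copy of the rows and split it 70:30 into train and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1-score for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels (1 = obese, 0 = non-obese), not taken from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)

# Split ten hypothetical record IDs into 7 training and 3 testing records.
train_ids, test_ids = split_70_30(list(range(10)))
```

Computing the same four metrics on both the training and the testing partition, and comparing them, is one simple way to check the kind of train-test performance gap the abstract reports.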
