Decision Tree-Based Diabetes Prediction Using the Pima Indians Diabetes Dataset

Authors

  • Yustri Insani, Universitas Katolik Santo Thomas
  • Marcel Filemon Naibaho, Universitas Katolik Santo Thomas
  • Sardo Pardingotan Sipayung, Universitas Katolik Santo Thomas

DOI:

https://doi.org/10.54259/jdmis.v4i1.7107

Keywords:

Diabetes, Decision Tree, Data Mining, Machine Learning, Classification, Interpretability

Abstract

Diabetes mellitus is a chronic disease characterized by elevated blood glucose levels, and it can lead to serious complications if not treated early. This study predicts diabetes using the Decision Tree algorithm on the Pima Indians Diabetes dataset. The research stages comprise data processing, building a Decision Tree model with the entropy criterion, and evaluating model performance. The model achieved an accuracy of 76.62%. The confusion matrix from testing contains 83 true negatives, 35 true positives, 16 false positives, and 20 false negatives. Glucose was the most dominant attribute in the diagnosis, followed by BMI and Age. The resulting model yields clear, easy-to-understand decision rules, so it can serve as a decision support system for the early diagnosis of diabetes.
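
The reported accuracy can be checked directly against the confusion-matrix counts given in the abstract. The sketch below is only an illustration, not the authors' code: the function names `classification_metrics` and `entropy` are hypothetical, and the precision, recall, and F1 values it derives are not stated in the abstract. The `entropy` helper shows the impurity measure that an entropy-based split criterion relies on, assuming the usual Shannon definition in bits.

```python
import math

# Confusion-matrix counts as reported in the abstract.
TN, TP, FP, FN = 83, 35, 16, 20


def classification_metrics(tn, tp, fp, fn):
    """Return accuracy, precision, recall, and F1 for the positive class."""
    total = tn + tp + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "total": total,
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts.

    Assumption: this is the standard definition used by entropy-based
    decision tree splitting; the abstract does not spell it out.
    """
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


metrics = classification_metrics(TN, TP, FP, FN)
print(f"test samples = {metrics['total']}")        # test samples = 154
print(f"accuracy     = {metrics['accuracy']:.2%}")  # accuracy     = 76.62%
print(f"precision    = {metrics['precision']:.2%}")
print(f"recall       = {metrics['recall']:.2%}")
print(f"f1           = {metrics['f1']:.2%}")

# Entropy of the actual class distribution in the test split:
# negatives = TN + FP, positives = TP + FN.
print(f"class entropy = {entropy([TN + FP, TP + FN]):.3f} bits")
```

The four counts sum to 154 samples, and (83 + 35) / 154 reproduces the stated 76.62% accuracy. Recall for the positive class comes out lower than accuracy, which matters when the model is used for screening, since false negatives are missed diagnoses.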

Published

2026-02-28

How to Cite

Insani, Y., Naibaho, M. F., & Sipayung, S. P. (2026). Prediksi Diabetes Berbasis Decision Tree Dengan Menggunakan Dataset Pima Indians Diabetes. JDMIS: Journal of Data Mining and Information Systems, 4(1), 40–45. https://doi.org/10.54259/jdmis.v4i1.7107