Institution
Universitas Muhammadiyah Surakarta
Author
Rif'ani Ma'rifah, Fannisa; Dr. Eng. Yusuf Sulistyo Nugroho, S.T., M.Eng.
Subject
QA75 Electronic computers. Computer science
Datestamp
2022-05-23 02:07:46
Abstract
Diabetes is a disease characterized by elevated blood glucose (sugar) levels, caused by the pancreas being unable to produce enough insulin. According to data from the World Health Organization (WHO), about 387 million people worldwide suffer from diabetes. Most cases of diabetes are attributed to hereditary factors. This study aims to classify the variables that influence diabetes using a data mining technique. The classification applies three decision tree algorithms, namely information gain, gain ratio, and Gini index, to determine which one performs best for diabetes classification. The attributes used in the classification are pregnancy, glucose, blood pressure, skin thickness, insulin, BMI, history of diabetes, and age. The results show that, across the datasets processed with all three algorithms, glucose is the factor that most influences the occurrence of diabetes. The results also show that the accuracy, precision, and recall of the Gini index are higher than those of the other two algorithms, information gain and gain ratio, with 76.56% accuracy, 80.79% precision, and 84.00% recall. Based on this comparison, the Gini index performs better when used in the classification of diabetes.
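As an illustration of the three splitting criteria named in the abstract, the following minimal Python sketch scores a single threshold split by information gain, gain ratio, and Gini impurity decrease. It is not the authors' implementation: the function names (`entropy`, `gini`, `split_scores`), the six toy records, and the thresholds 130 and 40 are all hypothetical and were chosen only to make the arithmetic easy to follow.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_scores(values, labels, threshold):
    """Score the binary split `value <= threshold` under the three criteria."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    parts = [p for p in (left, right) if p]  # ignore an empty side
    n = len(labels)
    weights = [len(p) / n for p in parts]

    # Information gain: parent entropy minus weighted child entropy.
    info_gain = entropy(labels) - sum(w * entropy(p) for w, p in zip(weights, parts))
    # Gain ratio: information gain normalized by the entropy of the split itself.
    split_info = -sum(w * math.log2(w) for w in weights)
    gain_ratio = info_gain / split_info if split_info > 0 else 0.0
    # Gini criterion: parent impurity minus weighted child impurity.
    gini_decrease = gini(labels) - sum(w * gini(p) for w, p in zip(weights, parts))

    return {"info_gain": info_gain, "gain_ratio": gain_ratio, "gini_decrease": gini_decrease}


# Hypothetical toy records (NOT taken from the study's dataset).
glucose = [85, 90, 110, 150, 160, 180]
age = [25, 50, 30, 45, 28, 60]
diabetic = [0, 0, 0, 1, 1, 1]  # 1 = diabetic, 0 = not diabetic

glucose_scores = split_scores(glucose, diabetic, threshold=130)
age_scores = split_scores(age, diabetic, threshold=40)

for name, scores in (("glucose", glucose_scores), ("age", age_scores)):
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

In this toy data the glucose split separates the two classes perfectly, so it scores highest under all three criteria, while the age split leaves both sides mixed. A decision tree learner would pick the attribute and threshold with the best score at each node; the three criteria can rank candidate splits differently on real data, which is why their resulting accuracy, precision, and recall may differ.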