Abstract:
Diabetes mellitus, commonly known as diabetes, is a group of metabolic disorders that has
affected hundreds of millions of individuals. Because of its serious complications, detecting
diabetes is of great significance. Many studies have been conducted on diabetes prediction
datasets. Most of them use data collected from individuals, in particular from females of the
Pima Indian population, studied during 1967, in which the onset of diabetes is high. Most
previous studies concentrated on one or two specific, complicated techniques for testing the
data, while extensive research on the popular techniques is lacking. In this paper, we
thoroughly explore the most common techniques used for identifying diabetes, such as SVM
(Support Vector Machine) and KNN (K-Nearest Neighbors), together with preprocessing methods.
We evaluate these techniques by their cross-validation accuracy on the dataset. We compare the
classifiers across several aspects of the analysis and tune their parameters to improve
accuracy. The best technique we found achieves 77.86% accuracy under 10-fold cross-validation.
Description:
This thesis was submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Information and Communication Engineering of East West University, Dhaka, Bangladesh.