AUTHORS: Sirage Zeynu, Shruti Patil
ABSTRACT: Kidney failure affects the whole human body and can cause serious illness and death. Machine learning and data mining techniques play a significant role in disease prediction: they achieve high performance and help decision makers gather and understand information. The performance of classification techniques depends on the features of the data set. Classification accuracy can be improved by feature selection, which reduces the dimensionality of the feature set, and by combining several algorithms into an ensemble model. In this research, K-Nearest Neighbor, J48, Artificial Neural Network, Naïve Bayes, and Support Vector Machine classification techniques were used to diagnose chronic kidney disease. Two kinds of models were built to predict chronic kidney disease: models based on feature selection and an ensemble model. For feature selection, the Info Gain Attribute Evaluator with the Ranker search method and the Wrapper Subset Evaluator with the Best First search method were used. The results showed the following accuracies in predicting chronic kidney disease, compared with the other configurations with and without feature selection: K-Nearest Neighbor with the Wrapper Subset Evaluator and Best First search, 99%; J48 with the Info Gain Attribute Evaluator and Ranker search, 98.75%; Artificial Neural Network with the Wrapper Subset Evaluator and Best First search, 99.5%; Naïve Bayes with the Wrapper Subset Evaluator and Best First search, 99%; and Support Vector Machine with the Info Gain Attribute Evaluator and Ranker search, 98.25%. The second model is an ensemble that combines the five heterogeneous classifiers using a voting algorithm. The effectiveness of the proposed ensemble model was examined by comparing it with the base classifiers. The experimental results showed that the proposed ensemble model achieved 99% accuracy.
KEYWORDS: Chronic Kidney Disease, Data Mining, Classification Techniques, Feature Selection, Ensemble Model, Accuracy, Prediction
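ILLUSTRATIVE SKETCH: The abstract describes an ensemble that combines five heterogeneous classifiers through a voting algorithm but does not specify the voting rule. The following minimal Python sketch assumes plain majority (plurality) voting over the labels predicted by each base classifier. The function names, the class labels "ckd" and "notckd", and the four-patient example data are all hypothetical and chosen only for illustration; they are not the authors' implementation or results.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label lists from several classifiers into one label per sample.

    `predictions` is a list of lists: one inner list of predicted labels per
    classifier, all of the same length. Assumption: ties are broken in favour
    of the earliest classifier whose label is among the most-voted labels.
    """
    if not predictions:
        raise ValueError("need predictions from at least one classifier")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same number of samples")

    combined = []
    for votes in zip(*predictions):  # votes for a single sample, in classifier order
        counts = Counter(votes)
        top = max(counts.values())
        winner = next(label for label in votes if counts[label] == top)
        combined.append(winner)
    return combined


def accuracy(predicted, actual):
    """Fraction of samples whose predicted label equals the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical predictions for four patients from the five base classifiers.
base_predictions = {
    "knn": ["ckd", "ckd", "notckd", "notckd"],
    "j48": ["ckd", "notckd", "notckd", "notckd"],
    "ann": ["ckd", "ckd", "notckd", "ckd"],
    "naive_bayes": ["ckd", "ckd", "ckd", "notckd"],
    "svm": ["notckd", "ckd", "notckd", "notckd"],
}
true_labels = ["ckd", "ckd", "notckd", "notckd"]

ensemble_labels = majority_vote(list(base_predictions.values()))
ensemble_accuracy = accuracy(ensemble_labels, true_labels)

if __name__ == "__main__":
    for name, labels in base_predictions.items():
        print(name, accuracy(labels, true_labels))
    print("ensemble", ensemble_accuracy)
```

In this toy data most base classifiers err on a different patient, so the vote cancels the individual errors. That is the general motivation for combining heterogeneous classifiers, though it does not guarantee that an ensemble outperforms its best member.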