International Journal of Research and Innovation in Applied Science (IJRIAS) | Volume VII, Issue VII, July 2022 | ISSN 2454–6194

A Hybrid Method: Hierarchical Agglomerative Clustering Algorithm with Classification Techniques for Effective Heart Disease Prediction

 Farha Akhter Munmun, Sumi Khatun
Department of Computer Science and Engineering, Bangladesh University of Business and Technology
Dhaka, Bangladesh

Abstract: Predicting heart disease is challenging because large volumes of data are collected for clinical analysis, yet not all of this information is equally important for making the right decisions. We propose a hybrid method that combines the Hierarchical Agglomerative Clustering algorithm with conventional classification techniques, namely K-Nearest Neighbors (K-NN), Decision Tree (J48), and Naïve Bayes, and aims to reduce prediction time by clustering patients who have similar symptoms of heart failure. Forecasting is then performed on clusters of patients rather than on individual patients, which reduces the forecasting time. We also compare the conventional classification techniques with our approach in terms of precision, recall, F1 score, accuracy, and prediction time. The accuracies of the classifiers in our system (K-NN: 66.67%, J48: 83.33%, Naïve Bayes: 83.33%) are slightly lower than those of the conventional methods (K-NN: 69.128%, J48: 83.8926%, Naïve Bayes: 87.248%), but the prediction time is significantly lower (K-NN: 230 ms, J48: 203 ms, Naïve Bayes: 195 ms).

Keywords: heart disease, feature selection, hybrid method, agglomerative clustering, classification
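
The idea summarized in the abstract, clustering patients first and then classifying each cluster once, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the average-linkage criterion, the use of the cluster centroid as the cluster's representative, the function names, and the toy two-feature data are all assumptions made only for illustration.

```python
from math import dist

def agglomerative_clusters(points, n_clusters):
    """Bottom-up clustering with average linkage (an assumed linkage choice).
    Returns a list of clusters, each a list of indices into `points`."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def centroid(points, members):
    """Mean feature vector of the cluster members."""
    dims = len(points[0])
    return tuple(sum(points[m][d] for m in members) / len(members) for d in range(dims))

def knn_predict(train_x, train_y, query, k=3):
    """Plain k-nearest-neighbors majority vote."""
    nearest = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)

def predict_by_cluster(train_x, train_y, patients, n_clusters, k=3):
    """Classify one representative per cluster and share the label with all members."""
    labels = [None] * len(patients)
    for members in agglomerative_clusters(patients, n_clusters):
        label = knn_predict(train_x, train_y, centroid(patients, members), k)
        for m in members:
            labels[m] = label
    return labels

# Hypothetical toy data: two normalized features, label 1 = heart disease.
train_x = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
train_y = [0, 0, 0, 1, 1, 1]
new_patients = [(0.12, 0.18), (0.18, 0.12), (0.88, 0.82), (0.82, 0.88)]

predictions = predict_by_cluster(train_x, train_y, new_patients, n_clusters=2)
print(predictions)
```

In this sketch the classifier runs once per cluster rather than once per patient, which is where a saving in prediction time would come from. Every member inherits the label of its cluster's representative, which is one plausible reason for a small loss of accuracy.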

I. INTRODUCTION

Heart disease is a general term for a range of diseases and disorders that directly affect the heart and blood vessels. Symptoms vary depending on the type of heart disease. Most hospitals now use patient data management systems [3], which generate enormous amounts of data in the form of images, text, and numbers. In addition, patients suffering from multiple diseases often report symptoms that are irrelevant to the condition being diagnosed. However, little of this information is actually used when making decisions about any specific disease. It is therefore challenging to turn these data into efficient and useful information for making intelligent clinical decisions. Data mining is well suited to this type of real-life problem. Data mining techniques with efficient algorithms can extract hidden knowledge from large databases, and data mining tools carry out data analytics to discover hidden patterns.
The main objective of this work is to propose a method that reduces the prediction time for heart disease. Existing algorithms used for heart disease prediction take a significant amount of time. Our proposed methodology tries to solve this problem by grouping the patients with almost similar