Articles | Open Access | DOI: https://doi.org/10.37547/tajet/Volume06Issue09-04

EVALUATING MACHINE LEARNING ALGORITHMS FOR BREAST CANCER DETECTION: A STUDY ON ACCURACY AND PREDICTIVE PERFORMANCE

Md Al-Imran, College of Graduate and Professional Studies, Trine University, USA
Salma Akter, Department of Public Administration, Gannon University, Erie, PA, USA
Md Abu Sufian Mozumder, College of Business, Westcliff University, Irvine, California, USA
Rowsan Jahan Bhuiyan, Master of Science in Information Technology, Washington University of Science and Technology, USA
Tauhedur Rahman, Dahlkemper School of Business, Gannon University, USA
Md Jamil Ahmmed, Department of Information Technology Project Management, Business Analytics, St. Francis College, USA
Md Nazmul Hossain Mir, Master of Science in Information Technology, Washington University of Science and Technology, USA
Md Amit Hasan, Master of Science in Information Technology, Washington University of Science and Technology, USA
Ashim Chandra Das, Master of Science in Information Technology, Washington University of Science and Technology, USA
Md. Emran Hossen, Department of Science in Biomedical Engineering, Gannon University, USA

Abstract

This study evaluates five machine learning algorithms for breast cancer detection on the Breast Cancer Wisconsin Diagnostic dataset: Support Vector Machine (SVM), Random Forest, Logistic Regression, Decision Tree (C4.5), and k-Nearest Neighbors (KNN). We implemented pre-processing and model evaluation with Scikit-learn in Python. SVM achieved the highest accuracy, 99.9% on the training set and 98.50% on the testing set, which suggests it handles high-dimensional data well. Random Forest also performed well, with accuracies of 98.5% and 98.20% on the training and testing sets, respectively. Logistic Regression and Decision Tree models provided reliable predictions when tuned, while KNN was less effective. Based on their accuracy and robustness, SVM and Random Forest are recommended for clinical decision support systems.
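The comparison described in the abstract can be sketched with Scikit-learn as follows. This is an illustrative sketch and not the authors' code: the 80/20 stratified split, the random seed, the feature scaling, and every hyperparameter are assumptions, and the copy of the dataset bundled with Scikit-learn stands in for whatever data file the study used. Scikit-learn's `DecisionTreeClassifier` implements CART rather than C4.5, so the entropy criterion is used here only as a rough stand-in. The accuracies it prints are not expected to match the figures reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Breast Cancer Wisconsin (Diagnostic) data as bundled with scikit-learn:
# 569 samples, 30 numeric features, binary target (0 = malignant, 1 = benign).
X, y = load_breast_cancer(return_X_y=True)

# ASSUMPTION: 80/20 stratified split with a fixed seed (not taken from the paper).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# ASSUMPTION: default-ish hyperparameters. Scale-sensitive models are wrapped
# in a pipeline so the scaler is fitted on the training data only.
models = {
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    # CART with entropy splitting, used as an approximation of C4.5.
    "Decision Tree": DecisionTreeClassifier(criterion="entropy", random_state=42),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_pred = model.predict(X_test)
    results[name] = {
        "train_acc": accuracy_score(y_train, model.predict(X_train)),
        "test_acc": accuracy_score(y_test, test_pred),
        # Rows are true labels, columns are predicted labels.
        "confusion": confusion_matrix(y_test, test_pred),
    }

for name, r in results.items():
    print(f"{name:20s} train={r['train_acc']:.3f} test={r['test_acc']:.3f}")
```

Reporting training and testing accuracy side by side, as the abstract does, makes a large gap between the two easy to spot, which is a common sign of overfitting. The confusion matrix additionally separates false negatives (missed malignant cases) from false positives, a distinction that a single accuracy figure hides.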

Keywords

Accuracy rates, Performance analysis, Confusion matrix

How to Cite

Md Al-Imran, Salma Akter, Md Abu Sufian Mozumder, Rowsan Jahan Bhuiyan, Tauhedur Rahman, Md Jamil Ahmmed, Md Nazmul Hossain Mir, Md Amit Hasan, Ashim Chandra Das, & Md. Emran Hossen. (2024). EVALUATING MACHINE LEARNING ALGORITHMS FOR BREAST CANCER DETECTION: A STUDY ON ACCURACY AND PREDICTIVE PERFORMANCE. The American Journal of Engineering and Technology, 6(09), 22–33. https://doi.org/10.37547/tajet/Volume06Issue09-04