Articles | Open Access | DOI: https://doi.org/10.37547/TAJMSPR/Volume06Issue12-07

ANALYZING TRENDS AND DETERMINANTS OF LEADING CAUSES OF DEATH IN THE USA: A DATA-DRIVEN APPROACH

Saddam Hossain, Department of Management Sciences and Quantitative Methods, Gannon University, Erie, PA
Mohammed Nazmul Islam Miah, Department of Management Sciences and Quantitative Methods, Gannon University, Erie, PA
MD Sohel Rana, Executive Ph.D. in Business Analyst, University of Cumberlands
Md Sazzad Hossain, MBA, Business Analytics, Gannon University, Erie, PA, USA
Proshanta Kumar Bhowmik, Department of Business Analytics, Trine University, Angola, IN, USA
Md Khalilor Rahman, MBA, Business Analytics, Gannon University, Erie, PA, USA
Rabeya akter, Master of Science in Information Technology, Washington University of Science and Technology, Alexandria, VA, USA

Abstract

The leading causes of death, together with their trends and determinants, largely define the health landscape of the United States. Causes such as heart disease, cancer, chronic lower respiratory diseases, HIV/AIDS, accidents, and stroke have been major public health concerns for many decades. Each condition reflects broader societal and individual health challenges, including lifestyle choices, environmental factors, genetic predispositions, and healthcare accessibility. This research project used a data-driven approach to explore these trends and to understand the patterns and determinants underlying mortality statistics. Using an expanded dataset, the study examined the leading causes of death; how they vary by demographic factors, including age, sex, and race/ethnicity; and the social, environmental, and behavioral determinants of those patterns. The dataset was retrieved from Kaggle, namely "NCHS - Leading Causes of Death: United States," which covers the major causes of death in the United States between 1999 and 2016. It is organized to support trend analysis and includes the variables Cause of Death (for example, heart disease and cancer), Year, State, Age-adjusted Death Rate, and Number of Deaths. Other demographic variables, such as Sex and Race/Ethnicity, allowed analysis of finer subgroups, which was useful for highlighting disparities in health outcomes. Three machine learning models, Linear Regression, Random Forest, and XGBoost, were evaluated on Mean Squared Error (MSE) and R-squared (R²). XGBoost outperformed the other models significantly on both MSE and R², which indicates that, on this dataset, it is the most accurate and reliable of the three models for prediction.
Applied to mortality trends, advanced machine learning models can provide deep insight into the underlying determinants. Large datasets comprising demographic, socioeconomic, and health-related variables are analyzed for patterns and correlations that may not be obvious with traditional statistical methods. Model predictions can indicate future mortality trends by highlighting high-risk populations and locations. Data-driven models therefore have major implications for public health: they offer insight into the trends and determinants of mortality and can inform possible interventions.
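
The two evaluation metrics named in the abstract, Mean Squared Error and R-squared, can be illustrated with a minimal Python sketch. The abstract does not say how the metrics were computed, so the helper names (`mean_squared_error`, `r_squared`, `rank_models`), the model labels, and all numbers below are hypothetical and chosen only for illustration; they are not results from the study.

```python
from statistics import fmean


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


def rank_models(y_true, predictions):
    """Return (name, mse, r2) tuples sorted from lowest to highest MSE."""
    scores = [
        (name, mean_squared_error(y_true, y_pred), r_squared(y_true, y_pred))
        for name, y_pred in predictions.items()
    ]
    return sorted(scores, key=lambda row: row[1])


# Toy values for illustration only; these are NOT results from the study.
observed = [200.0, 180.0, 160.0, 150.0]
toy_predictions = {
    "model_a": [190.0, 185.0, 165.0, 140.0],
    "model_b": [199.0, 181.0, 161.0, 149.0],
}

for name, mse, r2 in rank_models(observed, toy_predictions):
    print(f"{name}: MSE={mse:.2f}, R2={r2:.4f}")
```

A lower MSE and an R-squared closer to 1 both indicate a better fit, which is the sense in which one model can be said to outperform another on both metrics.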

Keywords

Mortality determinants, Public health trends, Leading causes of death

How to Cite

Saddam Hossain, Mohammed Nazmul Islam Miah, MD Sohel Rana, Md Sazzad Hossain, Proshanta Kumar Bhowmik, Md Khalilor Rahman, & Rabeya akter. (2024). ANALYZING TRENDS AND DETERMINANTS OF LEADING CAUSES OF DEATH IN THE USA: A DATA-DRIVEN APPROACH. The American Journal of Medical Sciences and Pharmaceutical Research, 6(12), 54–71. https://doi.org/10.37547/TAJMSPR/Volume06Issue12-07