Machine Learning Model Predicts COVID-19 Severity With Laboratory Data

Researchers evaluated the accuracy of machine learning models to predict COVID-19 severity on the basis of patient characteristics and laboratory data.

Using a combination of patient characteristics and laboratory data, a machine learning model was found to accurately predict COVID-19 severity, according to results of an analytical cross-sectional study published in Clinical Nutrition ESPEN.

Between May 2020 and May 2021, researchers used purposive sampling to enroll patients aged 16 years and older who were hospitalized at a single center for the treatment of COVID-19. Fasting blood samples and demographic data were obtained from each patient on hospital admission, and COVID-19 severity was evaluated on the basis of pulmonary involvement. The researchers first used Spearman rank-order correlation to identify clinical and paraclinical characteristics significantly associated with COVID-19 severity. These characteristics were then entered into machine learning models built on support vector machine (SVM), decision tree (DT), and random forest (RF) algorithms, and the accuracy of each model in predicting COVID-19 severity was evaluated.
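
For readers unfamiliar with the screening step, the following is a minimal sketch of a Spearman rank-order correlation, written in plain Python. It is not the researchers' code: the function names and the toy values for severity, age, and calcium are hypothetical, and the sketch computes only the correlation coefficient, not the P value the study used to judge significance.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical toy data, NOT taken from the study.
# Severity is coded 0 = mild, 1 = moderate, 2 = severe.
severity = [0, 0, 1, 1, 2, 2]
age = [30, 35, 45, 50, 62, 70]
calcium = [9.4, 9.3, 9.2, 9.1, 9.0, 8.9]

print(round(spearman_rho(severity, age), 3))      # positive: older, more severe
print(round(spearman_rho(severity, calcium), 3))  # negative: inverse relationship
```

In a workflow like the one described, only the characteristics whose correlation with severity reached significance would be passed on to the SVM, DT, and RF models.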

The study included 93 patients, of whom 26 had mild, 30 had moderate, and 37 had severe COVID-19. The mean patient age was 51.38±15.75 years, 55.9% were women, 47.3% had no underlying conditions, 21.5% had cardiovascular disease, and 16.8% had diabetes.

Results showed a significant relationship between COVID-19 severity and patient age (P <.001), with severe disease more common among older patients. The researchers also noted a higher percentage of men vs women with severe disease (53.7% vs 28.8%), indicating a significant relationship between COVID-19 severity and male sex (P <.032). In regard to underlying conditions, 38.6% and 25.0% of patients with no underlying conditions had mild and severe disease, respectively (P =.039), and 60% of those with cardiovascular disease had severe disease (P <.024). A significant inverse relationship with COVID-19 severity was noted among patients with anorexia nervosa (n=18), as 55.6% of these patients had mild disease, 16.7% had moderate disease, and 27.8% had severe disease (P <.014).

Analysis of laboratory data showed a significant relationship between fasting blood glucose concentrations and COVID-19 severity (P =.028), with the highest mean concentration (173.1 mg/dL) observed in patients with severe disease. There was also a significant inverse relationship between serum calcium and COVID-19 severity (P =.043), with concentrations of 9.2, 9.1, and 9.0 mg/dL noted among patients with mild, moderate, and severe disease, respectively.

The machine learning model that assessed data via a support vector machine algorithm was most effective for predicting COVID-19 severity, with a precision of 95.5%, a recall of 94%, an F1 score of 94.8%, an accuracy of 95%, and an area under the curve of 94%.
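
The metrics reported above can be illustrated with a short sketch in plain Python. The article does not say how the study averaged precision and recall across the 3 severity classes, so macro-averaging is an assumption here, and the function names and toy labels are hypothetical rather than drawn from the study.

```python
def f1_from(precision, recall):
    """F1 score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def macro_metrics(y_true, y_pred):
    """Macro-averaged precision, recall, and F1, plus overall accuracy."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1_from(precision, recall))
    n = len(labels)
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
    }

# Hypothetical toy labels, NOT taken from the study.
y_true = ["mild", "mild", "moderate", "moderate", "severe", "severe"]
y_pred = ["mild", "moderate", "moderate", "moderate", "severe", "severe"]

for name, value in macro_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.3f}")

# Harmonic mean of the reported precision (95.5%) and recall (94%).
print(f"{f1_from(0.955, 0.94):.3f}")
```

The harmonic mean of the reported precision and recall works out to roughly 94.7%, which is close to the reported F1 score of 94.8%; the small gap may reflect rounding or per-class averaging.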

This study was limited by its small sample size and its single-center design.

Based on these findings, the researchers concluded that “clinical and paraclinical features like calcium serum levels can be used for automated severity assessment of COVID-19.”


Jahangirimehr A, Shahvali EA, Rezaeijo SM, et al. Machine learning approach for automated predicting of COVID-19 severity based on clinical and paraclinical characteristics: serum levels of zinc, calcium, and vitamin D. Clin Nutr ESPEN. Published online July 31, 2022. doi:10.1016/j.clnesp.2022.07.011