Bayesian networks and machine learning for COVID-19 severity explanation
The COVID-19 pandemic spurred many researchers to explore methods to predict, forecast, and analyze the impact of the virus. We argue that if the public understands COVID-19, its symptoms, how it spreads, and how its severity varies across demographics, people can better plan their lives and become more resilient to the threat it poses. However, such public awareness depends on the availability of, and access to, historical data and knowledge about the disease.
Our work addressed the limited availability of information and knowledge about COVID-19 by:
- Developing a probabilistic graphical model from a COVID-19 dataset, in which the nodes represent symptoms (i.e., variables or features) and the edges represent the causal relationships between them. Because several symptoms are recorded for each patient case, it is natural to study the (in)dependence structure among these features.
- Developing an unsupervised machine learning (ML) algorithm that learns the similarities between patients’ symptoms and groups the cases into appropriate classes (or clusters). The task is unsupervised because it requires no human labeling of the patients’ cases according to prior knowledge. The idea is to discover unknown patterns in patients’ symptoms that might reflect their demographics.
- Developing a supervised learning algorithm which can accurately predict a patient’s demographic symptom class (i.e., age and gender) given the patient’s COVID-19 symptoms as input.
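The three steps above can be illustrated with a minimal sketch. It is not the method used in this work: the data are invented, the symptom names are hypothetical, and each step uses a deliberately simple stand-in. Pairwise mutual information serves as a dependence score between symptoms (it does not recover edge direction or causality), a k-modes-style procedure clusters binary symptom vectors, and a nearest-centroid rule predicts the class of a new case.

```python
from collections import Counter
from itertools import combinations
from math import log2

# Hypothetical toy data: one row per patient, one binary flag per symptom.
SYMPTOMS = ["fever", "cough", "fatigue", "loss_of_smell"]
PATIENTS = [
    (1, 1, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0), (1, 0, 0, 0),
    (0, 0, 1, 1), (0, 0, 1, 1), (0, 1, 1, 1), (0, 0, 0, 1),
]

def mutual_information(rows, i, j):
    """Empirical mutual information (in bits) between columns i and j."""
    n = len(rows)
    joint = Counter((r[i], r[j]) for r in rows)
    p_i = Counter(r[i] for r in rows)
    p_j = Counter(r[j] for r in rows)
    return sum(
        (c / n) * log2((c / n) / ((p_i[a] / n) * (p_j[b] / n)))
        for (a, b), c in joint.items()
    )

def hamming(u, v):
    """Number of positions where two binary vectors differ."""
    return sum(x != y for x, y in zip(u, v))

def nearest(centroids, row):
    """Index of the centroid closest to row (ties go to the lower index)."""
    return min(range(len(centroids)), key=lambda k: hamming(centroids[k], row))

def k_modes(rows, init, iters=10):
    """Assign rows by Hamming distance, then set each centroid to the
    per-column majority value of its members; stop when centroids settle."""
    centroids = list(init)
    for _ in range(iters):
        labels = [nearest(centroids, r) for r in rows]
        updated = []
        for k, old in enumerate(centroids):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                mode = tuple(int(2 * sum(col) >= len(members))
                             for col in zip(*members))
                updated.append(mode)
            else:
                updated.append(old)  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(centroids, r) for r in rows]

# Step 1: dependence score for every pair of symptoms.
scores = {
    (SYMPTOMS[i], SYMPTOMS[j]): mutual_information(PATIENTS, i, j)
    for i, j in combinations(range(len(SYMPTOMS)), 2)
}

# Step 2: unsupervised grouping (two clusters, seeded with two patients).
centroids, labels = k_modes(PATIENTS, init=(PATIENTS[0], PATIENTS[4]))

# Step 3: predict the class of an unseen case from its symptoms.
new_case = (1, 1, 0, 1)
predicted = nearest(centroids, new_case)

if __name__ == "__main__":
    for pair, mi in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(pair, round(mi, 3))
    print("cluster labels:", labels)
    print("predicted class for new case:", predicted)
```

On real data, pairs with high scores would only be candidates for edges. Learning a Bayesian network additionally requires a structure-learning procedure, and any causal reading of the edges needs further assumptions. Likewise, the clusters found here carry no demographic meaning until they are compared against recorded age and gender.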