Estimation of COVID-19 Epidemiology Curve of the United States Using Genetic Programming Algorithm
Created by W.Langdon from
gp-bibliography.bib Revision:1.8110
@Article{andelic:2021:IJERPH,
author = "Nikola Andelic and Sandi {Baressi Segota} and
Ivan Lorencin and Zdravko Jurilj and Tijana Sustersic and
Andela Blagojevic and Alen Protic and
Tomislav Cabov and Nenad Filipovic and Zlatan Car",
title = "Estimation of {COVID-19} Epidemiology Curve of the
United States Using Genetic Programming Algorithm",
journal = "International Journal of Environmental Research and
Public Health",
year = "2021",
volume = "18",
number = "3",
keywords = "genetic algorithms, genetic programming",
ISSN = "1660-4601",
URL = "https://www.mdpi.com/1660-4601/18/3/959",
DOI = "10.3390/ijerph18030959",
abstract = "Estimation of the epidemiology curve for the COVID-19
pandemic can be a very computationally challenging
task. Thus far, there have been some implementations of
artificial intelligence (AI) methods applied to develop
an epidemiology curve for a specific country. However,
most applied AI methods generated models that are
almost impossible to translate into a mathematical
equation. In this paper, the AI method called genetic
programming (GP) algorithm is used to develop a
symbolic expression (mathematical equation) which can
be used for the estimation of the epidemiology curve
for the entire U.S. with high accuracy. The GP
algorithm is used on the publicly available dataset
that contains the number of confirmed, deceased and
recovered patients for each U.S. state to obtain the
symbolic expression for the estimation of the number of
the aforementioned patient groups. The dataset consists
of the latitude and longitude of the central location
for each state and the number of patients in each of
the goal groups for each day in the period of 22
January 2020--3 December 2020. The obtained
symbolic expressions for each state are summed up to
obtain symbolic expressions for estimation of each of
the patient groups (confirmed, deceased and recovered).
These symbolic expressions are combined to obtain the
symbolic expression for the estimation of the
epidemiology curve for the entire U.S. The obtained
symbolic expressions for the estimation of the number
of confirmed, deceased and recovered patients for each
state achieved $R^2$ score in the ranges
0.9406--0.9992, 0.9404--0.9998 and
0.9797--0.99955, respectively. These equations are
summed up to formulate symbolic expressions for the
estimation of the number of confirmed, deceased and
recovered patients for the entire U.S. with achieved $R^2$
score of 0.9992, 0.9997 and 0.9996, respectively. Using
these symbolic expressions, the equation for the
estimation of the epidemiology curve for the entire
U.S. is formulated which achieved $R^2$ score of 0.9933.
Investigation showed that GP algorithm can produce
symbolic expressions for the estimation of the number
of confirmed, recovered and deceased patients as well
as the epidemiology curve not only for the states but
for the entire U.S. with very high accuracy.",
notes = "also known as \cite{ijerph18030959}",
}