Data Science 2: Advanced Topics in Data Science

Final Project¶

Topic H: Dynamics of Disease Transmision and Human Behavior¶

Spring 2022¶

GROUP 10: Jaqueline Garcia-Yi, Rafael Hernandez, Hugo Munoz Sanchez, and Kerem Atalay

¶


TABLE OF CONTENTS¶

I. Background and introduction¶

II. Research question¶

III. Exploratory data analysis: trend graphs, correlations, and cluster analysis¶

IV. Model results and discussions¶

4.1. Preliminary: import commands, verify version of Tensorflow, and check GPU functioning¶

4.2. Read csv files¶

4.3. Merge files and clean data (handling of NAs)¶

4.4. Train-test split and normalization (per State)¶

4.5. Time series baseline models¶

  4.5.1 New York

  4.5.2 California

4.6. RNN models¶

  4.6.1 New York

    a) Results of one layer RNN model

    b) Results of multi- layer RNN model

  4.6.2 California

    a) Results of one layer RNN model

    b) Results of multi- layer RNN model

4.7. LSTM models¶

  4.7.1 New York

    a) Results of one layer LSTM model

    b) Results of multi- layer LSTM model

    c) Results of multi- layer LSTM model using time steps

  4.7.2 California

    a) Results of one layer LSTM model

    b) Results of multi- layer LSTM model

    c) Results of multi- layer LSTM model using time steps

V. Conclusions and way forward¶

VI. References¶





I. Background and Introduction¶

Return to contents

* Note: numbers in squared brackets in the text of the notebook indicate the reference (the full citations are in the Section VI Reference of this notebook)

Emerging infectious diseases pose a threat to humanity. Existing mechanisms for infectious disease surveillance and early response systems are not refined enough to allow governmental authorities to respond in timely fashion to outbreaks and minimize their potential impacts [1]. Over the last decade, innovative approaches have been increasingly used to predict outbreaks, including exploiting information from internet and search engines, but further research is still needed to refine those models and outbreak predictions [2]. Predicting outbreaks in a reliable fashion would allow Governments, other authorities, and the population in general to take early control measures to avoid exposure and reduce the number of people getting infected. The ability to accurately forecast outbreaks has the potential to save several human lives. Therefore, the aim of our project is to contribute to the advancement of an outbreak prediction system by using deep learning models (such as RNN and LSTM models) to forecast COVID-19 cases using mainly digital stream data (i.e., google searches on COVID-19 related and not directly related terms).

II. Research question¶

Return to contents

Our research question is:

Could the outbreak of infectious diseases (such as COVID-19) be accurately predicted by deep learning models (e.g., RNN and LSTM models) using mainly digital stream data?

III. Exploratory data analysis: trend graphs, correlations, and cluster analysis¶

Return to contents

For the exploratory data analysis (EDA), see the auxiliary jupyter notebook 1. The EDA includes trend graphs, correlations and cluster analysis results. The trend graphs were separated in four groups of predictors to easy their visualization. These groups were: (a) number of cases, deaths and hospitalizations due to COVID-19; (b) number of google searches related to COVID-19; (c) number of COVID-19 cases in other states; and (d) number of google searches not directly related to COVID-19.

IV. Model results and discussions¶

Return to contents

Our results include three types of models: (a) time series baseline models, such as simple moving average models and auto regressive moving average (ARMA) models; (b) RNN models, specifically one-layer and multi-layer RNN models; and (c) LSTM models, such as one-layer, multi-layer, two-layer using time steps, and multi-layer using time steps.

In this notebook, we present the results of two States: New York and California. We selected these States because they are two of the largest States (i.e., population size) but located at opposite coasts of US (New York in the Atlantic and California in the Pacific) with different climatic conditions. Climate and humidity conditions influence the transmision of infection disseases, such as COVID-19 [1]. Furthermore, both States received among the lowest rankings of all the States on how they handled the COVID-19 crisis [2]. Therefore, reliable early warning systems could significantly help these two States to save lives and avoid unnecessary lockdowns and economic turmoil at the same time.

The auxiliary notebook 2 includes a drop-down menu (a widget) for selecting and running the model results for any of the 50 States in US. The drop-down menu is located inmediately after sub-section 4.4 in the auxiliary notebook.

4.1. Preliminary: import commands, verify version of Tensorflow, and check GPU functioning¶

Return to contents

We formatted the notebook according to the CSCI-109B style. We worked in a shared Colab jupyter notebook. We avoid loading the 50 States before starting each session by creating a Git-Hub repository and running a command to opening the repository each time we used the notebook. On occasions, the GPU was not active, and the code ran slowly. However, we could not quickly find any other way to collaborate as a team (that were also free of charge, if possible). The versions of tensorflow and keras used for running the notebook were 2.8.

In [ ]:
# RUN THIS CELL 
import requests
from IPython.core.display import HTML
styles = requests.get(
    "https://raw.githubusercontent.com/Harvard-IACS/2018-CS109A/master/"
    "content/styles/cs109.css"
).text
HTML(styles)
Out[ ]:
In [ ]:
! cd $HOME
! git clone https://github.com/rafah1/COVID-data/
Cloning into 'COVID-data'...
remote: Enumerating objects: 63, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 63 (delta 0), reused 5 (delta 0), pack-reused 58
Unpacking objects: 100% (63/63), done.
In [ ]:
pip install --upgrade gap-stat
Collecting gap-stat
  Downloading gap_stat-2.0.1-py3-none-any.whl (6.9 kB)
Requirement already satisfied: pandas in /usr/local/lib/python3.7/dist-packages (from gap-stat) (1.3.5)
Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from gap-stat) (1.4.1)
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from gap-stat) (1.21.6)
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.7/dist-packages (from pandas->gap-stat) (2.8.2)
Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/dist-packages (from pandas->gap-stat) (2022.1)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.7.3->pandas->gap-stat) (1.15.0)
Installing collected packages: gap-stat
Successfully installed gap-stat-2.0.1
In [ ]:
# Import necessary libraries
import os
import time
import datetime
import requests
import glob
import numpy as np
import math
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import scipy.stats as stats
%matplotlib inline

# Tensorflow
import tensorflow as tf
from tensorflow.python.keras import backend as K

# Keras
from keras.preprocessing.sequence import TimeseriesGenerator

# sklearn
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import mean_absolute_percentage_error, mean_squared_error, mean_absolute_error 
from sklearn.metrics import precision_score , recall_score
from sklearn.model_selection import train_test_split
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder
from sklearn import manifold
from sklearn.impute import SimpleImputer, KNNImputer

#Cluster
import scipy.cluster.hierarchy as hac
from scipy.cluster.hierarchy import fcluster
from sklearn.cluster import KMeans, DBSCAN
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_samples, silhouette_score
from sklearn.neighbors import NearestNeighbors
from gap_statistic import OptimalK

#Statistical models
from statsmodels.tsa.stattools import adfuller
from statsmodels.graphics.tsaplots import plot_acf
from statsmodels.tsa.arima_model import ARMA, ARIMA

#Date data
from datetime import datetime

#Numpy
from numpy.ma.core import reshape

#Table
from tabulate import tabulate
/usr/local/lib/python3.7/dist-packages/statsmodels/tools/_testing.py:19: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
  import pandas.util.testing as tm
In [ ]:
# Verify setup
print("tensorflow version", tf.__version__)
print("keras version", tf.keras.__version__)
print("Eager Execution Enabled:", tf.executing_eagerly())

# Get the number of replicas 
strategy = tf.distribute.MirroredStrategy()
print("Number of replicas:", strategy.num_replicas_in_sync)

devices = tf.config.experimental.get_visible_devices()
print("Devices:", devices)
print(tf.config.experimental.list_logical_devices('GPU'))

print("GPU Available: ", tf.config.list_physical_devices('GPU'))
print("All Physical Devices", tf.config.list_physical_devices())

AUTOTUNE = tf.data.experimental.AUTOTUNE
tensorflow version 2.8.0
keras version 2.8.0
Eager Execution Enabled: True
WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce.
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0',)
Number of replicas: 1
Devices: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
[]
GPU Available:  []
All Physical Devices [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
In [ ]:
#Check GPU
!nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.


4.2. Read csv files¶

Return to contents

We loaded the 50 CSV files (one file per State) from the Git-Hub to the Notebook by using the glob() method. The glob() python package retrieves files matching a specific pattern (for instance, “.csv”). Then, we looped and read each of the files using the pandas.read_csv() method [1].

In [ ]:
# Increase number of displayed columns
pd.set_option('display.max_columns', 700)
  
# Use glob to read the csv files from a folder
path = os.getcwd()
#csv_files = glob.glob(os.path.join("state_level_data", "*.csv"))
csv_files = glob.glob(os.path.join("./COVID-data/state_level_data", "*.csv"))
csv_files
  
# Read csv files
file_names=[]
list_files=[]
for f in csv_files:      
    # Print name of file
    name=f.split("/")[-1]
    name=name.split(".")[0]
    #print(f'\033[1mFILE NAME: {name} \033[0m')  
    file_names.append(name)
    # read the csv files
    df = pd.read_csv(f)
    list_files.append(df)
    ####
    ## Delete the "#" below to print shape, description, data type and NAs of each individual csv file
    ####
    #print(f'\nShape: {df.shape}')
    #print('\nContent:')
    #display(df.head())
    #print('\nDescription:')
    #display(df.describe())
    #print('\nData type:')
    #display(np.transpose(pd.DataFrame(df.dtypes)))
    #print("\nNumber of NAs per column:")
    #display(np.transpose(pd.DataFrame(df.isnull().sum())))
    #print("\n")

#Display one state
print("The shape of the file of one state is", list_files[0].shape)
print("\nThe first rows of the file of one state are:")
display(list_files[0].head())
The shape of the file of one state is (744, 499)

The first rows of the file of one state are:
date JHU_cases JHU_deaths JHU_hospitalizations up2date gt_after covid vaccine gt_side effects of vaccine gt_effects of covid vaccine gt_covid gt_how long does covid last gt_anosmia gt_loss smell gt_covid-19 gt_loss taste gt_loss of smell gt_chest pain gt_covid symptoms gt_sars-cov 2 gt_chest tightness gt_covid nhs gt_quarantine gt_covid-19 who gt_sars-cov-2 gt_feeling exhausted gt_nose bleed gt_feeling tired gt_joints aching gt_fever gt2_Abdominal obesity gt2_Abdominal pain gt2_Acne gt2_Actinic keratosis gt2_Acute bronchitis gt2_Adrenal crisis gt2_Ageusia gt2_Alcoholism gt2_Allergic conjunctivitis gt2_Allergy gt2_Amblyopia gt2_Amenorrhea gt2_Amnesia gt2_Anal fissure gt2_Anaphylaxis gt2_Anemia gt2_Angina pectoris gt2_Angioedema gt2_Angular cheilitis gt2_Anosmia gt2_Anxiety gt2_Aphasia gt2_Aphonia gt2_Apnea gt2_Arthralgia gt2_Arthritis gt2_Ascites gt2_Asperger syndrome gt2_Asphyxia gt2_Asthma gt2_Astigmatism gt2_Ataxia gt2_Atheroma gt2_Attention deficit hyperactivity disorder gt2_Auditory hallucination gt2_Autoimmune disease gt2_Avoidant personality disorder gt2_Back pain gt2_Bacterial vaginosis gt2_Balance disorder gt2_Beau's lines gt2_Bell's palsy gt2_Biliary colic gt2_Binge eating gt2_Bleeding gt2_Bleeding on probing gt2_Blepharospasm gt2_Bloating gt2_Blood in stool gt2_Blurred vision gt2_Blushing gt2_Boil gt2_Bone fracture gt2_Bone tumor gt2_Bowel obstruction gt2_Bradycardia gt2_Braxton Hicks contractions gt2_Breakthrough bleeding gt2_Breast pain gt2_Bronchitis gt2_Bruise gt2_Bruxism gt2_Bunion gt2_Burn gt2_Burning Chest Pain gt2_Burning mouth syndrome gt2_Candidiasis gt2_Canker sore gt2_Cardiac arrest gt2_Carpal tunnel syndrome gt2_Cataplexy gt2_Cataract gt2_Chancre gt2_Cheilitis gt2_Chest pain gt2_Chills gt2_Chorea gt2_Chronic pain gt2_Cirrhosis gt2_Cleft lip and cleft palate gt2_Clouding of consciousness gt2_Cluster headache gt2_Colitis gt2_Coma gt2_Common cold gt2_Compulsive behavior gt2_Compulsive hoarding gt2_Confusion gt2_Congenital heart defect gt2_Conjunctivitis gt2_Constipation gt2_Convulsion gt2_Cough gt2_Crackles gt2_Cramp gt2_Crepitus gt2_Croup gt2_Cyanosis gt2_Dandruff gt2_Delayed onset muscle soreness gt2_Dementia gt2_Dentin hypersensitivity gt2_Depersonalization gt2_Depression gt2_Dermatitis gt2_Desquamation gt2_Developmental disability gt2_Diabetes gt2_Diabetic ketoacidosis gt2_Diarrhea gt2_Dizziness gt2_Dry eye syndrome gt2_Dysautonomia gt2_Dysgeusia gt2_Dysmenorrhea gt2_Dyspareunia gt2_Dysphagia gt2_Dysphoria gt2_Dystonia gt2_Dysuria gt2_Ear pain gt2_Eczema gt2_Edema gt2_Encephalitis gt2_Encephalopathy gt2_Epidermoid cyst gt2_Epilepsy gt2_Epiphora gt2_Erectile dysfunction gt2_Erythema gt2_Erythema chronicum migrans gt2_Esophagitis gt2_Excessive daytime sleepiness gt2_Eye pain gt2_Eye strain gt2_Facial nerve paralysis gt2_Facial swelling gt2_Fasciculation gt2_Fatigue gt2_Fatty liver disease gt2_Fecal incontinence gt2_Fever gt2_Fibrillation gt2_Fibrocystic breast changes gt2_Fibromyalgia gt2_Flatulence gt2_Floater gt2_Focal seizure gt2_Folate deficiency gt2_Food craving gt2_Food intolerance gt2_Frequent urination gt2_Gastroesophageal reflux disease gt2_Gastroparesis gt2_Generalized anxiety disorder gt2_Generalized tonic–clonic seizure gt2_Genital wart gt2_Gingival recession gt2_Gingivitis gt2_Globus pharyngis gt2_Goitre gt2_Gout gt2_Grandiosity gt2_Granuloma gt2_Guilt gt2_Hair loss gt2_Halitosis gt2_Hay fever gt2_Headache gt2_Heart arrhythmia gt2_Heart murmur gt2_Heartburn gt2_Hematochezia gt2_Hematoma gt2_Hematuria gt2_Hemolysis gt2_Hemoptysis gt2_Hemorrhoids gt2_Hepatic encephalopathy gt2_Hepatitis gt2_Hepatotoxicity gt2_Hiccup gt2_Hip pain gt2_Hives gt2_Hot flash gt2_Hydrocephalus gt2_Hypercalcaemia gt2_Hypercapnia gt2_Hypercholesterolemia gt2_Hyperemesis gravidarum gt2_Hyperglycemia gt2_Hyperkalemia gt2_Hyperlipidemia gt2_Hypermobility gt2_Hyperpigmentation gt2_Hypersomnia gt2_Hypertension gt2_Hyperthermia gt2_Hyperthyroidism gt2_Hypertriglyceridemia gt2_Hypertrophy gt2_Hyperventilation gt2_Hypocalcaemia gt2_Hypochondriasis gt2_Hypoglycemia gt2_Hypogonadism gt2_Hypokalemia gt2_Hypomania gt2_Hyponatremia gt2_Hypotension gt2_Hypothyroidism gt2_Hypoxemia gt2_Hypoxia gt2_Impetigo gt2_Implantation bleeding gt2_Impulsivity gt2_Indigestion gt2_Infection gt2_Inflammation gt2_Inflammatory bowel disease gt2_Ingrown hair gt2_Insomnia gt2_Insulin resistance gt2_Intermenstrual bleeding gt2_Intracranial pressure gt2_Iron deficiency gt2_Irregular menstruation gt2_Itch gt2_Jaundice gt2_Kidney failure gt2_Kidney stone gt2_Knee Pain gt2_Kyphosis gt2_Lactose intolerance gt2_Laryngitis gt2_Leg cramps gt2_Lesion gt2_Leukorrhea gt2_Lightheadedness gt2_Low back pain gt2_Low-grade fever gt2_Lymphedema gt2_Major depressive disorder gt2_Malabsorption gt2_Male infertility gt2_Manic Disorder gt2_Melasma gt2_Melena gt2_Meningitis gt2_Menorrhagia gt2_Middle back pain gt2_Migraine gt2_Milium gt2_Mitral insufficiency gt2_Mood disorder gt2_Mood swing gt2_Morning sickness gt2_Motion sickness gt2_Mouth ulcer gt2_Muscle atrophy gt2_Muscle weakness gt2_Myalgia gt2_Mydriasis gt2_Myocardial infarction gt2_Myoclonus gt2_Nasal congestion gt2_Nasal polyp gt2_Nausea gt2_Neck mass gt2_Neck pain gt2_Neonatal jaundice gt2_Nerve injury gt2_Neuralgia gt2_Neutropenia gt2_Night sweats gt2_Night terror gt2_Nocturnal enuresis gt2_Nodule gt2_Nosebleed gt2_Nystagmus gt2_Obesity gt2_Onychorrhexis gt2_Oral candidiasis gt2_Orthostatic hypotension gt2_Osteopenia gt2_Osteophyte gt2_Osteoporosis gt2_Otitis gt2_Otitis externa gt2_Otitis media gt2_Pain gt2_Palpitations gt2_Panic attack gt2_Papule gt2_Paranoia gt2_Paresthesia gt2_Pelvic inflammatory disease gt2_Pericarditis gt2_Periodontal disease gt2_Periorbital puffiness gt2_Peripheral neuropathy gt2_Perspiration gt2_Petechia gt2_Phlegm gt2_Photodermatitis gt2_Photophobia gt2_Photopsia gt2_Pleural effusion gt2_Pleurisy gt2_Pneumonia gt2_Podalgia gt2_Polycythemia gt2_Polydipsia gt2_Polyneuropathy gt2_Polyuria gt2_Poor posture gt2_Post-nasal drip gt2_Postural orthostatic tachycardia syndrome gt2_Prediabetes gt2_Proteinuria gt2_Pruritus ani gt2_Psychosis gt2_Ptosis gt2_Pulmonary edema gt2_Pulmonary hypertension gt2_Purpura gt2_Pus gt2_Pyelonephritis gt2_Radiculopathy gt2_Rectal pain gt2_Rectal prolapse gt2_Red eye gt2_Renal colic gt2_Restless legs syndrome gt2_Rheum gt2_Rhinitis gt2_Rhinorrhea gt2_Rosacea gt2_Round ligament pain gt2_Rumination gt2_Scar gt2_Sciatica gt2_Scoliosis gt2_Seborrheic dermatitis gt2_Self-harm gt2_Sensitivity to sound gt2_Sexual dysfunction gt2_Shallow breathing gt2_Sharp pain gt2_Shivering gt2_Shortness of breath gt2_Shyness gt2_Sinusitis gt2_Skin condition gt2_Skin rash gt2_Skin tag gt2_Skin ulcer gt2_Sleep apnea gt2_Sleep deprivation gt2_Sleep disorder gt2_Snoring gt2_Sore throat gt2_Spasticity gt2_Splenomegaly gt2_Sputum gt2_Stomach rumble gt2_Strabismus gt2_Stretch marks gt2_Stridor gt2_Stroke gt2_Stuttering gt2_Subdural hematoma gt2_Suicidal ideation gt2_Swelling gt2_Swollen feet gt2_Swollen lymph nodes gt2_Syncope gt2_Tachycardia gt2_Tachypnea gt2_Telangiectasia gt2_Tenderness gt2_Testicular pain gt2_Throat irritation gt2_Thrombocytopenia gt2_Thyroid nodule gt2_Tic gt2_Tinnitus gt2_Tonsillitis gt2_Toothache gt2_Tremor gt2_Trichoptilosis gt2_Tumor gt2_Type 2 diabetes gt2_Unconsciousness gt2_Underweight gt2_Upper respiratory tract infection gt2_Urethritis gt2_Urinary incontinence gt2_Urinary tract infection gt2_Urinary urgency gt2_Uterine contraction gt2_Vaginal bleeding gt2_Vaginal discharge gt2_Vaginitis gt2_Varicose veins gt2_Vasculitis gt2_Ventricular fibrillation gt2_Ventricular tachycardia gt2_Vertigo gt2_Viral pneumonia gt2_Visual acuity gt2_Vomiting gt2_Wart gt2_Water retention gt2_Weakness gt2_Weight gain gt2_Wheeze gt2_Xeroderma gt2_Xerostomia gt2_Yawn gt2_hyperhidrosis gt2_pancreatitis neighbor_Alabama neighbor_Alaska neighbor_Arizona neighbor_Arkansas neighbor_California neighbor_Colorado neighbor_Connecticut neighbor_Delaware neighbor_Florida neighbor_Georgia neighbor_Hawaii neighbor_Idaho neighbor_Illinois neighbor_Indiana neighbor_Iowa neighbor_Kansas neighbor_Kentucky neighbor_Louisiana neighbor_Maine neighbor_Maryland neighbor_Massachusetts neighbor_Michigan neighbor_Minnesota neighbor_Missouri neighbor_Montana neighbor_Nebraska neighbor_Nevada neighbor_New Hampshire neighbor_New Jersey neighbor_New Mexico neighbor_New York neighbor_North Carolina neighbor_North Dakota neighbor_Ohio neighbor_Oklahoma neighbor_Oregon neighbor_Pennsylvania neighbor_Rhode Island neighbor_South Carolina neighbor_South Dakota neighbor_Tennessee neighbor_Texas neighbor_Utah neighbor_Vermont neighbor_Virginia neighbor_Washington neighbor_West Virginia neighbor_Wisconsin neighbor_Wyoming
0 2020-01-01 NaN NaN NaN NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0 0.0 0 0.0 0.0 0.0 0 0.000000 0.0 0 4486.684450 2.77 4.82 7.99 0.23 0.78 NaN NaN 4.56 NaN 9.14 0.13 NaN 0.41 0.26 0.28 1.59 0.81 0.14 NaN NaN 6.22 0.18 NaN 0.44 0.39 4.85 NaN 0.72 NaN 1.59 0.19 NaN 0.14 2.11 NaN 1.29 0.14 5.30 1.88 NaN NaN 0.17 0.19 0.22 4.56 NaN 0.41 2.17 0.32 0.34 0.22 1.10 2.43 0.18 0.20 0.54 0.55 0.91 0.56 1.22 1.65 0.19 0.45 1.93 NaN 0.19 3.18 0.99 0.47 0.84 NaN 0.45 0.27 0.32 1.56 0.45 0.16 0.79 0.50 0.15 NaN 0.20 0.36 0.27 15.29 0.49 NaN 0.19 0.16 0.92 3.63 0.25 9.27 NaN 3.61 NaN NaN NaN 0.95 0.41 1.85 0.26 NaN 3.95 2.14 0.43 NaN 6.78 0.16 3.79 1.70 0.44 NaN NaN 1.28 0.16 0.14 0.15 0.16 0.20 0.93 0.14 0.68 NaN 0.19 0.17 1.95 NaN 3.03 0.43 NaN NaN 0.30 0.14 0.21 NaN NaN 0.81 4.59 0.30 NaN 4.03 0.37 0.15 0.47 2.78 0.21 NaN 0.16 0.61 0.43 0.54 3.82 0.16 0.83 0.23 1.07 NaN 0.38 NaN NaN 1.20 NaN 0.2 0.30 4.89 0.49 1.38 5.13 1.49 NaN 2.96 0.20 0.28 0.36 NaN 0.16 1.61 0.16 0.70 NaN 0.52 0.84 1.08 0.53 NaN NaN NaN 0.90 NaN 0.90 0.50 NaN NaN 0.83 NaN 6.24 NaN 0.71 0.14 NaN NaN NaN NaN 0.61 0.69 0.63 NaN 0.16 1.04 1.23 NaN NaN 0.19 1.43 0.28 1.69 21.59 5.24 0.82 0.81 2.51 0.33 0.23 0.20 1.32 0.50 7.22 0.31 0.99 1.28 1.38 NaN 0.54 0.27 0.43 0.64 0.30 0.26 2.82 0.15 0.15 3.17 NaN 0.16 0.47 NaN 0.25 0.47 0.81 0.46 4.39 0.26 NaN 0.81 0.25 1.47 0.29 0.87 NaN 0.14 0.44 0.20 2.01 0.14 2.88 NaN 2.53 NaN 1.65 0.16 0.23 0.18 NaN 0.35 NaN NaN 0.26 0.72 NaN 5.15 NaN 0.48 0.23 NaN 0.32 0.45 1.38 0.48 1.03 36.13 0.53 1.25 0.15 0.17 2.05 0.69 NaN 0.56 0.80 0.79 1.40 NaN 0.84 NaN NaN 0.16 0.21 0.28 2.11 0.85 NaN NaN NaN NaN 0.55 0.18 0.20 0.43 0.25 0.30 1.24 NaN NaN 0.19 0.14 0.66 0.44 0.28 NaN 0.17 0.26 0.16 0.60 0.15 0.23 0.46 0.50 0.15 NaN 2.26 1.16 0.26 0.27 0.28 0.14 0.75 NaN 0.91 0.18 0.78 0.30 2.55 3.78 6.34 0.44 1.81 1.07 0.29 2.05 0.97 2.52 NaN NaN 0.17 NaN 0.17 0.43 NaN 2.65 NaN NaN 0.40 5.24 0.43 0.72 0.52 0.81 NaN 0.15 0.20 0.32 0.26 0.16 NaN 0.23 1.01 0.67 1.38 0.65 0.16 0.76 3.32 NaN 0.44 0.28 0.25 1.28 4.47 NaN 0.91 0.79 3.77 1.04 0.33 NaN NaN NaN 0.86 0.14 0.56 3.46 1.22 0.20 0.31 3.36 0.39 2.36 0.32 NaN 0.40 0.46 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 2020-01-02 NaN NaN NaN NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1904.56187 0.0 0 0.0 0 0.0 0.0 0.0 0 1906.943228 0.0 0 1904.559254 2.71 4.98 8.64 0.26 0.85 NaN NaN 3.74 NaN 10.44 NaN 0.20 0.31 0.26 0.32 1.93 0.99 0.22 NaN NaN 6.84 0.19 NaN 0.41 0.31 5.08 NaN 0.76 0.18 2.37 0.26 NaN 0.18 2.92 NaN 1.38 NaN 5.83 2.16 NaN NaN 0.38 0.27 0.18 4.46 0.13 0.52 2.26 0.49 0.36 0.16 1.11 2.65 0.18 0.38 0.57 0.39 0.86 0.65 1.29 1.41 0.25 0.41 1.72 NaN NaN 3.57 0.91 0.46 0.93 0.12 0.54 0.31 0.39 1.65 0.39 NaN 0.99 0.64 0.17 0.12 0.22 0.52 0.27 15.25 0.58 0.19 0.18 0.22 0.97 3.56 0.26 9.74 NaN 3.43 NaN 0.20 NaN 1.05 0.46 1.85 0.19 NaN 4.20 2.32 0.36 0.18 8.24 0.23 3.82 1.66 0.51 0.12 0.17 1.48 0.23 0.35 0.14 0.12 0.24 0.84 0.27 0.92 0.22 0.24 0.31 2.43 NaN 3.52 0.39 NaN 0.12 0.28 0.17 0.23 0.26 0.14 0.82 4.88 0.46 0.13 4.05 0.48 0.13 0.62 2.41 0.20 0.17 0.22 0.60 0.58 0.75 4.00 0.18 1.03 0.13 1.10 0.12 0.42 NaN 0.13 1.19 0.13 NaN 0.28 4.76 0.60 1.59 4.68 1.95 0.18 3.29 0.21 0.29 0.53 NaN 0.15 1.54 0.16 0.95 0.17 0.46 0.93 1.06 0.83 0.15 NaN NaN 1.47 NaN 1.04 0.64 0.19 0.17 0.80 NaN 8.67 0.23 1.03 0.25 0.24 NaN NaN NaN 0.64 0.76 0.64 NaN 0.26 1.25 1.64 0.19 0.13 0.23 1.49 0.41 1.63 25.42 5.73 0.94 0.81 2.95 0.35 0.29 0.23 1.42 0.55 7.43 0.46 1.32 1.46 1.40 NaN 0.66 0.29 0.37 0.84 0.20 0.52 2.91 0.16 0.24 3.99 0.15 0.15 0.52 0.17 0.32 0.71 0.93 0.48 4.72 0.19 NaN 0.99 0.17 1.28 0.32 0.74 NaN 0.27 0.58 0.23 2.38 0.14 2.70 NaN 2.44 0.13 1.55 0.20 0.28 0.20 NaN 0.35 NaN 0.22 0.29 0.70 NaN 6.06 NaN 0.65 0.29 0.24 0.21 0.76 1.50 0.56 1.08 37.69 0.47 1.32 0.12 0.14 1.99 0.83 0.14 0.75 0.66 0.86 1.68 NaN 0.80 NaN NaN 0.13 0.33 0.19 2.07 0.99 0.20 NaN 0.14 0.13 0.60 0.24 0.20 0.56 0.23 0.34 1.59 0.14 0.13 0.37 0.17 0.71 0.59 0.37 NaN 0.17 0.31 0.16 0.63 0.12 0.23 0.51 0.37 0.19 NaN 2.79 1.17 0.30 0.29 0.21 NaN 0.77 NaN 0.88 0.16 0.82 0.21 2.67 4.38 6.49 0.41 1.87 1.32 0.36 2.42 0.91 2.69 NaN 0.12 0.22 NaN 0.26 0.45 NaN 2.66 0.14 NaN 0.36 5.24 0.53 1.05 0.65 0.86 NaN 0.17 0.19 0.32 0.33 0.37 0.14 0.20 0.90 0.73 1.18 0.60 0.22 1.07 4.47 0.22 0.58 0.32 0.35 1.41 5.18 NaN 0.84 0.75 3.93 1.15 0.33 NaN NaN 0.17 1.07 0.19 0.45 3.10 1.36 0.21 0.49 3.56 0.42 2.41 0.37 0.16 0.45 0.33 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 2020-01-03 NaN NaN NaN NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0 0.0 0 0.0 0.0 0.0 0 0.000000 0.0 0 3842.010974 2.44 4.85 7.91 0.35 0.80 0.12 NaN 3.50 NaN 9.60 0.18 0.17 0.27 0.26 0.39 2.00 0.89 0.23 NaN NaN 6.74 0.15 0.12 0.46 0.34 4.60 0.21 0.84 0.12 2.12 0.25 0.19 0.13 2.84 NaN 1.25 NaN 5.34 2.07 0.12 NaN 0.18 0.25 0.17 4.17 0.14 0.49 2.15 0.45 0.42 0.16 1.06 2.79 0.17 0.17 0.41 0.53 0.85 0.61 1.13 1.47 0.19 0.43 1.67 NaN NaN 3.53 0.91 0.32 0.73 NaN 0.54 0.23 0.31 1.47 0.35 NaN 0.85 0.66 NaN NaN 0.23 0.44 0.28 14.37 0.53 0.14 0.21 0.20 1.00 3.40 0.32 9.18 0.12 3.40 NaN 0.19 0.15 1.19 0.45 1.74 0.25 NaN 3.71 2.27 0.36 NaN 8.24 0.21 3.67 1.51 0.59 NaN NaN 1.49 NaN 0.28 0.19 NaN 0.23 0.85 0.16 0.83 0.20 0.17 0.28 2.31 NaN 3.44 0.45 NaN NaN 0.22 0.22 0.23 0.18 0.19 0.74 4.35 0.46 0.19 3.87 0.41 0.15 0.48 2.92 0.20 NaN NaN 0.57 0.50 0.59 4.07 NaN 1.01 0.24 1.10 0.14 0.31 NaN NaN 1.05 0.14 NaN 0.15 4.48 0.68 1.57 4.12 1.85 NaN 3.10 0.26 0.37 0.39 NaN 0.17 1.52 NaN 0.87 0.12 0.47 0.84 1.13 0.73 0.12 NaN NaN 1.42 0.14 1.14 0.56 0.20 0.12 0.79 NaN 7.81 0.16 0.90 0.23 0.16 NaN 0.13 NaN 0.68 0.69 0.72 NaN 0.25 1.20 1.54 0.17 0.17 0.19 1.54 0.29 1.69 24.01 5.54 0.90 0.75 2.97 0.31 0.28 0.23 1.35 0.53 7.07 0.42 1.29 1.48 1.27 NaN 0.62 0.12 0.42 0.95 0.24 0.46 2.89 0.16 0.22 3.70 0.24 0.18 0.61 0.15 0.26 0.56 0.69 0.43 4.26 0.24 0.13 0.93 0.21 1.36 0.36 0.63 NaN 0.30 0.51 NaN 2.27 NaN 2.63 NaN 2.34 0.18 1.47 NaN 0.27 0.20 0.13 0.40 NaN NaN 0.38 0.57 NaN 5.43 0.20 0.53 0.31 0.14 0.25 0.73 1.41 0.46 0.99 35.84 0.54 1.30 NaN 0.13 1.84 0.82 NaN 0.58 0.72 0.80 1.81 NaN 0.83 NaN NaN 0.15 0.21 0.23 2.12 0.77 0.15 NaN 0.18 0.12 0.55 0.24 0.22 0.56 0.15 0.30 1.59 NaN NaN 0.31 0.25 0.68 0.60 0.37 NaN 0.12 0.34 0.21 0.60 0.14 0.22 0.45 0.36 0.16 0.15 2.41 1.18 0.40 0.36 0.20 NaN 0.66 NaN 0.95 0.16 0.72 0.21 2.45 4.14 6.04 0.47 1.78 1.28 0.23 2.30 0.90 2.42 NaN 0.15 0.25 0.13 0.31 0.47 NaN 2.50 0.17 NaN 0.29 5.12 0.53 1.05 0.47 0.85 0.12 0.17 0.21 0.33 0.24 0.36 0.13 0.24 0.90 0.74 1.31 0.58 0.12 0.86 4.48 0.15 0.54 0.34 0.32 1.39 5.21 0.16 0.94 0.69 3.63 1.06 0.36 NaN NaN 0.23 0.89 0.27 0.35 3.04 1.36 0.31 0.43 3.06 0.35 2.01 0.29 0.15 0.41 0.24 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 2020-01-04 NaN NaN NaN NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0 0.0 0 0.0 0.0 0.0 0 0.000000 0.0 0 7949.995496 2.71 4.80 8.31 0.22 0.67 NaN NaN 3.51 NaN 9.22 NaN NaN 0.34 0.28 0.26 1.62 0.84 0.20 0.17 NaN 6.34 0.19 NaN 0.39 0.32 4.81 NaN 0.85 NaN 1.72 0.21 NaN NaN 2.30 NaN 1.23 NaN 4.90 2.01 0.13 NaN 0.20 0.23 0.17 3.82 0.15 0.46 2.18 0.48 0.30 0.18 1.13 2.33 NaN 0.19 0.46 0.56 0.85 0.52 1.06 1.31 0.26 0.43 1.64 NaN 0.15 3.36 1.11 0.38 0.82 NaN 0.43 0.22 0.32 1.55 0.49 NaN 0.72 0.67 NaN NaN 0.22 0.49 0.20 14.13 0.44 NaN 0.24 NaN 0.95 3.80 0.21 8.22 NaN 3.71 NaN 0.19 NaN 1.31 0.40 1.81 0.20 NaN 3.59 2.46 0.45 NaN 7.01 0.21 3.85 1.47 0.60 NaN NaN 1.46 0.13 0.33 NaN NaN 0.14 0.90 0.16 0.65 0.18 0.15 0.22 1.91 NaN 3.29 0.44 NaN NaN 0.29 0.21 0.21 0.14 0.13 0.77 4.39 0.42 0.14 3.66 0.38 NaN 0.57 2.53 0.17 NaN 0.20 0.51 0.40 0.55 3.99 0.24 0.83 0.13 1.02 NaN 0.36 NaN NaN 1.11 NaN NaN 0.39 5.18 0.62 1.42 4.18 1.61 NaN 3.21 0.25 0.23 0.36 NaN NaN 1.59 NaN 0.65 NaN 0.68 0.76 1.01 0.69 0.17 NaN NaN 1.02 NaN 0.92 0.50 NaN 0.15 0.81 NaN 6.52 NaN 0.77 NaN 0.15 NaN NaN NaN 0.60 0.76 0.63 NaN 0.16 1.08 1.37 0.13 0.14 0.21 1.31 0.26 1.53 22.57 5.27 0.55 0.85 2.86 0.30 0.23 0.22 1.25 0.43 7.38 0.35 1.05 1.38 1.28 NaN 0.61 NaN 0.56 0.76 0.28 0.57 2.72 0.14 0.18 3.12 NaN NaN 0.51 0.17 0.26 0.60 0.69 0.38 4.07 0.23 NaN 0.67 0.20 1.11 0.18 0.80 0.17 0.22 0.46 0.19 1.92 NaN 2.60 NaN 2.46 NaN 1.50 0.16 0.22 0.22 NaN 0.34 NaN 0.16 0.25 0.49 NaN 4.74 NaN 0.50 0.22 NaN 0.25 0.53 1.48 0.46 1.01 35.48 0.50 1.37 0.15 0.14 1.81 0.62 NaN 0.54 0.80 0.74 1.72 NaN 0.81 NaN 0.15 0.14 0.21 0.23 1.81 0.93 0.18 NaN 0.18 0.13 0.61 0.18 0.15 0.46 0.23 0.30 1.44 NaN NaN 0.23 0.18 0.66 0.55 0.32 0.13 0.22 0.27 0.14 0.67 0.14 0.25 0.43 0.30 NaN NaN 2.48 1.05 0.29 0.40 0.21 NaN 0.72 NaN 0.97 0.18 0.69 0.25 2.38 3.84 6.17 0.45 1.64 1.00 0.38 2.49 0.85 2.16 NaN 0.16 0.23 0.14 0.25 0.62 NaN 2.08 0.16 NaN 0.35 4.96 0.50 0.85 0.49 0.70 NaN 0.17 0.21 0.29 0.33 0.29 NaN 0.21 0.92 0.73 1.19 0.63 0.20 0.79 3.68 0.23 0.53 0.30 0.28 1.12 4.67 NaN 1.00 0.78 3.71 1.05 0.33 NaN NaN 0.13 0.99 NaN 0.28 3.03 1.19 0.22 0.43 3.12 0.33 2.53 0.37 0.23 0.44 0.29 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 2020-01-05 NaN NaN NaN NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0 0.0 0 0.0 0.0 0.0 0 0.000000 0.0 0 6054.562803 2.90 4.99 8.58 0.23 0.65 NaN NaN 3.30 NaN 8.81 0.16 NaN 0.38 0.29 0.38 1.57 0.88 NaN NaN NaN 6.72 NaN NaN 0.36 0.33 4.77 0.16 0.80 NaN 1.55 0.24 NaN NaN 2.48 NaN 1.21 NaN 5.11 1.80 NaN 0.17 0.33 0.18 0.17 3.99 0.23 0.62 2.40 0.41 0.29 0.25 1.24 3.42 0.34 0.38 0.46 0.64 0.93 0.69 0.93 1.60 NaN 0.35 1.79 NaN 0.25 3.49 0.97 0.46 0.68 NaN 0.40 0.22 0.28 1.55 0.37 NaN 0.73 0.55 NaN NaN 0.25 0.41 0.21 13.39 0.42 NaN 0.24 NaN 0.97 3.83 0.24 8.17 NaN 3.71 NaN 0.18 NaN 1.40 0.40 1.77 0.29 NaN 4.02 2.45 0.41 NaN 7.33 0.21 3.50 1.63 0.47 NaN NaN 1.41 0.13 0.23 NaN NaN 0.27 0.72 NaN 0.75 NaN 0.23 0.20 2.10 NaN 2.78 0.45 NaN NaN 0.30 0.14 0.24 0.20 0.13 0.80 4.74 0.42 0.14 3.71 0.41 NaN 0.56 2.56 0.19 NaN 0.16 0.65 0.55 0.63 3.83 0.31 1.04 0.16 1.16 0.14 0.38 0.18 NaN 1.01 NaN NaN 0.31 4.90 0.61 1.46 4.65 1.56 0.15 3.04 0.20 0.23 0.39 NaN 0.18 1.41 NaN 0.60 NaN 0.52 0.74 1.08 0.66 NaN NaN NaN 0.99 NaN 1.00 0.43 0.15 0.14 0.95 NaN 6.84 NaN 0.76 0.15 0.19 NaN NaN NaN 0.62 0.75 0.54 NaN 0.18 1.13 1.41 0.17 NaN 0.18 1.53 0.23 1.49 22.45 5.30 0.79 0.85 3.06 0.34 0.40 0.23 1.27 0.50 7.56 0.34 1.04 1.32 1.40 NaN 0.69 0.13 0.42 0.72 0.29 0.39 2.72 0.17 0.18 3.05 NaN NaN 0.48 0.15 0.16 0.42 0.86 0.31 4.25 0.24 NaN 0.94 0.20 1.21 0.24 0.82 NaN NaN 0.50 0.13 2.04 NaN 2.44 NaN 2.35 NaN 1.55 0.13 0.28 0.24 NaN 0.35 NaN NaN 0.18 0.49 0.13 4.91 0.17 0.51 0.29 NaN 0.21 0.52 1.39 0.34 0.82 36.06 0.42 1.36 0.13 0.17 1.90 0.65 NaN 0.71 0.83 0.68 1.82 NaN 0.80 NaN NaN NaN 0.18 0.35 1.62 0.92 NaN NaN 0.14 0.18 0.54 0.17 NaN 0.60 0.20 0.18 1.44 NaN NaN 0.23 NaN 0.71 0.61 0.34 NaN 0.14 0.25 0.14 0.60 0.13 0.20 0.51 0.43 0.18 NaN 2.86 1.07 0.25 0.25 0.22 NaN 0.66 NaN 0.98 0.19 0.69 0.28 2.25 3.84 6.10 0.44 1.84 0.98 0.33 2.52 0.99 2.34 NaN NaN 0.19 0.14 0.26 0.53 0.14 2.15 0.15 0.16 0.31 5.25 0.47 0.92 0.53 0.80 0.13 0.23 0.21 0.36 0.28 0.18 NaN 0.18 0.91 0.68 1.29 0.60 0.16 0.90 3.93 0.23 0.55 0.16 0.23 1.25 4.88 NaN 0.85 0.74 3.75 1.08 0.36 NaN 0.18 0.17 0.98 0.19 0.29 3.06 1.46 0.27 0.39 3.45 0.25 2.38 0.35 NaN 0.57 0.28 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

4.3. Merge files and clean data (handling of NAs)¶

Return to contents

After reading each of the 50 CSV files, we included a “State” column, which corresponded to the name of the State for each file. Then, we cleaned the data. We decided to keep negative values as the data reported by Google Trends are Z-scores, which can include negative values. For filling out the NAs, we evaluated different alternatives (replacing them by zero, mean values, and interpolation). We decided not to replace NA by zeros. For instance, the number of COVID cases on 25 December 2021 is stated as NAs in California. It is unlikely there were zero cases during Christmas, but likely that there was not reporting that day due to the festivities. We also discarded using mean values to replace NAs, as trend graphs of the predictors with missing values showed abrupt changes when using them. So, we finally decided to use interpolation to replace NAs. We tried different interpolation methods: linear, spline (degrees 2 to 4) and polynomial (degrees 2 to 4). We selected linear based on a visual inspection of the trend graphs of the predictors with missing values (minimal abrupt changes in comparison to the other types of interpolation). After cleaning each file, we merged them in one large database to make all the data ready and easy to use for modelling.

In [ ]:
# Include state name as column in each csv file
n=0
for i in list_files:
    i["State_Name"]=file_names[n]
    n=n+1

# Print file before cleaning
print("\nNumber of NAs per column of the file of one State before cleaning:")
display(np.transpose(pd.DataFrame(list_files[0].isnull().sum())))

list_files_clean=[]
#Clean data 
for i in list_files:
    i.interpolate('index', limit_direction='both', inplace=True)
    list_files_clean.append(i) 

# Print file after cleaning
print("\nNumber of NAs per column of the file of one State after cleaning:")
display(np.transpose(pd.DataFrame(list_files_clean[0].isnull().sum())))
Number of NAs per column of the file of one State before cleaning:
date JHU_cases JHU_deaths JHU_hospitalizations up2date gt_after covid vaccine gt_side effects of vaccine gt_effects of covid vaccine gt_covid gt_how long does covid last gt_anosmia gt_loss smell gt_covid-19 gt_loss taste gt_loss of smell gt_chest pain gt_covid symptoms gt_sars-cov 2 gt_chest tightness gt_covid nhs gt_quarantine gt_covid-19 who gt_sars-cov-2 gt_feeling exhausted gt_nose bleed gt_feeling tired gt_joints aching gt_fever gt2_Abdominal obesity gt2_Abdominal pain gt2_Acne gt2_Actinic keratosis gt2_Acute bronchitis gt2_Adrenal crisis gt2_Ageusia gt2_Alcoholism gt2_Allergic conjunctivitis gt2_Allergy gt2_Amblyopia gt2_Amenorrhea gt2_Amnesia gt2_Anal fissure gt2_Anaphylaxis gt2_Anemia gt2_Angina pectoris gt2_Angioedema gt2_Angular cheilitis gt2_Anosmia gt2_Anxiety gt2_Aphasia gt2_Aphonia gt2_Apnea gt2_Arthralgia gt2_Arthritis gt2_Ascites gt2_Asperger syndrome gt2_Asphyxia gt2_Asthma gt2_Astigmatism gt2_Ataxia gt2_Atheroma gt2_Attention deficit hyperactivity disorder gt2_Auditory hallucination gt2_Autoimmune disease gt2_Avoidant personality disorder gt2_Back pain gt2_Bacterial vaginosis gt2_Balance disorder gt2_Beau's lines gt2_Bell's palsy gt2_Biliary colic gt2_Binge eating gt2_Bleeding gt2_Bleeding on probing gt2_Blepharospasm gt2_Bloating gt2_Blood in stool gt2_Blurred vision gt2_Blushing gt2_Boil gt2_Bone fracture gt2_Bone tumor gt2_Bowel obstruction gt2_Bradycardia gt2_Braxton Hicks contractions gt2_Breakthrough bleeding gt2_Breast pain gt2_Bronchitis gt2_Bruise gt2_Bruxism gt2_Bunion gt2_Burn gt2_Burning Chest Pain gt2_Burning mouth syndrome gt2_Candidiasis gt2_Canker sore gt2_Cardiac arrest gt2_Carpal tunnel syndrome gt2_Cataplexy gt2_Cataract gt2_Chancre gt2_Cheilitis gt2_Chest pain gt2_Chills gt2_Chorea gt2_Chronic pain gt2_Cirrhosis gt2_Cleft lip and cleft palate gt2_Clouding of consciousness gt2_Cluster headache gt2_Colitis gt2_Coma gt2_Common cold gt2_Compulsive behavior gt2_Compulsive hoarding gt2_Confusion gt2_Congenital heart defect gt2_Conjunctivitis gt2_Constipation gt2_Convulsion gt2_Cough gt2_Crackles gt2_Cramp gt2_Crepitus gt2_Croup gt2_Cyanosis gt2_Dandruff gt2_Delayed onset muscle soreness gt2_Dementia gt2_Dentin hypersensitivity gt2_Depersonalization gt2_Depression gt2_Dermatitis gt2_Desquamation gt2_Developmental disability gt2_Diabetes gt2_Diabetic ketoacidosis gt2_Diarrhea gt2_Dizziness gt2_Dry eye syndrome gt2_Dysautonomia gt2_Dysgeusia gt2_Dysmenorrhea gt2_Dyspareunia gt2_Dysphagia gt2_Dysphoria gt2_Dystonia gt2_Dysuria gt2_Ear pain gt2_Eczema gt2_Edema gt2_Encephalitis gt2_Encephalopathy gt2_Epidermoid cyst gt2_Epilepsy gt2_Epiphora gt2_Erectile dysfunction gt2_Erythema gt2_Erythema chronicum migrans gt2_Esophagitis gt2_Excessive daytime sleepiness gt2_Eye pain gt2_Eye strain gt2_Facial nerve paralysis gt2_Facial swelling gt2_Fasciculation gt2_Fatigue gt2_Fatty liver disease gt2_Fecal incontinence gt2_Fever gt2_Fibrillation gt2_Fibrocystic breast changes gt2_Fibromyalgia gt2_Flatulence gt2_Floater gt2_Focal seizure gt2_Folate deficiency gt2_Food craving gt2_Food intolerance gt2_Frequent urination gt2_Gastroesophageal reflux disease gt2_Gastroparesis gt2_Generalized anxiety disorder gt2_Generalized tonic–clonic seizure gt2_Genital wart gt2_Gingival recession gt2_Gingivitis gt2_Globus pharyngis gt2_Goitre gt2_Gout gt2_Grandiosity gt2_Granuloma gt2_Guilt gt2_Hair loss gt2_Halitosis gt2_Hay fever gt2_Headache gt2_Heart arrhythmia gt2_Heart murmur gt2_Heartburn gt2_Hematochezia gt2_Hematoma gt2_Hematuria gt2_Hemolysis gt2_Hemoptysis gt2_Hemorrhoids gt2_Hepatic encephalopathy gt2_Hepatitis gt2_Hepatotoxicity gt2_Hiccup gt2_Hip pain gt2_Hives gt2_Hot flash gt2_Hydrocephalus gt2_Hypercalcaemia gt2_Hypercapnia gt2_Hypercholesterolemia gt2_Hyperemesis gravidarum gt2_Hyperglycemia gt2_Hyperkalemia gt2_Hyperlipidemia gt2_Hypermobility gt2_Hyperpigmentation gt2_Hypersomnia gt2_Hypertension gt2_Hyperthermia gt2_Hyperthyroidism gt2_Hypertriglyceridemia gt2_Hypertrophy gt2_Hyperventilation gt2_Hypocalcaemia gt2_Hypochondriasis gt2_Hypoglycemia gt2_Hypogonadism gt2_Hypokalemia gt2_Hypomania gt2_Hyponatremia gt2_Hypotension gt2_Hypothyroidism gt2_Hypoxemia gt2_Hypoxia gt2_Impetigo gt2_Implantation bleeding gt2_Impulsivity gt2_Indigestion gt2_Infection gt2_Inflammation gt2_Inflammatory bowel disease gt2_Ingrown hair gt2_Insomnia gt2_Insulin resistance gt2_Intermenstrual bleeding gt2_Intracranial pressure gt2_Iron deficiency gt2_Irregular menstruation gt2_Itch gt2_Jaundice gt2_Kidney failure gt2_Kidney stone gt2_Knee Pain gt2_Kyphosis gt2_Lactose intolerance gt2_Laryngitis gt2_Leg cramps gt2_Lesion gt2_Leukorrhea gt2_Lightheadedness gt2_Low back pain gt2_Low-grade fever gt2_Lymphedema gt2_Major depressive disorder gt2_Malabsorption gt2_Male infertility gt2_Manic Disorder gt2_Melasma gt2_Melena gt2_Meningitis gt2_Menorrhagia gt2_Middle back pain gt2_Migraine gt2_Milium gt2_Mitral insufficiency gt2_Mood disorder gt2_Mood swing gt2_Morning sickness gt2_Motion sickness gt2_Mouth ulcer gt2_Muscle atrophy gt2_Muscle weakness gt2_Myalgia gt2_Mydriasis gt2_Myocardial infarction gt2_Myoclonus gt2_Nasal congestion gt2_Nasal polyp gt2_Nausea gt2_Neck mass gt2_Neck pain gt2_Neonatal jaundice gt2_Nerve injury gt2_Neuralgia gt2_Neutropenia gt2_Night sweats gt2_Night terror gt2_Nocturnal enuresis gt2_Nodule gt2_Nosebleed gt2_Nystagmus gt2_Obesity gt2_Onychorrhexis gt2_Oral candidiasis gt2_Orthostatic hypotension gt2_Osteopenia gt2_Osteophyte gt2_Osteoporosis gt2_Otitis gt2_Otitis externa gt2_Otitis media gt2_Pain gt2_Palpitations gt2_Panic attack gt2_Papule gt2_Paranoia gt2_Paresthesia gt2_Pelvic inflammatory disease gt2_Pericarditis gt2_Periodontal disease gt2_Periorbital puffiness gt2_Peripheral neuropathy gt2_Perspiration gt2_Petechia gt2_Phlegm gt2_Photodermatitis gt2_Photophobia gt2_Photopsia gt2_Pleural effusion gt2_Pleurisy gt2_Pneumonia gt2_Podalgia gt2_Polycythemia gt2_Polydipsia gt2_Polyneuropathy gt2_Polyuria gt2_Poor posture gt2_Post-nasal drip gt2_Postural orthostatic tachycardia syndrome gt2_Prediabetes gt2_Proteinuria gt2_Pruritus ani gt2_Psychosis gt2_Ptosis gt2_Pulmonary edema gt2_Pulmonary hypertension gt2_Purpura gt2_Pus gt2_Pyelonephritis gt2_Radiculopathy gt2_Rectal pain gt2_Rectal prolapse gt2_Red eye gt2_Renal colic gt2_Restless legs syndrome gt2_Rheum gt2_Rhinitis gt2_Rhinorrhea gt2_Rosacea gt2_Round ligament pain gt2_Rumination gt2_Scar gt2_Sciatica gt2_Scoliosis gt2_Seborrheic dermatitis gt2_Self-harm gt2_Sensitivity to sound gt2_Sexual dysfunction gt2_Shallow breathing gt2_Sharp pain gt2_Shivering gt2_Shortness of breath gt2_Shyness gt2_Sinusitis gt2_Skin condition gt2_Skin rash gt2_Skin tag gt2_Skin ulcer gt2_Sleep apnea gt2_Sleep deprivation gt2_Sleep disorder gt2_Snoring gt2_Sore throat gt2_Spasticity gt2_Splenomegaly gt2_Sputum gt2_Stomach rumble gt2_Strabismus gt2_Stretch marks gt2_Stridor gt2_Stroke gt2_Stuttering gt2_Subdural hematoma gt2_Suicidal ideation gt2_Swelling gt2_Swollen feet gt2_Swollen lymph nodes gt2_Syncope gt2_Tachycardia gt2_Tachypnea gt2_Telangiectasia gt2_Tenderness gt2_Testicular pain gt2_Throat irritation gt2_Thrombocytopenia gt2_Thyroid nodule gt2_Tic gt2_Tinnitus gt2_Tonsillitis gt2_Toothache gt2_Tremor gt2_Trichoptilosis gt2_Tumor gt2_Type 2 diabetes gt2_Unconsciousness gt2_Underweight gt2_Upper respiratory tract infection gt2_Urethritis gt2_Urinary incontinence gt2_Urinary tract infection gt2_Urinary urgency gt2_Uterine contraction gt2_Vaginal bleeding gt2_Vaginal discharge gt2_Vaginitis gt2_Varicose veins gt2_Vasculitis gt2_Ventricular fibrillation gt2_Ventricular tachycardia gt2_Vertigo gt2_Viral pneumonia gt2_Visual acuity gt2_Vomiting gt2_Wart gt2_Water retention gt2_Weakness gt2_Weight gain gt2_Wheeze gt2_Xeroderma gt2_Xerostomia gt2_Yawn gt2_hyperhidrosis gt2_pancreatitis neighbor_Alabama neighbor_Alaska neighbor_Arizona neighbor_Arkansas neighbor_California neighbor_Colorado neighbor_Connecticut neighbor_Delaware neighbor_Florida neighbor_Georgia neighbor_Hawaii neighbor_Idaho neighbor_Illinois neighbor_Indiana neighbor_Iowa neighbor_Kansas neighbor_Kentucky neighbor_Louisiana neighbor_Maine neighbor_Maryland neighbor_Massachusetts neighbor_Michigan neighbor_Minnesota neighbor_Missouri neighbor_Montana neighbor_Nebraska neighbor_Nevada neighbor_New Hampshire neighbor_New Jersey neighbor_New Mexico neighbor_New York neighbor_North Carolina neighbor_North Dakota neighbor_Ohio neighbor_Oklahoma neighbor_Oregon neighbor_Pennsylvania neighbor_Rhode Island neighbor_South Carolina neighbor_South Dakota neighbor_Tennessee neighbor_Texas neighbor_Utah neighbor_Vermont neighbor_Virginia neighbor_Washington neighbor_West Virginia neighbor_Wisconsin neighbor_Wyoming State_Name
0 0 26 26 176 368 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 13 13 13 15 19 588 473 13 707 13 89 217 13 16 13 13 13 58 709 474 13 60 689 13 14 13 98 13 401 13 18 144 238 13 625 13 125 13 13 282 653 43 19 39 13 151 13 13 13 13 85 13 13 112 17 13 13 13 13 13 13 18 13 13 698 90 13 13 13 13 541 14 40 24 13 17 328 13 13 275 669 28 13 16 13 13 408 21 78 13 13 14 13 675 13 683 177 569 13 13 13 17 703 13 13 13 405 13 26 13 13 13 634 203 13 488 14 137 235 69 13 94 13 111 65 29 13 648 13 13 192 541 23 88 36 62 224 13 13 13 111 13 13 503 13 13 39 503 65 13 13 13 13 139 13 59 13 637 13 572 276 13 169 188 15 13 13 13 13 13 170 13 47 14 13 644 581 13 555 13 617 13 13 13 13 136 506 612 13 560 13 13 164 164 13 666 13 25 13 98 45 389 141 456 13 13 13 352 15 13 13 18 73 70 13 13 13 13 13 13 13 13 13 21 23 13 13 13 13 13 13 13 310 13 644 13 13 32 13 13 65 39 13 354 176 13 50 15 13 13 13 13 22 219 13 29 13 14 13 288 22 13 52 13 694 13 611 13 348 13 51 14 38 243 21 659 317 17 13 626 13 469 13 21 186 166 13 13 13 13 13 13 13 174 202 13 13 608 13 13 13 13 402 13 637 663 178 35 124 13 13 173 709 108 171 13 50 61 13 33 14 13 637 637 13 41 13 13 13 663 344 14 136 13 645 16 13 13 90 682 13 13 23 13 37 666 13 711 13 59 13 15 13 13 13 13 13 13 15 13 13 13 555 229 124 543 18 13 654 13 117 537 14 13 13 13 13 13 575 51 174 16 28 13 250 19 13 13 13 13 355 13 13 39 13 89 24 13 13 674 13 13 13 13 13 224 646 43 13 648 81 13 13 17 13 13 15 13 13 122 13 13 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 0
Number of NAs per column of the file of one State after cleaning:
date JHU_cases JHU_deaths JHU_hospitalizations up2date gt_after covid vaccine gt_side effects of vaccine gt_effects of covid vaccine gt_covid gt_how long does covid last gt_anosmia gt_loss smell gt_covid-19 gt_loss taste gt_loss of smell gt_chest pain gt_covid symptoms gt_sars-cov 2 gt_chest tightness gt_covid nhs gt_quarantine gt_covid-19 who gt_sars-cov-2 gt_feeling exhausted gt_nose bleed gt_feeling tired gt_joints aching gt_fever gt2_Abdominal obesity gt2_Abdominal pain gt2_Acne gt2_Actinic keratosis gt2_Acute bronchitis gt2_Adrenal crisis gt2_Ageusia gt2_Alcoholism gt2_Allergic conjunctivitis gt2_Allergy gt2_Amblyopia gt2_Amenorrhea gt2_Amnesia gt2_Anal fissure gt2_Anaphylaxis gt2_Anemia gt2_Angina pectoris gt2_Angioedema gt2_Angular cheilitis gt2_Anosmia gt2_Anxiety gt2_Aphasia gt2_Aphonia gt2_Apnea gt2_Arthralgia gt2_Arthritis gt2_Ascites gt2_Asperger syndrome gt2_Asphyxia gt2_Asthma gt2_Astigmatism gt2_Ataxia gt2_Atheroma gt2_Attention deficit hyperactivity disorder gt2_Auditory hallucination gt2_Autoimmune disease gt2_Avoidant personality disorder gt2_Back pain gt2_Bacterial vaginosis gt2_Balance disorder gt2_Beau's lines gt2_Bell's palsy gt2_Biliary colic gt2_Binge eating gt2_Bleeding gt2_Bleeding on probing gt2_Blepharospasm gt2_Bloating gt2_Blood in stool gt2_Blurred vision gt2_Blushing gt2_Boil gt2_Bone fracture gt2_Bone tumor gt2_Bowel obstruction gt2_Bradycardia gt2_Braxton Hicks contractions gt2_Breakthrough bleeding gt2_Breast pain gt2_Bronchitis gt2_Bruise gt2_Bruxism gt2_Bunion gt2_Burn gt2_Burning Chest Pain gt2_Burning mouth syndrome gt2_Candidiasis gt2_Canker sore gt2_Cardiac arrest gt2_Carpal tunnel syndrome gt2_Cataplexy gt2_Cataract gt2_Chancre gt2_Cheilitis gt2_Chest pain gt2_Chills gt2_Chorea gt2_Chronic pain gt2_Cirrhosis gt2_Cleft lip and cleft palate gt2_Clouding of consciousness gt2_Cluster headache gt2_Colitis gt2_Coma gt2_Common cold gt2_Compulsive behavior gt2_Compulsive hoarding gt2_Confusion gt2_Congenital heart defect gt2_Conjunctivitis gt2_Constipation gt2_Convulsion gt2_Cough gt2_Crackles gt2_Cramp gt2_Crepitus gt2_Croup gt2_Cyanosis gt2_Dandruff gt2_Delayed onset muscle soreness gt2_Dementia gt2_Dentin hypersensitivity gt2_Depersonalization gt2_Depression gt2_Dermatitis gt2_Desquamation gt2_Developmental disability gt2_Diabetes gt2_Diabetic ketoacidosis gt2_Diarrhea gt2_Dizziness gt2_Dry eye syndrome gt2_Dysautonomia gt2_Dysgeusia gt2_Dysmenorrhea gt2_Dyspareunia gt2_Dysphagia gt2_Dysphoria gt2_Dystonia gt2_Dysuria gt2_Ear pain gt2_Eczema gt2_Edema gt2_Encephalitis gt2_Encephalopathy gt2_Epidermoid cyst gt2_Epilepsy gt2_Epiphora gt2_Erectile dysfunction gt2_Erythema gt2_Erythema chronicum migrans gt2_Esophagitis gt2_Excessive daytime sleepiness gt2_Eye pain gt2_Eye strain gt2_Facial nerve paralysis gt2_Facial swelling gt2_Fasciculation gt2_Fatigue gt2_Fatty liver disease gt2_Fecal incontinence gt2_Fever gt2_Fibrillation gt2_Fibrocystic breast changes gt2_Fibromyalgia gt2_Flatulence gt2_Floater gt2_Focal seizure gt2_Folate deficiency gt2_Food craving gt2_Food intolerance gt2_Frequent urination gt2_Gastroesophageal reflux disease gt2_Gastroparesis gt2_Generalized anxiety disorder gt2_Generalized tonic–clonic seizure gt2_Genital wart gt2_Gingival recession gt2_Gingivitis gt2_Globus pharyngis gt2_Goitre gt2_Gout gt2_Grandiosity gt2_Granuloma gt2_Guilt gt2_Hair loss gt2_Halitosis gt2_Hay fever gt2_Headache gt2_Heart arrhythmia gt2_Heart murmur gt2_Heartburn gt2_Hematochezia gt2_Hematoma gt2_Hematuria gt2_Hemolysis gt2_Hemoptysis gt2_Hemorrhoids gt2_Hepatic encephalopathy gt2_Hepatitis gt2_Hepatotoxicity gt2_Hiccup gt2_Hip pain gt2_Hives gt2_Hot flash gt2_Hydrocephalus gt2_Hypercalcaemia gt2_Hypercapnia gt2_Hypercholesterolemia gt2_Hyperemesis gravidarum gt2_Hyperglycemia gt2_Hyperkalemia gt2_Hyperlipidemia gt2_Hypermobility gt2_Hyperpigmentation gt2_Hypersomnia gt2_Hypertension gt2_Hyperthermia gt2_Hyperthyroidism gt2_Hypertriglyceridemia gt2_Hypertrophy gt2_Hyperventilation gt2_Hypocalcaemia gt2_Hypochondriasis gt2_Hypoglycemia gt2_Hypogonadism gt2_Hypokalemia gt2_Hypomania gt2_Hyponatremia gt2_Hypotension gt2_Hypothyroidism gt2_Hypoxemia gt2_Hypoxia gt2_Impetigo gt2_Implantation bleeding gt2_Impulsivity gt2_Indigestion gt2_Infection gt2_Inflammation gt2_Inflammatory bowel disease gt2_Ingrown hair gt2_Insomnia gt2_Insulin resistance gt2_Intermenstrual bleeding gt2_Intracranial pressure gt2_Iron deficiency gt2_Irregular menstruation gt2_Itch gt2_Jaundice gt2_Kidney failure gt2_Kidney stone gt2_Knee Pain gt2_Kyphosis gt2_Lactose intolerance gt2_Laryngitis gt2_Leg cramps gt2_Lesion gt2_Leukorrhea gt2_Lightheadedness gt2_Low back pain gt2_Low-grade fever gt2_Lymphedema gt2_Major depressive disorder gt2_Malabsorption gt2_Male infertility gt2_Manic Disorder gt2_Melasma gt2_Melena gt2_Meningitis gt2_Menorrhagia gt2_Middle back pain gt2_Migraine gt2_Milium gt2_Mitral insufficiency gt2_Mood disorder gt2_Mood swing gt2_Morning sickness gt2_Motion sickness gt2_Mouth ulcer gt2_Muscle atrophy gt2_Muscle weakness gt2_Myalgia gt2_Mydriasis gt2_Myocardial infarction gt2_Myoclonus gt2_Nasal congestion gt2_Nasal polyp gt2_Nausea gt2_Neck mass gt2_Neck pain gt2_Neonatal jaundice gt2_Nerve injury gt2_Neuralgia gt2_Neutropenia gt2_Night sweats gt2_Night terror gt2_Nocturnal enuresis gt2_Nodule gt2_Nosebleed gt2_Nystagmus gt2_Obesity gt2_Onychorrhexis gt2_Oral candidiasis gt2_Orthostatic hypotension gt2_Osteopenia gt2_Osteophyte gt2_Osteoporosis gt2_Otitis gt2_Otitis externa gt2_Otitis media gt2_Pain gt2_Palpitations gt2_Panic attack gt2_Papule gt2_Paranoia gt2_Paresthesia gt2_Pelvic inflammatory disease gt2_Pericarditis gt2_Periodontal disease gt2_Periorbital puffiness gt2_Peripheral neuropathy gt2_Perspiration gt2_Petechia gt2_Phlegm gt2_Photodermatitis gt2_Photophobia gt2_Photopsia gt2_Pleural effusion gt2_Pleurisy gt2_Pneumonia gt2_Podalgia gt2_Polycythemia gt2_Polydipsia gt2_Polyneuropathy gt2_Polyuria gt2_Poor posture gt2_Post-nasal drip gt2_Postural orthostatic tachycardia syndrome gt2_Prediabetes gt2_Proteinuria gt2_Pruritus ani gt2_Psychosis gt2_Ptosis gt2_Pulmonary edema gt2_Pulmonary hypertension gt2_Purpura gt2_Pus gt2_Pyelonephritis gt2_Radiculopathy gt2_Rectal pain gt2_Rectal prolapse gt2_Red eye gt2_Renal colic gt2_Restless legs syndrome gt2_Rheum gt2_Rhinitis gt2_Rhinorrhea gt2_Rosacea gt2_Round ligament pain gt2_Rumination gt2_Scar gt2_Sciatica gt2_Scoliosis gt2_Seborrheic dermatitis gt2_Self-harm gt2_Sensitivity to sound gt2_Sexual dysfunction gt2_Shallow breathing gt2_Sharp pain gt2_Shivering gt2_Shortness of breath gt2_Shyness gt2_Sinusitis gt2_Skin condition gt2_Skin rash gt2_Skin tag gt2_Skin ulcer gt2_Sleep apnea gt2_Sleep deprivation gt2_Sleep disorder gt2_Snoring gt2_Sore throat gt2_Spasticity gt2_Splenomegaly gt2_Sputum gt2_Stomach rumble gt2_Strabismus gt2_Stretch marks gt2_Stridor gt2_Stroke gt2_Stuttering gt2_Subdural hematoma gt2_Suicidal ideation gt2_Swelling gt2_Swollen feet gt2_Swollen lymph nodes gt2_Syncope gt2_Tachycardia gt2_Tachypnea gt2_Telangiectasia gt2_Tenderness gt2_Testicular pain gt2_Throat irritation gt2_Thrombocytopenia gt2_Thyroid nodule gt2_Tic gt2_Tinnitus gt2_Tonsillitis gt2_Toothache gt2_Tremor gt2_Trichoptilosis gt2_Tumor gt2_Type 2 diabetes gt2_Unconsciousness gt2_Underweight gt2_Upper respiratory tract infection gt2_Urethritis gt2_Urinary incontinence gt2_Urinary tract infection gt2_Urinary urgency gt2_Uterine contraction gt2_Vaginal bleeding gt2_Vaginal discharge gt2_Vaginitis gt2_Varicose veins gt2_Vasculitis gt2_Ventricular fibrillation gt2_Ventricular tachycardia gt2_Vertigo gt2_Viral pneumonia gt2_Visual acuity gt2_Vomiting gt2_Wart gt2_Water retention gt2_Weakness gt2_Weight gain gt2_Wheeze gt2_Xeroderma gt2_Xerostomia gt2_Yawn gt2_hyperhidrosis gt2_pancreatitis neighbor_Alabama neighbor_Alaska neighbor_Arizona neighbor_Arkansas neighbor_California neighbor_Colorado neighbor_Connecticut neighbor_Delaware neighbor_Florida neighbor_Georgia neighbor_Hawaii neighbor_Idaho neighbor_Illinois neighbor_Indiana neighbor_Iowa neighbor_Kansas neighbor_Kentucky neighbor_Louisiana neighbor_Maine neighbor_Maryland neighbor_Massachusetts neighbor_Michigan neighbor_Minnesota neighbor_Missouri neighbor_Montana neighbor_Nebraska neighbor_Nevada neighbor_New Hampshire neighbor_New Jersey neighbor_New Mexico neighbor_New York neighbor_North Carolina neighbor_North Dakota neighbor_Ohio neighbor_Oklahoma neighbor_Oregon neighbor_Pennsylvania neighbor_Rhode Island neighbor_South Carolina neighbor_South Dakota neighbor_Tennessee neighbor_Texas neighbor_Utah neighbor_Vermont neighbor_Virginia neighbor_Washington neighbor_West Virginia neighbor_Wisconsin neighbor_Wyoming State_Name
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
In [ ]:
# Join all files
df_all=list_files_clean[0]
for i in range(1,len(list_files_clean)):
    df_all=pd.concat([df_all, list_files_clean[i]], axis=0)
df_all=df_all.fillna(0) 

# Convert date column from object to date format
df_all['date'] = pd.to_datetime(df_all['date'], dayfirst=True)

# Create one hot encoder for state    
df_f=pd.get_dummies(df_all, columns=["State_Name"]) 

# Print merged file
print(f'\nShape of merged file with 50 States: {df_f.shape}')
print('\nContent of merged file with 50 States:')
display(df_f.head())
print('\nDescription of merged file with 50 States:')
display(df_f.describe())
print('\nData type of merged file with 50 States:')
display(np.transpose(pd.DataFrame(df_f.dtypes)))
print("\nNumber of NAs per column of merged file with 50 States after cleaning:")
display(np.transpose(pd.DataFrame(df_f.isnull().sum())))
Shape of merged file with 50 States: (37201, 550)

Content of merged file with 50 States:
date JHU_cases JHU_deaths JHU_hospitalizations up2date gt_after covid vaccine gt_side effects of vaccine gt_effects of covid vaccine gt_covid gt_how long does covid last gt_anosmia gt_loss smell gt_covid-19 gt_loss taste gt_loss of smell gt_chest pain gt_covid symptoms gt_sars-cov 2 gt_chest tightness gt_covid nhs gt_quarantine gt_covid-19 who gt_sars-cov-2 gt_feeling exhausted gt_nose bleed gt_feeling tired gt_joints aching gt_fever gt2_Abdominal obesity gt2_Abdominal pain gt2_Acne gt2_Actinic keratosis gt2_Acute bronchitis gt2_Adrenal crisis gt2_Ageusia gt2_Alcoholism gt2_Allergic conjunctivitis gt2_Allergy gt2_Amblyopia gt2_Amenorrhea gt2_Amnesia gt2_Anal fissure gt2_Anaphylaxis gt2_Anemia gt2_Angina pectoris gt2_Angioedema gt2_Angular cheilitis gt2_Anosmia gt2_Anxiety gt2_Aphasia gt2_Aphonia gt2_Apnea gt2_Arthralgia gt2_Arthritis gt2_Ascites gt2_Asperger syndrome gt2_Asphyxia gt2_Asthma gt2_Astigmatism gt2_Ataxia gt2_Atheroma gt2_Attention deficit hyperactivity disorder gt2_Auditory hallucination gt2_Autoimmune disease gt2_Avoidant personality disorder gt2_Back pain gt2_Bacterial vaginosis gt2_Balance disorder gt2_Beau's lines gt2_Bell's palsy gt2_Biliary colic gt2_Binge eating gt2_Bleeding gt2_Bleeding on probing gt2_Blepharospasm gt2_Bloating gt2_Blood in stool gt2_Blurred vision gt2_Blushing gt2_Boil gt2_Bone fracture gt2_Bone tumor gt2_Bowel obstruction gt2_Bradycardia gt2_Braxton Hicks contractions gt2_Breakthrough bleeding gt2_Breast pain gt2_Bronchitis gt2_Bruise gt2_Bruxism gt2_Bunion gt2_Burn gt2_Burning Chest Pain gt2_Burning mouth syndrome gt2_Candidiasis gt2_Canker sore gt2_Cardiac arrest gt2_Carpal tunnel syndrome gt2_Cataplexy gt2_Cataract gt2_Chancre gt2_Cheilitis gt2_Chest pain gt2_Chills gt2_Chorea gt2_Chronic pain gt2_Cirrhosis gt2_Cleft lip and cleft palate gt2_Clouding of consciousness gt2_Cluster headache gt2_Colitis gt2_Coma gt2_Common cold gt2_Compulsive behavior gt2_Compulsive hoarding gt2_Confusion gt2_Congenital heart defect gt2_Conjunctivitis gt2_Constipation gt2_Convulsion gt2_Cough gt2_Crackles gt2_Cramp gt2_Crepitus gt2_Croup gt2_Cyanosis gt2_Dandruff gt2_Delayed onset muscle soreness gt2_Dementia gt2_Dentin hypersensitivity gt2_Depersonalization gt2_Depression gt2_Dermatitis gt2_Desquamation gt2_Developmental disability gt2_Diabetes gt2_Diabetic ketoacidosis gt2_Diarrhea gt2_Dizziness gt2_Dry eye syndrome gt2_Dysautonomia gt2_Dysgeusia gt2_Dysmenorrhea gt2_Dyspareunia gt2_Dysphagia gt2_Dysphoria gt2_Dystonia gt2_Dysuria gt2_Ear pain gt2_Eczema gt2_Edema gt2_Encephalitis gt2_Encephalopathy gt2_Epidermoid cyst gt2_Epilepsy gt2_Epiphora gt2_Erectile dysfunction gt2_Erythema gt2_Erythema chronicum migrans gt2_Esophagitis gt2_Excessive daytime sleepiness gt2_Eye pain gt2_Eye strain gt2_Facial nerve paralysis gt2_Facial swelling gt2_Fasciculation gt2_Fatigue gt2_Fatty liver disease gt2_Fecal incontinence gt2_Fever gt2_Fibrillation gt2_Fibrocystic breast changes gt2_Fibromyalgia gt2_Flatulence gt2_Floater gt2_Focal seizure gt2_Folate deficiency gt2_Food craving gt2_Food intolerance gt2_Frequent urination gt2_Gastroesophageal reflux disease gt2_Gastroparesis gt2_Generalized anxiety disorder gt2_Generalized tonic–clonic seizure gt2_Genital wart gt2_Gingival recession gt2_Gingivitis gt2_Globus pharyngis gt2_Goitre gt2_Gout gt2_Grandiosity gt2_Granuloma gt2_Guilt gt2_Hair loss gt2_Halitosis gt2_Hay fever gt2_Headache gt2_Heart arrhythmia gt2_Heart murmur gt2_Heartburn gt2_Hematochezia gt2_Hematoma gt2_Hematuria gt2_Hemolysis gt2_Hemoptysis gt2_Hemorrhoids gt2_Hepatic encephalopathy gt2_Hepatitis gt2_Hepatotoxicity gt2_Hiccup gt2_Hip pain gt2_Hives gt2_Hot flash gt2_Hydrocephalus gt2_Hypercalcaemia gt2_Hypercapnia gt2_Hypercholesterolemia gt2_Hyperemesis gravidarum gt2_Hyperglycemia gt2_Hyperkalemia gt2_Hyperlipidemia gt2_Hypermobility gt2_Hyperpigmentation gt2_Hypersomnia gt2_Hypertension gt2_Hyperthermia gt2_Hyperthyroidism gt2_Hypertriglyceridemia gt2_Hypertrophy gt2_Hyperventilation gt2_Hypocalcaemia gt2_Hypochondriasis gt2_Hypoglycemia gt2_Hypogonadism gt2_Hypokalemia gt2_Hypomania gt2_Hyponatremia gt2_Hypotension gt2_Hypothyroidism gt2_Hypoxemia gt2_Hypoxia gt2_Impetigo gt2_Implantation bleeding gt2_Impulsivity gt2_Indigestion gt2_Infection gt2_Inflammation gt2_Inflammatory bowel disease gt2_Ingrown hair gt2_Insomnia gt2_Insulin resistance gt2_Intermenstrual bleeding gt2_Intracranial pressure gt2_Iron deficiency gt2_Irregular menstruation gt2_Itch gt2_Jaundice gt2_Kidney failure gt2_Kidney stone gt2_Knee Pain gt2_Kyphosis gt2_Lactose intolerance gt2_Laryngitis gt2_Leg cramps gt2_Lesion gt2_Leukorrhea gt2_Lightheadedness gt2_Low back pain gt2_Low-grade fever gt2_Lymphedema gt2_Major depressive disorder gt2_Malabsorption gt2_Male infertility gt2_Manic Disorder gt2_Melasma gt2_Melena gt2_Meningitis gt2_Menorrhagia gt2_Middle back pain gt2_Migraine gt2_Milium gt2_Mitral insufficiency gt2_Mood disorder gt2_Mood swing gt2_Morning sickness gt2_Motion sickness gt2_Mouth ulcer gt2_Muscle atrophy gt2_Muscle weakness gt2_Myalgia gt2_Mydriasis gt2_Myocardial infarction gt2_Myoclonus gt2_Nasal congestion gt2_Nasal polyp gt2_Nausea gt2_Neck mass gt2_Neck pain gt2_Neonatal jaundice gt2_Nerve injury gt2_Neuralgia gt2_Neutropenia gt2_Night sweats gt2_Night terror gt2_Nocturnal enuresis gt2_Nodule gt2_Nosebleed gt2_Nystagmus gt2_Obesity gt2_Onychorrhexis gt2_Oral candidiasis gt2_Orthostatic hypotension gt2_Osteopenia gt2_Osteophyte gt2_Osteoporosis gt2_Otitis gt2_Otitis externa gt2_Otitis media gt2_Pain gt2_Palpitations gt2_Panic attack gt2_Papule gt2_Paranoia gt2_Paresthesia gt2_Pelvic inflammatory disease gt2_Pericarditis gt2_Periodontal disease gt2_Periorbital puffiness gt2_Peripheral neuropathy gt2_Perspiration gt2_Petechia gt2_Phlegm gt2_Photodermatitis gt2_Photophobia gt2_Photopsia gt2_Pleural effusion gt2_Pleurisy gt2_Pneumonia gt2_Podalgia gt2_Polycythemia gt2_Polydipsia gt2_Polyneuropathy gt2_Polyuria gt2_Poor posture gt2_Post-nasal drip gt2_Postural orthostatic tachycardia syndrome gt2_Prediabetes gt2_Proteinuria gt2_Pruritus ani gt2_Psychosis gt2_Ptosis gt2_Pulmonary edema gt2_Pulmonary hypertension gt2_Purpura gt2_Pus gt2_Pyelonephritis gt2_Radiculopathy gt2_Rectal pain gt2_Rectal prolapse gt2_Red eye gt2_Renal colic gt2_Restless legs syndrome gt2_Rheum gt2_Rhinitis gt2_Rhinorrhea gt2_Rosacea gt2_Round ligament pain gt2_Rumination gt2_Scar gt2_Sciatica gt2_Scoliosis gt2_Seborrheic dermatitis gt2_Self-harm gt2_Sensitivity to sound gt2_Sexual dysfunction gt2_Shallow breathing gt2_Sharp pain gt2_Shivering gt2_Shortness of breath gt2_Shyness gt2_Sinusitis gt2_Skin condition gt2_Skin rash gt2_Skin tag gt2_Skin ulcer gt2_Sleep apnea gt2_Sleep deprivation gt2_Sleep disorder gt2_Snoring gt2_Sore throat gt2_Spasticity gt2_Splenomegaly gt2_Sputum gt2_Stomach rumble gt2_Strabismus gt2_Stretch marks gt2_Stridor gt2_Stroke gt2_Stuttering gt2_Subdural hematoma gt2_Suicidal ideation gt2_Swelling gt2_Swollen feet gt2_Swollen lymph nodes gt2_Syncope gt2_Tachycardia gt2_Tachypnea gt2_Telangiectasia gt2_Tenderness gt2_Testicular pain gt2_Throat irritation gt2_Thrombocytopenia gt2_Thyroid nodule gt2_Tic gt2_Tinnitus gt2_Tonsillitis gt2_Toothache gt2_Tremor gt2_Trichoptilosis gt2_Tumor gt2_Type 2 diabetes gt2_Unconsciousness gt2_Underweight gt2_Upper respiratory tract infection gt2_Urethritis gt2_Urinary incontinence gt2_Urinary tract infection gt2_Urinary urgency gt2_Uterine contraction gt2_Vaginal bleeding gt2_Vaginal discharge gt2_Vaginitis gt2_Varicose veins gt2_Vasculitis gt2_Ventricular fibrillation gt2_Ventricular tachycardia gt2_Vertigo gt2_Viral pneumonia gt2_Visual acuity gt2_Vomiting gt2_Wart gt2_Water retention gt2_Weakness gt2_Weight gain gt2_Wheeze gt2_Xeroderma gt2_Xerostomia gt2_Yawn gt2_hyperhidrosis gt2_pancreatitis neighbor_Alabama neighbor_Alaska neighbor_Arizona neighbor_Arkansas neighbor_California neighbor_Colorado neighbor_Connecticut neighbor_Delaware neighbor_Florida neighbor_Georgia neighbor_Hawaii neighbor_Idaho neighbor_Illinois neighbor_Indiana neighbor_Iowa neighbor_Kansas neighbor_Kentucky neighbor_Louisiana neighbor_Maine neighbor_Maryland neighbor_Massachusetts neighbor_Michigan neighbor_Minnesota neighbor_Missouri neighbor_Montana neighbor_Nebraska neighbor_Nevada neighbor_New Hampshire neighbor_New Jersey neighbor_New Mexico neighbor_New York neighbor_North Carolina neighbor_North Dakota neighbor_Ohio neighbor_Oklahoma neighbor_Oregon neighbor_Pennsylvania neighbor_Rhode Island neighbor_South Carolina neighbor_South Dakota neighbor_Tennessee neighbor_Texas neighbor_Utah neighbor_Vermont neighbor_Virginia neighbor_Washington neighbor_West Virginia neighbor_Wisconsin neighbor_Wyoming neighbor_Mississippi State_Name_Alabama State_Name_Alaska State_Name_Arizona State_Name_Arkansas State_Name_California State_Name_Colorado State_Name_Connecticut State_Name_Delaware State_Name_Florida State_Name_Georgia State_Name_Hawaii State_Name_Idaho State_Name_Illinois State_Name_Indiana State_Name_Iowa State_Name_Kansas State_Name_Kentucky State_Name_Louisiana State_Name_Maine State_Name_Maryland State_Name_Massachusetts State_Name_Michigan State_Name_Minnesota State_Name_Mississippi State_Name_Missouri State_Name_Montana State_Name_Nebraska State_Name_Nevada State_Name_New Hampshire State_Name_New Jersey State_Name_New Mexico State_Name_New York State_Name_North Carolina State_Name_North Dakota State_Name_Ohio State_Name_Oklahoma State_Name_Oregon State_Name_Pennsylvania State_Name_Rhode Island State_Name_South Carolina State_Name_South Dakota State_Name_Tennessee State_Name_Texas State_Name_Utah State_Name_Vermont State_Name_Virginia State_Name_Washington State_Name_West Virginia State_Name_Wisconsin State_Name_Wyoming
0 2020-01-01 0.0 0.0 0.0 11.380448 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 4486.684450 2.77 4.82 7.99 0.23 0.78 0.120000 0.12 4.56 0.11 9.14 0.130 0.200000 0.41 0.26 0.28 1.59 0.81 0.14 0.17 0.11 6.22 0.180 0.1200 0.44 0.39 4.85 0.210 0.72 0.18 1.59 0.19 0.19 0.1400 2.11 0.15 1.29 0.140 5.30 1.88 0.12 0.17 0.17 0.19 0.22 4.56 0.13 0.41 2.17 0.32 0.34 0.22 1.10 2.43 0.180 0.20 0.54 0.55 0.91 0.56 1.22 1.65 0.190 0.45 1.93 0.11 0.190000 3.18 0.99 0.47 0.84 0.120000 0.45 0.27 0.32 1.56 0.45 0.160 0.79 0.50 0.15 0.120 0.20 0.36 0.27 15.29 0.49 0.190 0.19 0.160000 0.92 3.63 0.25 9.27 0.120 3.61 0.11 0.20 0.150000 0.95 0.41 1.85 0.26 0.19 3.95 2.14 0.43 0.18 6.78 0.16 3.79 1.70 0.44 0.12 0.170000 1.28 0.16 0.14 0.15 0.160 0.20 0.93 0.14 0.68 0.22 0.19 0.17 1.95 0.11 3.03 0.43 0.15 0.12 0.30 0.14 0.21 0.26 0.14 0.81 4.59 0.30 0.13 4.03 0.37 0.150000 0.47 2.78 0.21 0.17 0.16 0.61 0.43 0.54 3.82 0.16 0.83 0.23 1.07 0.12 0.38 0.18 0.130 1.20 0.130 0.200 0.30 4.89 0.49 1.38 5.13 1.49 0.18 2.96 0.20 0.28 0.36 0.11 0.160 1.61 0.160 0.70 0.1700 0.52 0.84 1.08 0.53 0.150 0.13 0.11 0.90 0.140000 0.90 0.50 0.190 0.17 0.83 0.25 6.24 0.230000 0.71 0.14 0.24 0.13 0.130 0.12 0.61 0.69 0.63 0.18 0.16 1.04 1.23 0.19 0.130 0.19 1.43 0.28 1.69 21.59 5.24 0.82 0.81 2.51 0.33 0.23 0.20 1.32 0.50 7.22 0.31 0.99 1.28 1.38 0.11 0.54 0.270 0.43 0.64 0.30 0.26 2.82 0.15 0.15 3.17 0.15 0.16 0.47 0.17 0.25 0.47 0.81 0.46 4.39 0.26 0.13 0.81 0.25 1.47 0.29 0.87 0.17 0.14 0.44 0.20 2.01 0.140 2.88 0.13 2.53 0.13 1.65 0.16 0.23 0.18 0.13 0.35 0.11 0.2200 0.26 0.72 0.13 5.15 0.200 0.48 0.23 0.240000 0.32 0.45 1.38 0.48 1.03 36.13 0.53 1.25 0.150 0.17 2.05 0.69 0.140 0.56 0.80 0.79 1.40 0.11 0.84 0.14 0.15 0.160 0.21 0.28 2.11 0.85 0.20 0.11 0.14 0.13 0.55 0.18 0.20 0.43 0.25 0.30 1.24 0.140000 0.130 0.19 0.14 0.66 0.44 0.28 0.13 0.17 0.26 0.16 0.60 0.15 0.23 0.46 0.50 0.15 0.15 2.26 1.16 0.26 0.27 0.28 0.140 0.75 0.11 0.91 0.18 0.78 0.30 2.55 3.78 6.34 0.44 1.81 1.07 0.29 2.05 0.97 2.52 0.15 0.12 0.17 0.13 0.17 0.43 0.14 2.65 0.14 0.16 0.40 5.24 0.43 0.72 0.52 0.81 0.120 0.15 0.20 0.32 0.26 0.16 0.14 0.23 1.01 0.67 1.38 0.65 0.16 0.76 3.32 0.22 0.44 0.28 0.25 1.28 4.47 0.160000 0.91 0.79 3.77 1.04 0.33 0.18 0.18 0.17 0.86 0.14 0.56 3.46 1.22 0.20 0.31 3.36 0.39 2.36 0.32 0.160 0.40 0.46 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 2020-01-02 0.0 0.0 0.0 11.380448 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1904.56187 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1906.943228 0.0 0.0 1904.559254 2.71 4.98 8.64 0.26 0.85 0.120000 0.12 3.74 0.11 10.44 0.155 0.200000 0.31 0.26 0.32 1.93 0.99 0.22 0.17 0.11 6.84 0.190 0.1200 0.41 0.31 5.08 0.210 0.76 0.18 2.37 0.26 0.19 0.1800 2.92 0.15 1.38 0.136 5.83 2.16 0.12 0.17 0.38 0.27 0.18 4.46 0.13 0.52 2.26 0.49 0.36 0.16 1.11 2.65 0.180 0.38 0.57 0.39 0.86 0.65 1.29 1.41 0.250 0.41 1.72 0.11 0.176667 3.57 0.91 0.46 0.93 0.120000 0.54 0.31 0.39 1.65 0.39 0.152 0.99 0.64 0.17 0.120 0.22 0.52 0.27 15.25 0.58 0.190 0.18 0.220000 0.97 3.56 0.26 9.74 0.120 3.43 0.11 0.20 0.150000 1.05 0.46 1.85 0.19 0.19 4.20 2.32 0.36 0.18 8.24 0.23 3.82 1.66 0.51 0.12 0.170000 1.48 0.23 0.35 0.14 0.120 0.24 0.84 0.27 0.92 0.22 0.24 0.31 2.43 0.11 3.52 0.39 0.15 0.12 0.28 0.17 0.23 0.26 0.14 0.82 4.88 0.46 0.13 4.05 0.48 0.130000 0.62 2.41 0.20 0.17 0.22 0.60 0.58 0.75 4.00 0.18 1.03 0.13 1.10 0.12 0.42 0.18 0.130 1.19 0.130 0.188 0.28 4.76 0.60 1.59 4.68 1.95 0.18 3.29 0.21 0.29 0.53 0.11 0.150 1.54 0.160 0.95 0.1700 0.46 0.93 1.06 0.83 0.150 0.13 0.11 1.47 0.140000 1.04 0.64 0.190 0.17 0.80 0.25 8.67 0.230000 1.03 0.25 0.24 0.13 0.130 0.12 0.64 0.76 0.64 0.18 0.26 1.25 1.64 0.19 0.130 0.23 1.49 0.41 1.63 25.42 5.73 0.94 0.81 2.95 0.35 0.29 0.23 1.42 0.55 7.43 0.46 1.32 1.46 1.40 0.11 0.66 0.290 0.37 0.84 0.20 0.52 2.91 0.16 0.24 3.99 0.15 0.15 0.52 0.17 0.32 0.71 0.93 0.48 4.72 0.19 0.13 0.99 0.17 1.28 0.32 0.74 0.17 0.27 0.58 0.23 2.38 0.140 2.70 0.13 2.44 0.13 1.55 0.20 0.28 0.20 0.13 0.35 0.11 0.2200 0.29 0.70 0.13 6.06 0.200 0.65 0.29 0.240000 0.21 0.76 1.50 0.56 1.08 37.69 0.47 1.32 0.120 0.14 1.99 0.83 0.140 0.75 0.66 0.86 1.68 0.11 0.80 0.14 0.15 0.130 0.33 0.19 2.07 0.99 0.20 0.11 0.14 0.13 0.60 0.24 0.20 0.56 0.23 0.34 1.59 0.140000 0.130 0.37 0.17 0.71 0.59 0.37 0.13 0.17 0.31 0.16 0.63 0.12 0.23 0.51 0.37 0.19 0.15 2.79 1.17 0.30 0.29 0.21 0.138 0.77 0.11 0.88 0.16 0.82 0.21 2.67 4.38 6.49 0.41 1.87 1.32 0.36 2.42 0.91 2.69 0.15 0.12 0.22 0.13 0.26 0.45 0.14 2.66 0.14 0.16 0.36 5.24 0.53 1.05 0.65 0.86 0.120 0.17 0.19 0.32 0.33 0.37 0.14 0.20 0.90 0.73 1.18 0.60 0.22 1.07 4.47 0.22 0.58 0.32 0.35 1.41 5.18 0.160000 0.84 0.75 3.93 1.15 0.33 0.18 0.18 0.17 1.07 0.19 0.45 3.10 1.36 0.21 0.49 3.56 0.42 2.41 0.37 0.160 0.45 0.33 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 2020-01-03 0.0 0.0 0.0 11.380448 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 3842.010974 2.44 4.85 7.91 0.35 0.80 0.120000 0.12 3.50 0.11 9.60 0.180 0.170000 0.27 0.26 0.39 2.00 0.89 0.23 0.17 0.11 6.74 0.150 0.1200 0.46 0.34 4.60 0.210 0.84 0.12 2.12 0.25 0.19 0.1300 2.84 0.15 1.25 0.132 5.34 2.07 0.12 0.17 0.18 0.25 0.17 4.17 0.14 0.49 2.15 0.45 0.42 0.16 1.06 2.79 0.170 0.17 0.41 0.53 0.85 0.61 1.13 1.47 0.190 0.43 1.67 0.11 0.163333 3.53 0.91 0.32 0.73 0.126667 0.54 0.23 0.31 1.47 0.35 0.144 0.85 0.66 0.16 0.125 0.23 0.44 0.28 14.37 0.53 0.140 0.21 0.200000 1.00 3.40 0.32 9.18 0.120 3.40 0.11 0.19 0.150000 1.19 0.45 1.74 0.25 0.19 3.71 2.27 0.36 0.18 8.24 0.21 3.67 1.51 0.59 0.12 0.161667 1.49 0.18 0.28 0.19 0.135 0.23 0.85 0.16 0.83 0.20 0.17 0.28 2.31 0.11 3.44 0.45 0.15 0.13 0.22 0.22 0.23 0.18 0.19 0.74 4.35 0.46 0.19 3.87 0.41 0.150000 0.48 2.92 0.20 0.17 0.21 0.57 0.50 0.59 4.07 0.21 1.01 0.24 1.10 0.14 0.31 0.18 0.132 1.05 0.140 0.176 0.15 4.48 0.68 1.57 4.12 1.85 0.17 3.10 0.26 0.37 0.39 0.11 0.170 1.52 0.164 0.87 0.1200 0.47 0.84 1.13 0.73 0.120 0.13 0.11 1.42 0.140000 1.14 0.56 0.200 0.12 0.79 0.25 7.81 0.160000 0.90 0.23 0.16 0.13 0.130 0.12 0.68 0.69 0.72 0.18 0.25 1.20 1.54 0.17 0.170 0.19 1.54 0.29 1.69 24.01 5.54 0.90 0.75 2.97 0.31 0.28 0.23 1.35 0.53 7.07 0.42 1.29 1.48 1.27 0.11 0.62 0.120 0.42 0.95 0.24 0.46 2.89 0.16 0.22 3.70 0.24 0.18 0.61 0.15 0.26 0.56 0.69 0.43 4.26 0.24 0.13 0.93 0.21 1.36 0.36 0.63 0.17 0.30 0.51 0.21 2.27 0.135 2.63 0.13 2.34 0.18 1.47 0.18 0.27 0.20 0.13 0.40 0.11 0.1900 0.38 0.57 0.13 5.43 0.200 0.53 0.31 0.140000 0.25 0.73 1.41 0.46 0.99 35.84 0.54 1.30 0.135 0.13 1.84 0.82 0.135 0.58 0.72 0.80 1.81 0.11 0.83 0.14 0.15 0.150 0.21 0.23 2.12 0.77 0.15 0.11 0.18 0.12 0.55 0.24 0.22 0.56 0.15 0.30 1.59 0.141429 0.134 0.31 0.25 0.68 0.60 0.37 0.13 0.12 0.34 0.21 0.60 0.14 0.22 0.45 0.36 0.16 0.15 2.41 1.18 0.40 0.36 0.20 0.136 0.66 0.11 0.95 0.16 0.72 0.21 2.45 4.14 6.04 0.47 1.78 1.28 0.23 2.30 0.90 2.42 0.15 0.15 0.25 0.13 0.31 0.47 0.14 2.50 0.17 0.16 0.29 5.12 0.53 1.05 0.47 0.85 0.120 0.17 0.21 0.33 0.24 0.36 0.13 0.24 0.90 0.74 1.31 0.58 0.12 0.86 4.48 0.15 0.54 0.34 0.32 1.39 5.21 0.160000 0.94 0.69 3.63 1.06 0.36 0.18 0.18 0.23 0.89 0.27 0.35 3.04 1.36 0.31 0.43 3.06 0.35 2.01 0.29 0.150 0.41 0.24 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 2020-01-04 0.0 0.0 0.0 11.380448 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 7949.995496 2.71 4.80 8.31 0.22 0.67 0.133333 0.12 3.51 0.11 9.22 0.170 0.173333 0.34 0.28 0.26 1.62 0.84 0.20 0.17 0.11 6.34 0.190 0.1225 0.39 0.32 4.81 0.185 0.85 0.14 1.72 0.21 0.17 0.1275 2.30 0.15 1.23 0.128 4.90 2.01 0.13 0.17 0.20 0.23 0.17 3.82 0.15 0.46 2.18 0.48 0.30 0.18 1.13 2.33 0.255 0.19 0.46 0.56 0.85 0.52 1.06 1.31 0.260 0.43 1.64 0.11 0.150000 3.36 1.11 0.38 0.82 0.133333 0.43 0.22 0.32 1.55 0.49 0.136 0.72 0.67 0.15 0.130 0.22 0.49 0.20 14.13 0.44 0.138 0.24 0.176667 0.95 3.80 0.21 8.22 0.119 3.71 0.11 0.19 0.143333 1.31 0.40 1.81 0.20 0.19 3.59 2.46 0.45 0.18 7.01 0.21 3.85 1.47 0.60 0.12 0.153333 1.46 0.13 0.33 0.18 0.150 0.14 0.90 0.16 0.65 0.18 0.15 0.22 1.91 0.11 3.29 0.44 0.15 0.14 0.29 0.21 0.21 0.14 0.13 0.77 4.39 0.42 0.14 3.66 0.38 0.146667 0.57 2.53 0.17 0.17 0.20 0.51 0.40 0.55 3.99 0.24 0.83 0.13 1.02 0.14 0.36 0.18 0.134 1.11 0.135 0.164 0.39 5.18 0.62 1.42 4.18 1.61 0.16 3.21 0.25 0.23 0.36 0.11 0.175 1.59 0.168 0.65 0.1175 0.68 0.76 1.01 0.69 0.170 0.13 0.11 1.02 0.133333 0.92 0.50 0.175 0.15 0.81 0.25 6.52 0.176667 0.77 0.19 0.15 0.13 0.138 0.12 0.60 0.76 0.63 0.18 0.16 1.08 1.37 0.13 0.140 0.21 1.31 0.26 1.53 22.57 5.27 0.55 0.85 2.86 0.30 0.23 0.22 1.25 0.43 7.38 0.35 1.05 1.38 1.28 0.11 0.61 0.125 0.56 0.76 0.28 0.57 2.72 0.14 0.18 3.12 0.22 0.18 0.51 0.17 0.26 0.60 0.69 0.38 4.07 0.23 0.13 0.67 0.20 1.11 0.18 0.80 0.17 0.22 0.46 0.19 1.92 0.130 2.60 0.13 2.46 0.18 1.50 0.16 0.22 0.22 0.13 0.34 0.11 0.1600 0.25 0.49 0.13 4.74 0.185 0.50 0.22 0.156667 0.25 0.53 1.48 0.46 1.01 35.48 0.50 1.37 0.150 0.14 1.81 0.62 0.130 0.54 0.80 0.74 1.72 0.11 0.81 0.14 0.15 0.140 0.21 0.23 1.81 0.93 0.18 0.11 0.18 0.13 0.61 0.18 0.15 0.46 0.23 0.30 1.44 0.142857 0.138 0.23 0.18 0.66 0.55 0.32 0.13 0.22 0.27 0.14 0.67 0.14 0.25 0.43 0.30 0.17 0.14 2.48 1.05 0.29 0.40 0.21 0.134 0.72 0.11 0.97 0.18 0.69 0.25 2.38 3.84 6.17 0.45 1.64 1.00 0.38 2.49 0.85 2.16 0.15 0.16 0.23 0.14 0.25 0.62 0.14 2.08 0.16 0.16 0.35 4.96 0.50 0.85 0.49 0.70 0.125 0.17 0.21 0.29 0.33 0.29 0.14 0.21 0.92 0.73 1.19 0.63 0.20 0.79 3.68 0.23 0.53 0.30 0.28 1.12 4.67 0.153333 1.00 0.78 3.71 1.05 0.33 0.18 0.18 0.13 0.99 0.23 0.28 3.03 1.19 0.22 0.43 3.12 0.33 2.53 0.37 0.230 0.44 0.29 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 2020-01-05 0.0 0.0 0.0 11.380448 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 6054.562803 2.90 4.99 8.58 0.23 0.65 0.146667 0.12 3.30 0.11 8.81 0.160 0.176667 0.38 0.29 0.38 1.57 0.88 0.19 0.16 0.11 6.72 0.195 0.1250 0.36 0.33 4.77 0.160 0.80 0.16 1.55 0.24 0.15 0.1250 2.48 0.15 1.21 0.124 5.11 1.80 0.14 0.17 0.33 0.18 0.17 3.99 0.23 0.62 2.40 0.41 0.29 0.25 1.24 3.42 0.340 0.38 0.46 0.64 0.93 0.69 0.93 1.60 0.245 0.35 1.79 0.11 0.250000 3.49 0.97 0.46 0.68 0.140000 0.40 0.22 0.28 1.55 0.37 0.128 0.73 0.55 0.14 0.135 0.25 0.41 0.21 13.39 0.42 0.136 0.24 0.153333 0.97 3.83 0.24 8.17 0.118 3.71 0.11 0.18 0.136667 1.40 0.40 1.77 0.29 0.19 4.02 2.45 0.41 0.18 7.33 0.21 3.50 1.63 0.47 0.12 0.145000 1.41 0.13 0.23 0.17 0.165 0.27 0.72 0.19 0.75 0.18 0.23 0.20 2.10 0.11 2.78 0.45 0.15 0.15 0.30 0.14 0.24 0.20 0.13 0.80 4.74 0.42 0.14 3.71 0.41 0.143333 0.56 2.56 0.19 0.17 0.16 0.65 0.55 0.63 3.83 0.31 1.04 0.16 1.16 0.14 0.38 0.18 0.136 1.01 0.130 0.152 0.31 4.90 0.61 1.46 4.65 1.56 0.15 3.04 0.20 0.23 0.39 0.11 0.180 1.41 0.172 0.60 0.1150 0.52 0.74 1.08 0.66 0.165 0.13 0.11 0.99 0.126667 1.00 0.43 0.150 0.14 0.95 0.25 6.84 0.193333 0.76 0.15 0.19 0.13 0.146 0.12 0.62 0.75 0.54 0.18 0.18 1.13 1.41 0.17 0.155 0.18 1.53 0.23 1.49 22.45 5.30 0.79 0.85 3.06 0.34 0.40 0.23 1.27 0.50 7.56 0.34 1.04 1.32 1.40 0.11 0.69 0.130 0.42 0.72 0.29 0.39 2.72 0.17 0.18 3.05 0.20 0.18 0.48 0.15 0.16 0.42 0.86 0.31 4.25 0.24 0.13 0.94 0.20 1.21 0.24 0.82 0.17 0.23 0.50 0.13 2.04 0.125 2.44 0.13 2.35 0.18 1.55 0.13 0.28 0.24 0.13 0.35 0.11 0.1525 0.18 0.49 0.13 4.91 0.170 0.51 0.29 0.173333 0.21 0.52 1.39 0.34 0.82 36.06 0.42 1.36 0.130 0.17 1.90 0.65 0.125 0.71 0.83 0.68 1.82 0.11 0.80 0.14 0.14 0.165 0.18 0.35 1.62 0.92 0.15 0.11 0.14 0.18 0.54 0.17 0.14 0.60 0.20 0.18 1.44 0.144286 0.142 0.23 0.20 0.71 0.61 0.34 0.13 0.14 0.25 0.14 0.60 0.13 0.20 0.51 0.43 0.18 0.13 2.86 1.07 0.25 0.25 0.22 0.132 0.66 0.11 0.98 0.19 0.69 0.28 2.25 3.84 6.10 0.44 1.84 0.98 0.33 2.52 0.99 2.34 0.15 0.14 0.19 0.14 0.26 0.53 0.14 2.15 0.15 0.16 0.31 5.25 0.47 0.92 0.53 0.80 0.130 0.23 0.21 0.36 0.28 0.18 0.15 0.18 0.91 0.68 1.29 0.60 0.16 0.90 3.93 0.23 0.55 0.16 0.23 1.25 4.88 0.146667 0.85 0.74 3.75 1.08 0.36 0.18 0.18 0.17 0.98 0.19 0.29 3.06 1.46 0.27 0.39 3.45 0.25 2.38 0.35 0.205 0.57 0.28 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Description of merged file with 50 States:
JHU_cases JHU_deaths JHU_hospitalizations up2date gt_after covid vaccine gt_side effects of vaccine gt_effects of covid vaccine gt_covid gt_how long does covid last gt_anosmia gt_loss smell gt_covid-19 gt_loss taste gt_loss of smell gt_chest pain gt_covid symptoms gt_sars-cov 2 gt_chest tightness gt_covid nhs gt_quarantine gt_covid-19 who gt_sars-cov-2 gt_feeling exhausted gt_nose bleed gt_feeling tired gt_joints aching gt_fever gt2_Abdominal obesity gt2_Abdominal pain gt2_Acne gt2_Actinic keratosis gt2_Acute bronchitis gt2_Adrenal crisis gt2_Ageusia gt2_Alcoholism gt2_Allergic conjunctivitis gt2_Allergy gt2_Amblyopia gt2_Amenorrhea gt2_Amnesia gt2_Anal fissure gt2_Anaphylaxis gt2_Anemia gt2_Angina pectoris gt2_Angioedema gt2_Angular cheilitis gt2_Anosmia gt2_Anxiety gt2_Aphasia gt2_Aphonia gt2_Apnea gt2_Arthralgia gt2_Arthritis gt2_Ascites gt2_Asperger syndrome gt2_Asphyxia gt2_Asthma gt2_Astigmatism gt2_Ataxia gt2_Atheroma gt2_Attention deficit hyperactivity disorder gt2_Auditory hallucination gt2_Autoimmune disease gt2_Avoidant personality disorder gt2_Back pain gt2_Bacterial vaginosis gt2_Balance disorder gt2_Beau's lines gt2_Bell's palsy gt2_Biliary colic gt2_Binge eating gt2_Bleeding gt2_Bleeding on probing gt2_Blepharospasm gt2_Bloating gt2_Blood in stool gt2_Blurred vision gt2_Blushing gt2_Boil gt2_Bone fracture gt2_Bone tumor gt2_Bowel obstruction gt2_Bradycardia gt2_Braxton Hicks contractions gt2_Breakthrough bleeding gt2_Breast pain gt2_Bronchitis gt2_Bruise gt2_Bruxism gt2_Bunion gt2_Burn gt2_Burning Chest Pain gt2_Burning mouth syndrome gt2_Candidiasis gt2_Canker sore gt2_Cardiac arrest gt2_Carpal tunnel syndrome gt2_Cataplexy gt2_Cataract gt2_Chancre gt2_Cheilitis gt2_Chest pain gt2_Chills gt2_Chorea gt2_Chronic pain gt2_Cirrhosis gt2_Cleft lip and cleft palate gt2_Clouding of consciousness gt2_Cluster headache gt2_Colitis gt2_Coma gt2_Common cold gt2_Compulsive behavior gt2_Compulsive hoarding gt2_Confusion gt2_Congenital heart defect gt2_Conjunctivitis gt2_Constipation gt2_Convulsion gt2_Cough gt2_Crackles gt2_Cramp gt2_Crepitus gt2_Croup gt2_Cyanosis gt2_Dandruff gt2_Delayed onset muscle soreness gt2_Dementia gt2_Dentin hypersensitivity gt2_Depersonalization gt2_Depression gt2_Dermatitis gt2_Desquamation gt2_Developmental disability gt2_Diabetes gt2_Diabetic ketoacidosis gt2_Diarrhea gt2_Dizziness gt2_Dry eye syndrome gt2_Dysautonomia gt2_Dysgeusia gt2_Dysmenorrhea gt2_Dyspareunia gt2_Dysphagia gt2_Dysphoria gt2_Dystonia gt2_Dysuria gt2_Ear pain gt2_Eczema gt2_Edema gt2_Encephalitis gt2_Encephalopathy gt2_Epidermoid cyst gt2_Epilepsy gt2_Epiphora gt2_Erectile dysfunction gt2_Erythema gt2_Erythema chronicum migrans gt2_Esophagitis gt2_Excessive daytime sleepiness gt2_Eye pain gt2_Eye strain gt2_Facial nerve paralysis gt2_Facial swelling gt2_Fasciculation gt2_Fatigue gt2_Fatty liver disease gt2_Fecal incontinence gt2_Fever gt2_Fibrillation gt2_Fibrocystic breast changes gt2_Fibromyalgia gt2_Flatulence gt2_Floater gt2_Focal seizure gt2_Folate deficiency gt2_Food craving gt2_Food intolerance gt2_Frequent urination gt2_Gastroesophageal reflux disease gt2_Gastroparesis gt2_Generalized anxiety disorder gt2_Generalized tonic–clonic seizure gt2_Genital wart gt2_Gingival recession gt2_Gingivitis gt2_Globus pharyngis gt2_Goitre gt2_Gout gt2_Grandiosity gt2_Granuloma gt2_Guilt gt2_Hair loss gt2_Halitosis gt2_Hay fever gt2_Headache gt2_Heart arrhythmia gt2_Heart murmur gt2_Heartburn gt2_Hematochezia gt2_Hematoma gt2_Hematuria gt2_Hemolysis gt2_Hemoptysis gt2_Hemorrhoids gt2_Hepatic encephalopathy gt2_Hepatitis gt2_Hepatotoxicity gt2_Hiccup gt2_Hip pain gt2_Hives gt2_Hot flash gt2_Hydrocephalus gt2_Hypercalcaemia gt2_Hypercapnia gt2_Hypercholesterolemia gt2_Hyperemesis gravidarum gt2_Hyperglycemia gt2_Hyperkalemia gt2_Hyperlipidemia gt2_Hypermobility gt2_Hyperpigmentation gt2_Hypersomnia gt2_Hypertension gt2_Hyperthermia gt2_Hyperthyroidism gt2_Hypertriglyceridemia gt2_Hypertrophy gt2_Hyperventilation gt2_Hypocalcaemia gt2_Hypochondriasis gt2_Hypoglycemia gt2_Hypogonadism gt2_Hypokalemia gt2_Hypomania gt2_Hyponatremia gt2_Hypotension gt2_Hypothyroidism gt2_Hypoxemia gt2_Hypoxia gt2_Impetigo gt2_Implantation bleeding gt2_Impulsivity gt2_Indigestion gt2_Infection gt2_Inflammation gt2_Inflammatory bowel disease gt2_Ingrown hair gt2_Insomnia gt2_Insulin resistance gt2_Intermenstrual bleeding gt2_Intracranial pressure gt2_Iron deficiency gt2_Irregular menstruation gt2_Itch gt2_Jaundice gt2_Kidney failure gt2_Kidney stone gt2_Knee Pain gt2_Kyphosis gt2_Lactose intolerance gt2_Laryngitis gt2_Leg cramps gt2_Lesion gt2_Leukorrhea gt2_Lightheadedness gt2_Low back pain gt2_Low-grade fever gt2_Lymphedema gt2_Major depressive disorder gt2_Malabsorption gt2_Male infertility gt2_Manic Disorder gt2_Melasma gt2_Melena gt2_Meningitis gt2_Menorrhagia gt2_Middle back pain gt2_Migraine gt2_Milium gt2_Mitral insufficiency gt2_Mood disorder gt2_Mood swing gt2_Morning sickness gt2_Motion sickness gt2_Mouth ulcer gt2_Muscle atrophy gt2_Muscle weakness gt2_Myalgia gt2_Mydriasis gt2_Myocardial infarction gt2_Myoclonus gt2_Nasal congestion gt2_Nasal polyp gt2_Nausea gt2_Neck mass gt2_Neck pain gt2_Neonatal jaundice gt2_Nerve injury gt2_Neuralgia gt2_Neutropenia gt2_Night sweats gt2_Night terror gt2_Nocturnal enuresis gt2_Nodule gt2_Nosebleed gt2_Nystagmus gt2_Obesity gt2_Onychorrhexis gt2_Oral candidiasis gt2_Orthostatic hypotension gt2_Osteopenia gt2_Osteophyte gt2_Osteoporosis gt2_Otitis gt2_Otitis externa gt2_Otitis media gt2_Pain gt2_Palpitations gt2_Panic attack gt2_Papule gt2_Paranoia gt2_Paresthesia gt2_Pelvic inflammatory disease gt2_Pericarditis gt2_Periodontal disease gt2_Periorbital puffiness gt2_Peripheral neuropathy gt2_Perspiration gt2_Petechia gt2_Phlegm gt2_Photodermatitis gt2_Photophobia gt2_Photopsia gt2_Pleural effusion gt2_Pleurisy gt2_Pneumonia gt2_Podalgia gt2_Polycythemia gt2_Polydipsia gt2_Polyneuropathy gt2_Polyuria gt2_Poor posture gt2_Post-nasal drip gt2_Postural orthostatic tachycardia syndrome gt2_Prediabetes gt2_Proteinuria gt2_Pruritus ani gt2_Psychosis gt2_Ptosis gt2_Pulmonary edema gt2_Pulmonary hypertension gt2_Purpura gt2_Pus gt2_Pyelonephritis gt2_Radiculopathy gt2_Rectal pain gt2_Rectal prolapse gt2_Red eye gt2_Renal colic gt2_Restless legs syndrome gt2_Rheum gt2_Rhinitis gt2_Rhinorrhea gt2_Rosacea gt2_Round ligament pain gt2_Rumination gt2_Scar gt2_Sciatica gt2_Scoliosis gt2_Seborrheic dermatitis gt2_Self-harm gt2_Sensitivity to sound gt2_Sexual dysfunction gt2_Shallow breathing gt2_Sharp pain gt2_Shivering gt2_Shortness of breath gt2_Shyness gt2_Sinusitis gt2_Skin condition gt2_Skin rash gt2_Skin tag gt2_Skin ulcer gt2_Sleep apnea gt2_Sleep deprivation gt2_Sleep disorder gt2_Snoring gt2_Sore throat gt2_Spasticity gt2_Splenomegaly gt2_Sputum gt2_Stomach rumble gt2_Strabismus gt2_Stretch marks gt2_Stridor gt2_Stroke gt2_Stuttering gt2_Subdural hematoma gt2_Suicidal ideation gt2_Swelling gt2_Swollen feet gt2_Swollen lymph nodes gt2_Syncope gt2_Tachycardia gt2_Tachypnea gt2_Telangiectasia gt2_Tenderness gt2_Testicular pain gt2_Throat irritation gt2_Thrombocytopenia gt2_Thyroid nodule gt2_Tic gt2_Tinnitus gt2_Tonsillitis gt2_Toothache gt2_Tremor gt2_Trichoptilosis gt2_Tumor gt2_Type 2 diabetes gt2_Unconsciousness gt2_Underweight gt2_Upper respiratory tract infection gt2_Urethritis gt2_Urinary incontinence gt2_Urinary tract infection gt2_Urinary urgency gt2_Uterine contraction gt2_Vaginal bleeding gt2_Vaginal discharge gt2_Vaginitis gt2_Varicose veins gt2_Vasculitis gt2_Ventricular fibrillation gt2_Ventricular tachycardia gt2_Vertigo gt2_Viral pneumonia gt2_Visual acuity gt2_Vomiting gt2_Wart gt2_Water retention gt2_Weakness gt2_Weight gain gt2_Wheeze gt2_Xeroderma gt2_Xerostomia gt2_Yawn gt2_hyperhidrosis gt2_pancreatitis neighbor_Alabama neighbor_Alaska neighbor_Arizona neighbor_Arkansas neighbor_California neighbor_Colorado neighbor_Connecticut neighbor_Delaware neighbor_Florida neighbor_Georgia neighbor_Hawaii neighbor_Idaho neighbor_Illinois neighbor_Indiana neighbor_Iowa neighbor_Kansas neighbor_Kentucky neighbor_Louisiana neighbor_Maine neighbor_Maryland neighbor_Massachusetts neighbor_Michigan neighbor_Minnesota neighbor_Missouri neighbor_Montana neighbor_Nebraska neighbor_Nevada neighbor_New Hampshire neighbor_New Jersey neighbor_New Mexico neighbor_New York neighbor_North Carolina neighbor_North Dakota neighbor_Ohio neighbor_Oklahoma neighbor_Oregon neighbor_Pennsylvania neighbor_Rhode Island neighbor_South Carolina neighbor_South Dakota neighbor_Tennessee neighbor_Texas neighbor_Utah neighbor_Vermont neighbor_Virginia neighbor_Washington neighbor_West Virginia neighbor_Wisconsin neighbor_Wyoming neighbor_Mississippi State_Name_Alabama State_Name_Alaska State_Name_Arizona State_Name_Arkansas State_Name_California State_Name_Colorado State_Name_Connecticut State_Name_Delaware State_Name_Florida State_Name_Georgia State_Name_Hawaii State_Name_Idaho State_Name_Illinois State_Name_Indiana State_Name_Iowa State_Name_Kansas State_Name_Kentucky State_Name_Louisiana State_Name_Maine State_Name_Maryland State_Name_Massachusetts State_Name_Michigan State_Name_Minnesota State_Name_Mississippi State_Name_Missouri State_Name_Montana State_Name_Nebraska State_Name_Nevada State_Name_New Hampshire State_Name_New Jersey State_Name_New Mexico State_Name_New York State_Name_North Carolina State_Name_North Dakota State_Name_Ohio State_Name_Oklahoma State_Name_Oregon State_Name_Pennsylvania State_Name_Rhode Island State_Name_South Carolina State_Name_South Dakota State_Name_Tennessee State_Name_Texas State_Name_Utah State_Name_Vermont State_Name_Virginia State_Name_Washington State_Name_West Virginia State_Name_Wisconsin State_Name_Wyoming
count 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.00000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.00000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.00000 37201.000000 37201.000000 37201.000000 37201.000000 37201.00000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.00000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000 37201.000000
mean 1642.996694 22.461412 129.596933 239.810169 966.864861 300.261378 215.281892 86737.161589 289.107101 27.279207 156.702446 5487.864721 176.667034 148.596383 1137.662282 4830.924348 0.423490 117.438872 4.515029 2952.605131 97.357118 82.490776 1.223157 136.716192 28.221853 2.509515 3565.609128 2.224520 4.211325 7.649605 0.327083 0.381364 0.109883 0.219468 3.931982 0.064682 11.162166 0.213526 0.153194 0.405745 0.269757 0.459716 2.228877 0.853727 0.214926 0.060918 0.242004 7.342641 0.263803 0.046043 0.455591 0.328220 4.825432 0.223287 1.108655 0.171113 2.260597 0.316388 0.215155 0.155737 3.380084 0.082201 1.448329 0.229761 5.082650 1.421830 0.151113 0.104212 0.265999 0.238230 0.251991 3.769222 0.163597 0.433256 2.019202 0.455902 0.393479 0.211680 0.945504 2.999670 0.219953 0.290896 0.583571 0.421845 0.677466 0.469020 0.606557 1.537552 0.308325 0.423398 1.857737 0.067650 0.177879 2.685605 0.878447 0.471364 0.811826 0.104382 0.521374 0.200481 0.261375 1.376410 0.308484 0.162013 0.925049 0.698840 0.167758 0.089006 0.242922 0.490642 0.308703 10.943745 0.615894 0.167009 0.265562 0.277838 0.826816 3.317770 0.317632 5.494946 0.059775 2.938066 0.083635 0.244811 0.100265 1.066879 0.444750 2.093487 0.280338 0.090839 4.320914 2.277043 0.370444 0.151334 7.993027 0.283856 3.860419 1.537984 0.651461 0.075846 0.162223 1.187917 0.102158 0.350593 0.214050 0.177243 0.172793 0.716377 0.200848 0.907026 0.214110 0.246937 0.253640 2.257320 0.091179 2.643434 0.493368 0.257805 0.112342 0.285428 0.177484 0.28177 0.247987 0.156521 0.757898 4.347871 0.383482 0.181122 4.242673 0.516230 0.109744 0.539432 2.069475 0.244705 0.109084 0.224942 0.440011 0.586238 0.491773 3.779818 0.166749 1.024028 0.209863 0.863294 0.108243 0.416217 0.108867 0.162508 0.956812 0.175044 0.186710 0.324563 4.445122 0.515993 1.783457 4.138789 1.998067 0.193946 2.782231 0.221360 0.387623 0.381611 0.067587 0.098642 1.402507 0.097531 0.879296 0.104956 0.508539 0.830373 1.107517 0.640277 0.185522 0.114495 0.088974 1.266390 0.106182 1.121007 0.579709 0.200445 0.210298 0.839792 0.089375 6.981148 0.355204 1.012353 0.262640 0.290627 0.139757 0.183604 0.146041 0.721842 0.641578 0.717592 0.146621 0.328005 1.221746 1.562581 0.410885 0.238816 0.203684 1.076106 0.441314 1.453992 23.547391 5.459019 1.005626 0.709681 2.986322 0.391171 0.225312 0.323016 1.556976 0.437138 7.230728 0.490662 1.403749 1.304477 1.531915 0.153611 0.667295 0.069680 0.421260 1.052838 0.201736 0.484309 2.756755 0.266084 0.265207 3.960642 0.154553 0.165344 0.591108 0.253471 0.289800 0.449997 0.675633 0.330527 4.086483 0.238928 0.186927 0.997439 0.238305 0.894380 0.303466 0.737719 0.155804 0.272075 0.517059 0.241671 2.24946 0.071053 1.886288 0.092292 2.111539 0.142397 1.445533 0.217087 0.288937 0.238081 0.173985 0.281796 0.096067 0.15003 0.379989 0.484197 0.098340 5.122530 0.13753 0.497266 0.281156 0.207329 0.166162 0.822688 1.127099 0.426365 0.779262 34.302607 0.439723 1.46434 0.163608 0.175485 1.849272 0.565777 0.103019 0.679260 0.704358 0.814511 1.788223 0.141154 0.665511 0.093213 0.092147 0.179131 0.273304 0.184713 1.319595 0.889690 0.193660 0.065331 0.203206 0.160613 0.520711 0.220946 0.260138 0.582827 0.295544 0.269961 1.647219 0.102412 0.076359 0.378657 0.232703 0.590554 0.472933 0.412738 0.074451 0.149371 0.345953 0.173269 0.538049 0.090880 0.305547 0.408080 0.353565 0.179154 0.088123 2.460581 1.067069 0.290785 0.334778 0.245754 0.106790 0.601926 0.058222 0.778080 0.219264 0.800860 0.314626 1.953237 3.895940 6.334318 0.512038 1.623754 1.176944 0.345340 2.184036 0.833432 1.887773 0.095309 0.155792 0.182519 0.104091 0.350930 0.499001 0.073749 2.678848 0.223538 0.113419 0.381012 5.401614 0.581137 0.945389 0.676490 0.917601 0.095219 0.258673 0.164246 0.253894 0.257782 0.452813 0.168641 0.297487 0.861295 0.644043 0.958067 0.606450 0.148172 1.078524 4.014653 0.232984 0.588387 0.226964 0.204251 1.307818 4.053234 0.067888 0.817711 0.543586 2.723427 0.761255 0.471089 0.188512 0.067815 0.258866 1.053671 0.060731 0.223656 2.736969 1.227934 0.293425 0.426625 3.109280 0.287278 2.292905 0.443379 0.179036 0.488471 0.368599 1327.719470 217.797828 2034.353942 861.475552 8179.667966 1303.218758 755.674283 291.037822 6278.364049 2540.742399 210.420096 433.860783 3138.071907 1751.074622 878.234026 745.770490 1206.369963 1196.553103 209.977205 1157.366254 1685.881347 2444.214994 1401.553318 1537.577269 267.204833 503.653316 671.060670 282.627564 2607.112363 486.184162 5882.125884 2392.479234 245.316094 2979.518588 971.534556 605.502540 3173.699659 350.453859 1380.724927 249.486277 1927.012822 6913.934222 898.356603 117.978092 1608.310637 1226.405419 496.411172 1562.249241 157.871912 772.581463 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.020026 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999 0.019999
std 4589.492744 58.320425 222.529119 184.199940 1860.902082 860.911626 692.092902 49879.919783 769.121134 206.544173 533.877253 7599.045473 590.841522 526.643874 1434.119878 4401.861800 11.716711 463.056179 69.617293 4111.938931 424.346806 370.925946 20.343860 484.900113 196.304827 33.331048 2799.152905 0.545928 0.518877 1.206582 0.089734 0.203964 0.067302 0.165150 0.644033 0.055037 2.049919 0.078566 0.039093 0.097349 0.081695 0.118838 0.493838 0.223363 0.083358 0.045597 0.184194 0.969800 0.102122 0.037843 0.120233 0.087179 0.981536 0.085675 0.441319 0.270638 0.571206 0.122133 0.090843 0.042588 0.818402 0.056224 0.277393 0.089012 0.629778 0.399944 0.038505 0.058974 0.211819 0.088582 0.086206 0.518650 0.044619 0.141227 0.288387 0.108156 0.109454 0.079437 0.189443 0.658813 0.094405 0.093934 0.119415 0.089742 0.141889 0.087828 0.291549 0.284760 0.080980 0.085769 0.383977 0.051675 0.052636 0.468132 0.119544 0.125282 0.128367 0.064962 0.155910 0.062259 0.100070 0.219517 0.122396 0.087060 0.183228 0.159627 0.062237 0.054594 0.085614 0.121315 0.147525 6.132593 0.115714 0.087453 0.079285 0.108800 0.197069 0.445586 0.087962 2.395301 0.045652 0.413972 0.056821 0.136911 0.059914 0.202589 0.104236 0.569267 0.077982 0.060346 0.661276 0.351888 0.099583 0.051693 1.502710 0.093963 0.529150 0.219759 0.125073 0.054346 0.050347 0.187742 0.056518 0.102488 0.083446 0.060737 0.047565 0.184353 0.083168 0.230056 0.111483 0.111505 0.130423 0.526821 0.061073 0.462681 0.096964 0.481927 0.059824 0.085290 0.045286 0.10082 0.205409 0.045594 0.164024 0.548570 0.101449 0.056185 1.745725 0.151499 0.063587 0.107715 0.348704 0.081021 0.058960 0.077896 0.089439 0.117073 0.094690 0.540364 0.049454 0.196645 0.072436 0.153480 0.057488 0.088588 0.063770 0.049911 0.176957 0.052994 0.066662 0.094550 0.694195 0.101951 0.598155 0.516099 0.434475 0.066200 0.464012 0.084227 0.144072 0.092662 0.061485 0.063644 0.188999 0.070156 0.229291 0.061377 0.095377 0.141488 0.214752 0.125095 0.066630 0.064237 0.063064 0.368522 0.061229 0.219211 0.131392 0.070460 0.095643 0.221749 0.054334 1.596895 0.167172 0.222508 0.097705 0.104193 0.041719 0.061611 0.042554 0.150322 0.149327 0.164113 0.036447 0.103147 0.233984 0.298250 0.177780 0.094049 0.082083 0.208902 0.113871 0.223313 8.411747 0.786016 0.164177 0.164057 0.414602 0.094522 0.075956 0.180643 0.298121 0.091191 1.661685 0.118604 0.362461 0.208900 0.263665 0.043619 0.123194 0.057358 0.107694 0.272461 0.062724 0.089234 0.358501 0.140193 0.183888 0.758183 0.061734 0.051479 0.112011 0.096328 0.087105 0.154127 0.179805 0.079512 0.512951 0.081385 0.100023 0.182619 0.084688 0.157096 0.103269 0.108844 0.042120 0.088367 0.108615 0.089315 0.36011 0.052141 0.732544 0.055112 0.345236 0.035824 0.192274 0.090983 0.082669 0.087921 0.054613 0.091074 0.057460 0.04196 0.123000 0.123626 0.058247 0.929657 0.03579 0.091092 0.081470 0.072335 0.046897 0.232267 0.252951 0.152700 0.200095 3.681106 0.083371 0.20772 0.042887 0.062289 0.236163 0.165099 0.060938 0.139605 0.152195 0.169442 0.376893 0.043107 0.263539 0.116521 0.058767 0.064853 0.098330 0.075136 0.560540 0.155622 0.077965 0.058938 0.096430 0.046269 0.115407 0.096079 0.111067 0.103458 0.106173 0.099549 0.278741 0.056736 0.051770 0.131030 0.081638 0.101374 0.096368 0.085382 0.054467 0.044367 0.083111 0.065222 0.101837 0.059609 0.093993 0.137756 0.086957 0.058963 0.055978 0.417083 0.158790 0.090534 0.078794 0.083219 0.057841 0.123102 0.045726 0.116799 0.101946 0.212136 0.089945 0.582688 0.606909 1.484545 0.108460 0.220600 0.266739 0.092933 0.302573 0.141953 0.646125 0.063911 0.043921 0.064398 0.061922 0.103055 0.132451 0.056323 0.527234 0.123687 0.067366 0.101521 0.747575 0.167573 0.157793 0.127647 0.191350 0.061198 0.086242 0.045334 0.083751 0.110987 0.152294 0.054106 0.085000 0.214930 0.145842 0.164854 0.117778 0.039428 0.309929 0.826865 0.086244 0.136575 0.124381 0.070935 0.228164 0.689213 0.053341 0.139305 0.106252 0.528583 0.160877 0.112833 0.076779 0.056125 0.097926 0.169899 0.053682 0.081589 0.481990 0.203958 0.087904 0.097998 0.519479 0.090348 0.555266 0.097567 0.056662 0.130634 0.091791 1777.121206 400.107915 2668.066812 1181.408242 12606.656158 1842.571852 1682.655019 474.923196 14571.299146 4288.389593 547.087449 500.766977 5717.612569 2542.248441 1549.635658 1618.267099 1667.903688 2258.386126 318.025668 2357.062860 3151.439206 5052.699771 2025.565127 2200.359874 372.512287 746.204161 871.662002 552.809509 4900.333245 746.226191 12044.831200 3907.653535 335.826951 4076.669346 1396.863726 968.288348 4538.722796 910.759634 2454.836057 371.977558 3433.185678 10186.784345 1258.928274 309.490778 2625.894387 2178.501511 747.886818 2253.383594 244.802494 1275.577884 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140092 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000 0.140000
min -9005.000000 -364.000000 0.000000 0.305869 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.570000 1.660000 3.090000 0.100000 0.050000 0.000000 0.000000 1.690000 0.000000 4.260000 0.050000 0.040000 0.130000 0.090000 0.130000 0.630000 0.240000 0.060000 0.000000 0.000000 3.550000 0.040000 0.000000 0.140000 0.100000 2.120000 0.040000 0.430000 0.040000 0.620000 0.080000 0.040000 0.040000 1.130000 0.000000 0.410000 0.070000 2.100000 0.360000 0.040000 0.000000 0.060000 0.060000 0.090000 1.460000 0.040000 0.100000 0.760000 0.120000 0.110000 0.070000 0.310000 1.160000 0.040000 0.080000 0.140000 0.110000 0.120000 0.130000 0.100000 0.570000 0.090000 0.110000 0.700000 0.000000 0.040000 0.960000 0.270000 0.110000 0.210000 0.000000 0.070000 0.050000 0.070000 0.530000 0.060000 0.040000 0.300000 0.200000 0.050000 0.000000 0.080000 0.130000 0.090000 2.400000 0.230000 0.040000 0.080000 0.060000 0.170000 1.190000 0.100000 1.290000 0.000000 1.070000 0.000000 0.030000 0.000000 0.300000 0.110000 0.690000 0.080000 0.000000 1.800000 0.890000 0.100000 0.030000 2.900000 0.060000 1.790000 0.610000 0.180000 0.000000 0.040000 0.330000 0.000000 0.080000 0.060000 0.040000 0.050000 0.210000 0.060000 0.270000 0.050000 0.050000 0.070000 0.780000 0.000000 0.950000 0.150000 0.050000 0.000000 0.100000 0.050000 0.09000 0.050000 0.040000 0.200000 1.980000 0.100000 0.050000 1.600000 0.110000 0.000000 0.160000 0.960000 0.050000 0.000000 0.040000 0.120000 0.200000 0.140000 1.760000 0.040000 0.370000 0.060000 0.240000 0.000000 0.120000 0.000000 0.040000 0.330000 0.050000 0.040000 0.110000 1.290000 0.170000 0.460000 1.720000 0.640000 0.040000 1.280000 0.070000 0.100000 0.110000 0.000000 0.000000 0.590000 0.000000 0.250000 0.000000 0.150000 0.300000 0.300000 0.230000 0.040000 0.000000 0.000000 0.160000 0.000000 0.400000 0.150000 0.030000 0.050000 0.160000 0.000000 2.270000 0.090000 0.330000 0.040000 0.070000 0.040000 0.040000 0.040000 0.210000 0.130000 0.200000 0.040000 0.060000 0.360000 0.570000 0.080000 0.060000 0.040000 0.310000 0.120000 0.560000 8.830000 2.440000 0.330000 0.150000 1.240000 0.110000 0.050000 0.060000 0.520000 0.120000 2.630000 0.120000 0.370000 0.470000 0.520000 0.050000 0.280000 0.000000 0.110000 0.290000 0.040000 0.170000 1.070000 0.040000 0.050000 1.460000 0.050000 0.040000 0.140000 0.040000 0.080000 0.100000 0.140000 0.110000 1.770000 0.080000 0.030000 0.380000 0.070000 0.280000 0.080000 0.290000 0.050000 0.080000 0.190000 0.050000 0.93000 0.000000 0.300000 0.000000 0.770000 0.040000 0.560000 0.070000 0.060000 0.050000 0.040000 0.080000 0.000000 0.04000 0.070000 0.100000 0.000000 1.650000 0.04000 0.120000 0.090000 0.040000 0.040000 0.170000 0.360000 0.100000 0.210000 16.280000 0.140000 0.65000 0.050000 0.060000 0.730000 0.140000 0.000000 0.140000 0.210000 0.210000 0.690000 0.040000 0.120000 0.000000 0.000000 0.040000 0.050000 0.050000 0.230000 0.230000 0.040000 0.000000 0.040000 0.040000 0.170000 0.040000 0.060000 0.160000 0.060000 0.070000 0.590000 0.000000 0.000000 0.070000 0.060000 0.120000 0.120000 0.130000 0.000000 0.040000 0.100000 0.050000 0.170000 0.000000 0.070000 0.080000 0.100000 0.050000 0.000000 0.840000 0.420000 0.090000 0.100000 0.100000 0.000000 0.170000 0.000000 0.280000 0.060000 0.250000 0.100000 0.430000 1.640000 2.430000 0.140000 0.660000 0.390000 0.100000 0.900000 0.280000 0.460000 0.000000 0.040000 0.040000 0.000000 0.100000 0.110000 0.000000 0.880000 0.070000 0.000000 0.130000 2.190000 0.120000 0.320000 0.210000 0.310000 0.000000 0.070000 0.040000 0.080000 0.060000 0.100000 0.040000 0.110000 0.250000 0.180000 0.360000 0.220000 0.040000 0.340000 1.330000 0.070000 0.150000 0.040000 0.040000 0.470000 1.490000 0.000000 0.270000 0.110000 0.910000 0.180000 0.130000 0.050000 0.000000 0.050000 0.400000 0.000000 0.050000 0.970000 0.470000 0.080000 0.120000 1.000000 0.090000 0.710000 0.130000 0.050000 0.140000 0.100000 0.000000 0.000000 0.000000 -400.000000 -3935.000000 -2.000000 -15.000000 -27.000000 -1448.000000 0.000000 -79.000000 -2.000000 0.000000 0.000000 -947.000000 -168.000000 -3.000000 -119.000000 -7.000000 -3.000000 -280.000000 0.000000 0.000000 -8492.000000 -1.000000 -43.000000 -51.000000 -63.000000 -9005.000000 0.000000 0.000000 0.000000 -35.000000 0.000000 -1294.000000 0.000000 0.000000 -1260.000000 -22.000000 -4.000000 -42.000000 0.000000 0.000000 -1.000000 -232.000000 -39.000000 -4.000000 0.000000 -8.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
25% 39.000000 0.000000 16.000000 117.239542 0.000000 0.000000 0.000000 54508.215741 0.000000 0.000000 0.000000 1427.348263 0.000000 0.000000 0.000000 1653.298169 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1885.633898 1.800000 3.950000 6.890000 0.260000 0.230000 0.090000 0.090000 3.610000 0.040000 9.860000 0.170000 0.130000 0.340000 0.220000 0.390000 1.850000 0.690000 0.170000 0.040000 0.100000 6.770000 0.190000 0.000000 0.370000 0.280000 4.360000 0.170000 0.960000 0.120000 1.810000 0.250000 0.160000 0.130000 2.740000 0.060000 1.250000 0.170000 4.740000 1.090000 0.130000 0.080000 0.180000 0.180000 0.200000 3.470000 0.140000 0.310000 1.830000 0.400000 0.330000 0.170000 0.820000 2.590000 0.170000 0.230000 0.500000 0.360000 0.590000 0.420000 0.390000 1.370000 0.260000 0.370000 1.620000 0.042500 0.140000 2.380000 0.810000 0.390000 0.730000 0.070000 0.400000 0.155000 0.200000 1.250000 0.230000 0.110000 0.790000 0.580000 0.130000 0.070000 0.200000 0.410000 0.240000 7.500000 0.550000 0.125000 0.220000 0.200000 0.680000 3.080000 0.260000 3.810000 0.040000 2.710000 0.060000 0.140000 0.080000 0.940000 0.380000 1.810000 0.240000 0.070000 3.920000 2.060000 0.300000 0.120000 6.960000 0.220000 3.540000 1.410000 0.560000 0.057500 0.130000 1.080000 0.080000 0.280000 0.170000 0.140000 0.140000 0.590000 0.150000 0.760000 0.152500 0.180000 0.200000 1.850000 0.070000 2.330000 0.430000 0.140000 0.090000 0.230000 0.150000 0.22000 0.160000 0.120000 0.630000 4.030000 0.310000 0.140000 3.380000 0.400000 0.080000 0.470000 1.910000 0.200000 0.080000 0.170000 0.380000 0.510000 0.430000 3.500000 0.130000 0.890000 0.160000 0.770000 0.090000 0.360000 0.090000 0.130000 0.850000 0.140000 0.140000 0.270000 4.080000 0.450000 1.450000 3.860000 1.680000 0.150000 2.560000 0.175000 0.310000 0.320000 0.040000 0.070000 1.290000 0.070000 0.710000 0.080000 0.450000 0.750000 0.970000 0.560000 0.140000 0.090000 0.060000 0.970000 0.090000 0.970000 0.480000 0.150000 0.160000 0.690000 0.070000 5.860000 0.250000 0.850000 0.195000 0.220000 0.110000 0.140000 0.120000 0.610000 0.550000 0.600000 0.120000 0.250000 1.070000 1.340000 0.290000 0.180000 0.150000 0.940000 0.360000 1.330000 19.790000 4.940000 0.900000 0.590000 2.710000 0.320000 0.180000 0.240000 1.330000 0.380000 6.030000 0.420000 1.130000 1.170000 1.370000 0.125000 0.590000 0.040000 0.340000 0.840000 0.155000 0.430000 2.550000 0.180000 0.190000 3.370000 0.120000 0.130000 0.520000 0.190000 0.230000 0.360000 0.580000 0.290000 3.790000 0.190000 0.130000 0.870000 0.190000 0.790000 0.230000 0.670000 0.130000 0.210000 0.460000 0.180000 2.02000 0.050000 1.370000 0.080000 1.880000 0.120000 1.320000 0.170000 0.240000 0.180000 0.140000 0.230000 0.070000 0.12000 0.280000 0.400000 0.070000 4.490000 0.11750 0.430000 0.230000 0.150000 0.140000 0.630000 0.940000 0.310000 0.630000 32.380000 0.390000 1.34000 0.140000 0.140000 1.720000 0.480000 0.080000 0.580000 0.600000 0.690000 1.530000 0.110000 0.490000 0.020000 0.070000 0.140000 0.210000 0.140000 0.940000 0.800000 0.150000 0.040000 0.150000 0.130000 0.450000 0.160000 0.170000 0.520000 0.220000 0.210000 1.450000 0.080000 0.054286 0.280000 0.180000 0.530000 0.400000 0.360000 0.050000 0.120000 0.290000 0.140000 0.470000 0.070000 0.240000 0.320000 0.300000 0.140000 0.070000 2.190000 0.960000 0.230000 0.290000 0.200000 0.090000 0.530000 0.040000 0.710000 0.170000 0.690000 0.260000 1.530000 3.510000 5.240000 0.440000 1.500000 0.980000 0.290000 2.010000 0.740000 1.440000 0.060000 0.130000 0.133333 0.080000 0.280000 0.410000 0.050000 2.310000 0.170000 0.090000 0.320000 4.950000 0.460000 0.840000 0.590000 0.780000 0.070000 0.200000 0.130000 0.200000 0.180000 0.340000 0.130000 0.250000 0.760000 0.550000 0.860000 0.540000 0.120000 0.840000 3.420000 0.180000 0.490000 0.146000 0.150000 1.160000 3.600000 0.040000 0.720000 0.480000 2.350000 0.660000 0.390000 0.140000 0.040000 0.190000 0.950000 0.000000 0.180000 2.380000 1.100000 0.240000 0.360000 2.770000 0.230000 1.930000 0.380000 0.140000 0.410000 0.310000 234.000000 0.000000 374.000000 130.000000 1533.000000 222.000000 0.000000 52.000000 0.000000 329.000000 12.000000 0.000000 365.000000 363.000000 0.000000 0.000000 160.000000 0.000000 13.000000 326.000000 105.000000 0.000000 79.000000 255.000000 1.000000 4.000000 10.000000 5.000000 349.000000 2.000000 685.000000 1.000000 29.000000 442.000000 1.000000 45.000000 587.000000 0.000000 84.000000 0.000000 0.000000 1115.000000 110.000000 3.000000 197.000000 24.000000 25.000000 118.000000 0.000000 30.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
50% 444.000000 5.000000 53.000000 206.152870 0.000000 0.000000 0.000000 83547.999873 0.000000 0.000000 0.000000 3716.855123 0.000000 0.000000 848.220633 3925.995681 0.000000 0.000000 0.000000 1602.680584 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 3183.087137 2.210000 4.270000 7.640000 0.320000 0.320000 0.100000 0.200000 3.860000 0.058684 11.030000 0.200000 0.150000 0.400000 0.260000 0.450000 2.220000 0.830000 0.200000 0.060000 0.210000 7.370000 0.240000 0.050000 0.440000 0.320000 4.770000 0.210000 1.080000 0.140000 2.210000 0.300000 0.200000 0.150000 3.310000 0.080000 1.440000 0.210000 5.110000 1.480000 0.150000 0.100000 0.220000 0.220000 0.240000 3.840000 0.160000 0.440000 2.030000 0.450000 0.390000 0.200000 0.950000 2.980000 0.200000 0.280000 0.570000 0.420000 0.670000 0.470000 0.540000 1.530000 0.300000 0.420000 1.800000 0.070000 0.170000 2.720000 0.880000 0.470000 0.810000 0.090000 0.530000 0.190000 0.240000 1.370000 0.280000 0.140000 0.920000 0.690000 0.160000 0.090000 0.230000 0.490000 0.280000 9.110000 0.600000 0.143333 0.260000 0.260000 0.790000 3.360000 0.310000 4.970000 0.060000 2.950000 0.080000 0.220000 0.100000 1.060000 0.440000 2.070000 0.270000 0.090000 4.300000 2.280000 0.360000 0.140000 7.980000 0.270000 3.870000 1.540000 0.650000 0.080000 0.150000 1.200000 0.100000 0.340000 0.200000 0.165000 0.170000 0.690000 0.185000 0.900000 0.190000 0.220000 0.230000 2.180000 0.090000 2.710000 0.490000 0.190000 0.120000 0.275000 0.170000 0.26000 0.200000 0.150000 0.760000 4.360000 0.380000 0.170000 3.810000 0.510000 0.100000 0.540000 2.060000 0.230000 0.100000 0.210000 0.440000 0.570000 0.490000 3.790000 0.160000 1.030000 0.200000 0.870000 0.110000 0.410000 0.100000 0.150000 0.940000 0.170000 0.170000 0.310000 4.470000 0.510000 1.680000 4.140000 1.980000 0.180000 2.780000 0.200000 0.360000 0.380000 0.060000 0.090000 1.410000 0.090000 0.880000 0.105000 0.500000 0.820000 1.100000 0.630000 0.170000 0.110000 0.080000 1.320000 0.100000 1.130000 0.580000 0.196667 0.190000 0.840000 0.080000 6.810000 0.320000 1.020000 0.260000 0.280000 0.140000 0.170000 0.140000 0.710000 0.640000 0.710000 0.140000 0.320000 1.230000 1.570000 0.370000 0.220000 0.180000 1.100000 0.430000 1.460000 21.730000 5.470000 1.000000 0.700000 3.010000 0.380000 0.210000 0.290000 1.560000 0.430000 6.820000 0.480000 1.360000 1.310000 1.530000 0.143333 0.660000 0.066667 0.410000 1.040000 0.200000 0.480000 2.770000 0.230000 0.240000 3.960000 0.140000 0.160000 0.590000 0.250000 0.270000 0.430000 0.670000 0.330000 4.110000 0.230000 0.180000 0.990000 0.220000 0.910000 0.300000 0.730000 0.150000 0.260000 0.510000 0.223333 2.25000 0.070000 1.850000 0.090000 2.150000 0.140000 1.450000 0.200000 0.280000 0.220000 0.170000 0.270000 0.090000 0.14000 0.380000 0.470000 0.090000 5.120000 0.13000 0.490000 0.270000 0.200000 0.160000 0.820000 1.090000 0.390000 0.760000 34.680000 0.440000 1.47000 0.150000 0.160000 1.850000 0.560000 0.100000 0.690000 0.690000 0.810000 1.740000 0.140000 0.610000 0.050000 0.090000 0.166667 0.260000 0.170000 1.200000 0.890000 0.180000 0.060000 0.180000 0.155000 0.510000 0.200000 0.240000 0.580000 0.280000 0.250000 1.650000 0.100000 0.070000 0.360000 0.210000 0.590000 0.470000 0.410000 0.080000 0.140000 0.340000 0.160000 0.540000 0.080000 0.290000 0.390000 0.340000 0.165000 0.090000 2.480000 1.060000 0.280000 0.330000 0.230000 0.110000 0.590000 0.060000 0.770000 0.200000 0.780000 0.300000 1.980000 3.920000 6.050000 0.500000 1.630000 1.150000 0.340000 2.180000 0.830000 1.740000 0.090000 0.150000 0.170000 0.103200 0.340000 0.490000 0.070000 2.660000 0.200000 0.110000 0.370000 5.420000 0.560000 0.950000 0.670000 0.910000 0.090000 0.250000 0.150000 0.240000 0.230000 0.440000 0.163333 0.290000 0.850000 0.620000 0.970000 0.600000 0.140000 1.070000 3.960000 0.220000 0.580000 0.200000 0.200000 1.300000 4.130000 0.070000 0.820000 0.540000 2.790000 0.770000 0.470000 0.170000 0.070000 0.240000 1.040000 0.060000 0.210000 2.820000 1.220000 0.280000 0.420000 3.130000 0.270000 2.190000 0.430000 0.170000 0.470000 0.360000 718.000000 63.000000 976.000000 509.000000 3910.000000 600.000000 167.000000 160.000000 2470.000000 1295.000000 77.000000 286.000000 1709.000000 841.000000 378.000000 116.000000 641.000000 502.000000 43.000000 734.000000 672.000000 636.000000 699.000000 862.000000 112.000000 251.000000 405.000000 66.000000 1362.000000 185.000000 2926.000000 1363.000000 116.000000 1332.000000 438.000000 282.000000 1539.000000 85.000000 764.000000 92.000000 958.000000 4042.000000 444.000000 22.000000 937.000000 603.000000 219.000000 622.000000 47.000000 357.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
75% 1507.000000 22.000000 146.000000 321.141907 1324.223106 0.000000 0.000000 116581.162052 273.518026 0.000000 0.000000 6900.089940 0.000000 0.000000 1644.403817 6967.929272 0.000000 0.000000 0.000000 3907.900066 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 4770.463805 2.630000 4.540000 8.310000 0.370000 0.500000 0.140000 0.320000 4.140000 0.080000 12.620000 0.240000 0.170000 0.460000 0.300000 0.520000 2.600000 1.020000 0.236667 0.080000 0.360000 7.920000 0.310000 0.070000 0.530000 0.360000 5.170000 0.270000 1.200000 0.170000 2.680000 0.360000 0.250000 0.180000 4.060000 0.100000 1.610000 0.260000 5.480000 1.750000 0.170000 0.130000 0.280000 0.270000 0.280000 4.130000 0.180000 0.540000 2.210000 0.500000 0.440000 0.230000 1.080000 3.380000 0.240000 0.340000 0.670000 0.470000 0.770000 0.520000 0.740000 1.670000 0.340000 0.470000 2.040000 0.080000 0.200000 2.990000 0.950000 0.550000 0.890000 0.130000 0.630000 0.230000 0.300000 1.500000 0.350000 0.180000 1.060000 0.800000 0.193333 0.110000 0.270000 0.560000 0.330000 11.880000 0.680000 0.180000 0.300000 0.340000 0.960000 3.620000 0.360000 6.650000 0.080000 3.210000 0.100000 0.310000 0.120000 1.190000 0.500000 2.320000 0.310000 0.110000 4.670000 2.500000 0.430000 0.170000 9.230000 0.330000 4.170000 1.670000 0.740000 0.100000 0.190000 1.310000 0.130000 0.400000 0.240000 0.200000 0.200000 0.820000 0.220000 1.040000 0.236667 0.290000 0.280000 2.710000 0.110000 2.980000 0.550000 0.280000 0.140000 0.320000 0.200000 0.32000 0.260000 0.190000 0.890000 4.700000 0.450000 0.203333 4.520000 0.620000 0.130000 0.610000 2.210000 0.270000 0.140000 0.260000 0.490000 0.650000 0.550000 4.110000 0.195000 1.170000 0.233333 0.960000 0.140000 0.460000 0.130000 0.182500 1.050000 0.200000 0.220000 0.360000 4.870000 0.570000 1.980000 4.430000 2.330000 0.230000 3.000000 0.240000 0.440000 0.440000 0.090000 0.120000 1.520000 0.120000 1.020000 0.130000 0.560000 0.910000 1.250000 0.720000 0.220000 0.146667 0.110000 1.560000 0.120000 1.290000 0.670000 0.240000 0.230000 0.980000 0.110000 8.410000 0.400000 1.160000 0.316667 0.340000 0.160000 0.220000 0.165000 0.830000 0.730000 0.830000 0.170000 0.400000 1.380000 1.770000 0.510000 0.275000 0.240000 1.220000 0.510000 1.590000 24.700000 5.990000 1.110000 0.820000 3.270000 0.450000 0.260000 0.370000 1.770000 0.490000 8.440000 0.540000 1.700000 1.440000 1.700000 0.170000 0.730000 0.090000 0.480000 1.250000 0.240000 0.540000 2.990000 0.300000 0.290000 4.580000 0.170000 0.200000 0.650000 0.300000 0.321613 0.520000 0.760000 0.370000 4.420000 0.260000 0.230000 1.110000 0.260000 1.000000 0.360000 0.810000 0.180000 0.310000 0.570000 0.280000 2.49000 0.090000 2.270000 0.118889 2.350000 0.157500 1.560000 0.240000 0.330000 0.270000 0.200000 0.320000 0.120000 0.17000 0.470000 0.550000 0.120000 5.810000 0.15750 0.550000 0.320000 0.250000 0.190000 1.010000 1.310000 0.512000 0.930000 36.620000 0.490000 1.59000 0.190000 0.190000 2.000000 0.640000 0.130000 0.770000 0.790000 0.930000 1.990000 0.160000 0.780000 0.120000 0.114000 0.200000 0.330000 0.205000 1.610000 0.990000 0.220000 0.080000 0.225000 0.190000 0.580000 0.260000 0.325833 0.650000 0.360000 0.310000 1.840000 0.122500 0.100000 0.470000 0.260000 0.660000 0.540000 0.460000 0.098333 0.175000 0.390000 0.190000 0.600000 0.110000 0.350000 0.480000 0.400000 0.200000 0.120000 2.730000 1.170000 0.330000 0.380000 0.270000 0.130000 0.670000 0.080000 0.850000 0.230000 0.870000 0.350000 2.310000 4.290000 7.330000 0.580000 1.750000 1.350000 0.390000 2.370000 0.920000 2.220000 0.120000 0.170000 0.210000 0.130000 0.410000 0.585000 0.100000 3.080000 0.240000 0.140000 0.430000 5.880000 0.680000 1.050000 0.760000 1.050000 0.120000 0.300000 0.190000 0.280000 0.310000 0.550000 0.200000 0.330000 0.940000 0.730000 1.070000 0.660000 0.170000 1.300000 4.700000 0.260000 0.670000 0.290000 0.250000 1.460000 4.560000 0.090000 0.910000 0.610000 3.140000 0.870000 0.540000 0.225000 0.090000 0.310000 1.150000 0.094930 0.250000 3.070000 1.360000 0.340000 0.480000 3.470000 0.330000 2.520000 0.510000 0.200000 0.550000 0.420000 1705.000000 270.000000 2970.000000 1072.000000 8865.000000 1806.000000 786.000000 365.000000 7121.000000 3165.000000 156.000000 636.000000 3371.000000 2324.000000 960.000000 520.000000 1756.000000 1253.000000 266.000000 1158.000000 2037.000000 2576.000000 1698.000000 2074.000000 371.000000 663.000000 960.000000 363.000000 3421.000000 670.000000 6668.000000 2885.000000 344.000000 4262.000000 1237.000000 815.000000 4524.000000 371.000000 1669.000000 366.000000 2279.000000 8807.000000 1274.000000 131.000000 1844.000000 1584.000000 785.000000 2319.000000 200.000000 944.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
max 162871.000000 2441.000000 2580.000000 1602.563385 29828.486204 24400.214066 19336.781144 387777.167531 16652.730527 8285.291661 10025.855916 160126.276081 11887.411288 11793.697781 18856.420590 57234.221808 607.512334 10401.504845 3383.641439 64258.429881 12525.948910 11341.117042 659.513327 9458.974853 5443.848683 2072.314679 30513.293976 5.090000 7.560000 20.410000 0.950000 1.430000 1.090000 1.600000 12.820000 0.470000 35.300000 1.070000 0.450000 1.170000 0.820000 4.880000 3.490000 1.620000 0.890000 0.320000 3.470000 19.350000 0.970000 0.240000 1.390000 3.660000 27.960000 0.830000 10.100000 22.100000 5.730000 2.630000 1.320000 0.570000 5.630000 0.390000 4.930000 0.980000 8.440000 2.390000 0.450000 0.380000 3.670000 0.980000 0.950000 9.960000 0.500000 1.020000 3.470000 2.470000 2.460000 0.800000 1.610000 20.680000 1.590000 1.470000 1.460000 1.240000 1.300000 1.550000 1.950000 8.090000 0.980000 1.120000 6.370000 0.300000 0.600000 4.930000 1.900000 3.340000 1.660000 0.550000 1.400000 1.550000 1.030000 5.330000 2.250000 1.480000 3.550000 1.790000 3.310000 0.630000 1.160000 4.770000 3.030000 67.920000 4.060000 4.040000 1.150000 1.590000 1.990000 4.780000 2.300000 19.370000 0.290000 6.310000 0.380000 0.970000 0.710000 2.010000 1.440000 26.160000 1.070000 0.380000 13.170000 4.140000 0.970000 0.590000 21.390000 1.490000 14.330000 3.910000 1.320000 0.480000 0.740000 2.040000 0.410000 1.370000 0.890000 0.600000 0.500000 2.500000 1.120000 3.190000 3.060000 1.400000 3.430000 6.210000 0.520000 4.490000 1.520000 17.790000 0.450000 0.940000 0.730000 1.42000 3.530000 0.450000 1.760000 6.850000 1.300000 0.660000 28.480000 1.230000 0.390000 3.570000 7.070000 1.030000 0.390000 0.830000 1.140000 1.590000 1.050000 8.920000 0.460000 1.840000 0.790000 1.750000 0.400000 1.320000 0.400000 0.670000 2.430000 0.650000 0.700000 1.330000 8.590000 2.560000 10.260000 8.250000 6.570000 0.630000 9.030000 1.050000 1.970000 1.310000 2.310000 0.500000 2.530000 0.520000 5.710000 0.390000 1.240000 3.490000 2.020000 1.400000 0.820000 0.400000 0.940000 2.570000 1.180000 2.010000 1.180000 0.580000 1.040000 2.160000 0.470000 15.740000 2.070000 3.260000 1.030000 4.980000 1.550000 0.550000 0.520000 1.620000 2.940000 1.530000 0.430000 1.060000 2.260000 4.000000 2.790000 3.150000 1.350000 1.840000 1.270000 3.350000 100.000000 10.190000 4.670000 1.330000 5.540000 1.270000 1.170000 4.620000 2.610000 1.210000 15.630000 3.650000 3.200000 2.620000 3.540000 0.560000 1.740000 0.360000 1.390000 1.950000 0.590000 1.190000 4.940000 1.310000 3.580000 7.720000 1.150000 0.480000 2.310000 1.010000 0.880000 9.150000 5.570000 1.070000 8.290000 0.880000 13.110000 2.770000 0.890000 1.790000 1.140000 1.520000 0.590000 0.820000 4.760000 1.380000 5.15000 0.310000 5.120000 0.350000 3.400000 0.440000 2.730000 1.010000 0.830000 1.020000 0.510000 1.210000 0.430000 0.53000 1.180000 1.280000 0.440000 10.850000 0.48000 1.320000 0.900000 0.620000 0.610000 1.740000 2.230000 1.560000 1.640000 47.730000 1.230000 3.00000 0.410000 0.880000 3.520000 5.430000 0.430000 1.740000 1.650000 3.080000 5.410000 1.930000 2.940000 1.610000 0.360000 0.680000 0.900000 0.860000 5.170000 1.600000 1.980000 0.380000 2.500000 0.430000 2.250000 0.970000 0.900000 1.360000 1.120000 1.100000 3.940000 0.440000 0.380000 1.010000 0.900000 1.400000 1.190000 1.110000 0.390000 0.520000 1.210000 0.700000 1.550000 0.370000 0.920000 1.670000 2.090000 0.570000 0.400000 4.760000 1.970000 1.060000 1.050000 1.090000 0.520000 2.670000 0.310000 1.790000 2.060000 2.960000 1.010000 4.100000 7.820000 25.650000 1.530000 3.670000 2.470000 1.430000 4.300000 2.270000 5.710000 0.370000 0.550000 0.680000 0.450000 1.670000 1.280000 0.370000 7.750000 2.480000 1.410000 1.100000 8.700000 2.050000 2.120000 2.170000 2.030000 0.370000 0.890000 0.410000 1.160000 0.890000 1.450000 0.650000 3.510000 4.160000 1.940000 1.520000 1.630000 0.590000 4.750000 8.290000 1.770000 1.880000 2.390000 0.560000 2.450000 6.210000 0.340000 1.770000 1.400000 4.220000 1.450000 4.780000 1.230000 0.370000 0.910000 3.170000 0.400000 1.080000 14.870000 2.810000 0.990000 1.080000 4.740000 1.290000 10.530000 2.020000 0.620000 3.820000 1.450000 12972.000000 4013.000000 17234.000000 8434.000000 133669.000000 25224.000000 23678.000000 3729.000000 150251.000000 65039.000000 4789.000000 2527.000000 89195.000000 31431.000000 11885.000000 15972.000000 22912.000000 31161.000000 2149.000000 36646.000000 33090.000000 65557.000000 16184.000000 26182.000000 2216.000000 6135.000000 12443.000000 5082.000000 38461.000000 7437.000000 132093.000000 45901.000000 2807.000000 37626.000000 10073.000000 10433.000000 33650.000000 16228.000000 35951.000000 3047.000000 41464.000000 162871.000000 14754.000000 2779.000000 40246.000000 33069.000000 9164.000000 16956.000000 1541.000000 17525.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
Data type of merged file with 50 States:
date JHU_cases JHU_deaths JHU_hospitalizations up2date gt_after covid vaccine gt_side effects of vaccine gt_effects of covid vaccine gt_covid gt_how long does covid last gt_anosmia gt_loss smell gt_covid-19 gt_loss taste gt_loss of smell gt_chest pain gt_covid symptoms gt_sars-cov 2 gt_chest tightness gt_covid nhs gt_quarantine gt_covid-19 who gt_sars-cov-2 gt_feeling exhausted gt_nose bleed gt_feeling tired gt_joints aching gt_fever gt2_Abdominal obesity gt2_Abdominal pain gt2_Acne gt2_Actinic keratosis gt2_Acute bronchitis gt2_Adrenal crisis gt2_Ageusia gt2_Alcoholism gt2_Allergic conjunctivitis gt2_Allergy gt2_Amblyopia gt2_Amenorrhea gt2_Amnesia gt2_Anal fissure gt2_Anaphylaxis gt2_Anemia gt2_Angina pectoris gt2_Angioedema gt2_Angular cheilitis gt2_Anosmia gt2_Anxiety gt2_Aphasia gt2_Aphonia gt2_Apnea gt2_Arthralgia gt2_Arthritis gt2_Ascites gt2_Asperger syndrome gt2_Asphyxia gt2_Asthma gt2_Astigmatism gt2_Ataxia gt2_Atheroma gt2_Attention deficit hyperactivity disorder gt2_Auditory hallucination gt2_Autoimmune disease gt2_Avoidant personality disorder gt2_Back pain gt2_Bacterial vaginosis gt2_Balance disorder gt2_Beau's lines gt2_Bell's palsy gt2_Biliary colic gt2_Binge eating gt2_Bleeding gt2_Bleeding on probing gt2_Blepharospasm gt2_Bloating gt2_Blood in stool gt2_Blurred vision gt2_Blushing gt2_Boil gt2_Bone fracture gt2_Bone tumor gt2_Bowel obstruction gt2_Bradycardia gt2_Braxton Hicks contractions gt2_Breakthrough bleeding gt2_Breast pain gt2_Bronchitis gt2_Bruise gt2_Bruxism gt2_Bunion gt2_Burn gt2_Burning Chest Pain gt2_Burning mouth syndrome gt2_Candidiasis gt2_Canker sore gt2_Cardiac arrest gt2_Carpal tunnel syndrome gt2_Cataplexy gt2_Cataract gt2_Chancre gt2_Cheilitis gt2_Chest pain gt2_Chills gt2_Chorea gt2_Chronic pain gt2_Cirrhosis gt2_Cleft lip and cleft palate gt2_Clouding of consciousness gt2_Cluster headache gt2_Colitis gt2_Coma gt2_Common cold gt2_Compulsive behavior gt2_Compulsive hoarding gt2_Confusion gt2_Congenital heart defect gt2_Conjunctivitis gt2_Constipation gt2_Convulsion gt2_Cough gt2_Crackles gt2_Cramp gt2_Crepitus gt2_Croup gt2_Cyanosis gt2_Dandruff gt2_Delayed onset muscle soreness gt2_Dementia gt2_Dentin hypersensitivity gt2_Depersonalization gt2_Depression gt2_Dermatitis gt2_Desquamation gt2_Developmental disability gt2_Diabetes gt2_Diabetic ketoacidosis gt2_Diarrhea gt2_Dizziness gt2_Dry eye syndrome gt2_Dysautonomia gt2_Dysgeusia gt2_Dysmenorrhea gt2_Dyspareunia gt2_Dysphagia gt2_Dysphoria gt2_Dystonia gt2_Dysuria gt2_Ear pain gt2_Eczema gt2_Edema gt2_Encephalitis gt2_Encephalopathy gt2_Epidermoid cyst gt2_Epilepsy gt2_Epiphora gt2_Erectile dysfunction gt2_Erythema gt2_Erythema chronicum migrans gt2_Esophagitis gt2_Excessive daytime sleepiness gt2_Eye pain gt2_Eye strain gt2_Facial nerve paralysis gt2_Facial swelling gt2_Fasciculation gt2_Fatigue gt2_Fatty liver disease gt2_Fecal incontinence gt2_Fever gt2_Fibrillation gt2_Fibrocystic breast changes gt2_Fibromyalgia gt2_Flatulence gt2_Floater gt2_Focal seizure gt2_Folate deficiency gt2_Food craving gt2_Food intolerance gt2_Frequent urination gt2_Gastroesophageal reflux disease gt2_Gastroparesis gt2_Generalized anxiety disorder gt2_Generalized tonic–clonic seizure gt2_Genital wart gt2_Gingival recession gt2_Gingivitis gt2_Globus pharyngis gt2_Goitre gt2_Gout gt2_Grandiosity gt2_Granuloma gt2_Guilt gt2_Hair loss gt2_Halitosis gt2_Hay fever gt2_Headache gt2_Heart arrhythmia gt2_Heart murmur gt2_Heartburn gt2_Hematochezia gt2_Hematoma gt2_Hematuria gt2_Hemolysis gt2_Hemoptysis gt2_Hemorrhoids gt2_Hepatic encephalopathy gt2_Hepatitis gt2_Hepatotoxicity gt2_Hiccup gt2_Hip pain gt2_Hives gt2_Hot flash gt2_Hydrocephalus gt2_Hypercalcaemia gt2_Hypercapnia gt2_Hypercholesterolemia gt2_Hyperemesis gravidarum gt2_Hyperglycemia gt2_Hyperkalemia gt2_Hyperlipidemia gt2_Hypermobility gt2_Hyperpigmentation gt2_Hypersomnia gt2_Hypertension gt2_Hyperthermia gt2_Hyperthyroidism gt2_Hypertriglyceridemia gt2_Hypertrophy gt2_Hyperventilation gt2_Hypocalcaemia gt2_Hypochondriasis gt2_Hypoglycemia gt2_Hypogonadism gt2_Hypokalemia gt2_Hypomania gt2_Hyponatremia gt2_Hypotension gt2_Hypothyroidism gt2_Hypoxemia gt2_Hypoxia gt2_Impetigo gt2_Implantation bleeding gt2_Impulsivity gt2_Indigestion gt2_Infection gt2_Inflammation gt2_Inflammatory bowel disease gt2_Ingrown hair gt2_Insomnia gt2_Insulin resistance gt2_Intermenstrual bleeding gt2_Intracranial pressure gt2_Iron deficiency gt2_Irregular menstruation gt2_Itch gt2_Jaundice gt2_Kidney failure gt2_Kidney stone gt2_Knee Pain gt2_Kyphosis gt2_Lactose intolerance gt2_Laryngitis gt2_Leg cramps gt2_Lesion gt2_Leukorrhea gt2_Lightheadedness gt2_Low back pain gt2_Low-grade fever gt2_Lymphedema gt2_Major depressive disorder gt2_Malabsorption gt2_Male infertility gt2_Manic Disorder gt2_Melasma gt2_Melena gt2_Meningitis gt2_Menorrhagia gt2_Middle back pain gt2_Migraine gt2_Milium gt2_Mitral insufficiency gt2_Mood disorder gt2_Mood swing gt2_Morning sickness gt2_Motion sickness gt2_Mouth ulcer gt2_Muscle atrophy gt2_Muscle weakness gt2_Myalgia gt2_Mydriasis gt2_Myocardial infarction gt2_Myoclonus gt2_Nasal congestion gt2_Nasal polyp gt2_Nausea gt2_Neck mass gt2_Neck pain gt2_Neonatal jaundice gt2_Nerve injury gt2_Neuralgia gt2_Neutropenia gt2_Night sweats gt2_Night terror gt2_Nocturnal enuresis gt2_Nodule gt2_Nosebleed gt2_Nystagmus gt2_Obesity gt2_Onychorrhexis gt2_Oral candidiasis gt2_Orthostatic hypotension gt2_Osteopenia gt2_Osteophyte gt2_Osteoporosis gt2_Otitis gt2_Otitis externa gt2_Otitis media gt2_Pain gt2_Palpitations gt2_Panic attack gt2_Papule gt2_Paranoia gt2_Paresthesia gt2_Pelvic inflammatory disease gt2_Pericarditis gt2_Periodontal disease gt2_Periorbital puffiness gt2_Peripheral neuropathy gt2_Perspiration gt2_Petechia gt2_Phlegm gt2_Photodermatitis gt2_Photophobia gt2_Photopsia gt2_Pleural effusion gt2_Pleurisy gt2_Pneumonia gt2_Podalgia gt2_Polycythemia gt2_Polydipsia gt2_Polyneuropathy gt2_Polyuria gt2_Poor posture gt2_Post-nasal drip gt2_Postural orthostatic tachycardia syndrome gt2_Prediabetes gt2_Proteinuria gt2_Pruritus ani gt2_Psychosis gt2_Ptosis gt2_Pulmonary edema gt2_Pulmonary hypertension gt2_Purpura gt2_Pus gt2_Pyelonephritis gt2_Radiculopathy gt2_Rectal pain gt2_Rectal prolapse gt2_Red eye gt2_Renal colic gt2_Restless legs syndrome gt2_Rheum gt2_Rhinitis gt2_Rhinorrhea gt2_Rosacea gt2_Round ligament pain gt2_Rumination gt2_Scar gt2_Sciatica gt2_Scoliosis gt2_Seborrheic dermatitis gt2_Self-harm gt2_Sensitivity to sound gt2_Sexual dysfunction gt2_Shallow breathing gt2_Sharp pain gt2_Shivering gt2_Shortness of breath gt2_Shyness gt2_Sinusitis gt2_Skin condition gt2_Skin rash gt2_Skin tag gt2_Skin ulcer gt2_Sleep apnea gt2_Sleep deprivation gt2_Sleep disorder gt2_Snoring gt2_Sore throat gt2_Spasticity gt2_Splenomegaly gt2_Sputum gt2_Stomach rumble gt2_Strabismus gt2_Stretch marks gt2_Stridor gt2_Stroke gt2_Stuttering gt2_Subdural hematoma gt2_Suicidal ideation gt2_Swelling gt2_Swollen feet gt2_Swollen lymph nodes gt2_Syncope gt2_Tachycardia gt2_Tachypnea gt2_Telangiectasia gt2_Tenderness gt2_Testicular pain gt2_Throat irritation gt2_Thrombocytopenia gt2_Thyroid nodule gt2_Tic gt2_Tinnitus gt2_Tonsillitis gt2_Toothache gt2_Tremor gt2_Trichoptilosis gt2_Tumor gt2_Type 2 diabetes gt2_Unconsciousness gt2_Underweight gt2_Upper respiratory tract infection gt2_Urethritis gt2_Urinary incontinence gt2_Urinary tract infection gt2_Urinary urgency gt2_Uterine contraction gt2_Vaginal bleeding gt2_Vaginal discharge gt2_Vaginitis gt2_Varicose veins gt2_Vasculitis gt2_Ventricular fibrillation gt2_Ventricular tachycardia gt2_Vertigo gt2_Viral pneumonia gt2_Visual acuity gt2_Vomiting gt2_Wart gt2_Water retention gt2_Weakness gt2_Weight gain gt2_Wheeze gt2_Xeroderma gt2_Xerostomia gt2_Yawn gt2_hyperhidrosis gt2_pancreatitis neighbor_Alabama neighbor_Alaska neighbor_Arizona neighbor_Arkansas neighbor_California neighbor_Colorado neighbor_Connecticut neighbor_Delaware neighbor_Florida neighbor_Georgia neighbor_Hawaii neighbor_Idaho neighbor_Illinois neighbor_Indiana neighbor_Iowa neighbor_Kansas neighbor_Kentucky neighbor_Louisiana neighbor_Maine neighbor_Maryland neighbor_Massachusetts neighbor_Michigan neighbor_Minnesota neighbor_Missouri neighbor_Montana neighbor_Nebraska neighbor_Nevada neighbor_New Hampshire neighbor_New Jersey neighbor_New Mexico neighbor_New York neighbor_North Carolina neighbor_North Dakota neighbor_Ohio neighbor_Oklahoma neighbor_Oregon neighbor_Pennsylvania neighbor_Rhode Island neighbor_South Carolina neighbor_South Dakota neighbor_Tennessee neighbor_Texas neighbor_Utah neighbor_Vermont neighbor_Virginia neighbor_Washington neighbor_West Virginia neighbor_Wisconsin neighbor_Wyoming neighbor_Mississippi State_Name_Alabama State_Name_Alaska State_Name_Arizona State_Name_Arkansas State_Name_California State_Name_Colorado State_Name_Connecticut State_Name_Delaware State_Name_Florida State_Name_Georgia State_Name_Hawaii State_Name_Idaho State_Name_Illinois State_Name_Indiana State_Name_Iowa State_Name_Kansas State_Name_Kentucky State_Name_Louisiana State_Name_Maine State_Name_Maryland State_Name_Massachusetts State_Name_Michigan State_Name_Minnesota State_Name_Mississippi State_Name_Missouri State_Name_Montana State_Name_Nebraska State_Name_Nevada State_Name_New Hampshire State_Name_New Jersey State_Name_New Mexico State_Name_New York State_Name_North Carolina State_Name_North Dakota State_Name_Ohio State_Name_Oklahoma State_Name_Oregon State_Name_Pennsylvania State_Name_Rhode Island State_Name_South Carolina State_Name_South Dakota State_Name_Tennessee State_Name_Texas State_Name_Utah State_Name_Vermont State_Name_Virginia State_Name_Washington State_Name_West Virginia State_Name_Wisconsin State_Name_Wyoming
0 datetime64[ns] float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8 uint8
Number of NAs per column of merged file with 50 States after cleaning:
date JHU_cases JHU_deaths JHU_hospitalizations up2date gt_after covid vaccine gt_side effects of vaccine gt_effects of covid vaccine gt_covid gt_how long does covid last gt_anosmia gt_loss smell gt_covid-19 gt_loss taste gt_loss of smell gt_chest pain gt_covid symptoms gt_sars-cov 2 gt_chest tightness gt_covid nhs gt_quarantine gt_covid-19 who gt_sars-cov-2 gt_feeling exhausted gt_nose bleed gt_feeling tired gt_joints aching gt_fever gt2_Abdominal obesity gt2_Abdominal pain gt2_Acne gt2_Actinic keratosis gt2_Acute bronchitis gt2_Adrenal crisis gt2_Ageusia gt2_Alcoholism gt2_Allergic conjunctivitis gt2_Allergy gt2_Amblyopia gt2_Amenorrhea gt2_Amnesia gt2_Anal fissure gt2_Anaphylaxis gt2_Anemia gt2_Angina pectoris gt2_Angioedema gt2_Angular cheilitis gt2_Anosmia gt2_Anxiety gt2_Aphasia gt2_Aphonia gt2_Apnea gt2_Arthralgia gt2_Arthritis gt2_Ascites gt2_Asperger syndrome gt2_Asphyxia gt2_Asthma gt2_Astigmatism gt2_Ataxia gt2_Atheroma gt2_Attention deficit hyperactivity disorder gt2_Auditory hallucination gt2_Autoimmune disease gt2_Avoidant personality disorder gt2_Back pain gt2_Bacterial vaginosis gt2_Balance disorder gt2_Beau's lines gt2_Bell's palsy gt2_Biliary colic gt2_Binge eating gt2_Bleeding gt2_Bleeding on probing gt2_Blepharospasm gt2_Bloating gt2_Blood in stool gt2_Blurred vision gt2_Blushing gt2_Boil gt2_Bone fracture gt2_Bone tumor gt2_Bowel obstruction gt2_Bradycardia gt2_Braxton Hicks contractions gt2_Breakthrough bleeding gt2_Breast pain gt2_Bronchitis gt2_Bruise gt2_Bruxism gt2_Bunion gt2_Burn gt2_Burning Chest Pain gt2_Burning mouth syndrome gt2_Candidiasis gt2_Canker sore gt2_Cardiac arrest gt2_Carpal tunnel syndrome gt2_Cataplexy gt2_Cataract gt2_Chancre gt2_Cheilitis gt2_Chest pain gt2_Chills gt2_Chorea gt2_Chronic pain gt2_Cirrhosis gt2_Cleft lip and cleft palate gt2_Clouding of consciousness gt2_Cluster headache gt2_Colitis gt2_Coma gt2_Common cold gt2_Compulsive behavior gt2_Compulsive hoarding gt2_Confusion gt2_Congenital heart defect gt2_Conjunctivitis gt2_Constipation gt2_Convulsion gt2_Cough gt2_Crackles gt2_Cramp gt2_Crepitus gt2_Croup gt2_Cyanosis gt2_Dandruff gt2_Delayed onset muscle soreness gt2_Dementia gt2_Dentin hypersensitivity gt2_Depersonalization gt2_Depression gt2_Dermatitis gt2_Desquamation gt2_Developmental disability gt2_Diabetes gt2_Diabetic ketoacidosis gt2_Diarrhea gt2_Dizziness gt2_Dry eye syndrome gt2_Dysautonomia gt2_Dysgeusia gt2_Dysmenorrhea gt2_Dyspareunia gt2_Dysphagia gt2_Dysphoria gt2_Dystonia gt2_Dysuria gt2_Ear pain gt2_Eczema gt2_Edema gt2_Encephalitis gt2_Encephalopathy gt2_Epidermoid cyst gt2_Epilepsy gt2_Epiphora gt2_Erectile dysfunction gt2_Erythema gt2_Erythema chronicum migrans gt2_Esophagitis gt2_Excessive daytime sleepiness gt2_Eye pain gt2_Eye strain gt2_Facial nerve paralysis gt2_Facial swelling gt2_Fasciculation gt2_Fatigue gt2_Fatty liver disease gt2_Fecal incontinence gt2_Fever gt2_Fibrillation gt2_Fibrocystic breast changes gt2_Fibromyalgia gt2_Flatulence gt2_Floater gt2_Focal seizure gt2_Folate deficiency gt2_Food craving gt2_Food intolerance gt2_Frequent urination gt2_Gastroesophageal reflux disease gt2_Gastroparesis gt2_Generalized anxiety disorder gt2_Generalized tonic–clonic seizure gt2_Genital wart gt2_Gingival recession gt2_Gingivitis gt2_Globus pharyngis gt2_Goitre gt2_Gout gt2_Grandiosity gt2_Granuloma gt2_Guilt gt2_Hair loss gt2_Halitosis gt2_Hay fever gt2_Headache gt2_Heart arrhythmia gt2_Heart murmur gt2_Heartburn gt2_Hematochezia gt2_Hematoma gt2_Hematuria gt2_Hemolysis gt2_Hemoptysis gt2_Hemorrhoids gt2_Hepatic encephalopathy gt2_Hepatitis gt2_Hepatotoxicity gt2_Hiccup gt2_Hip pain gt2_Hives gt2_Hot flash gt2_Hydrocephalus gt2_Hypercalcaemia gt2_Hypercapnia gt2_Hypercholesterolemia gt2_Hyperemesis gravidarum gt2_Hyperglycemia gt2_Hyperkalemia gt2_Hyperlipidemia gt2_Hypermobility gt2_Hyperpigmentation gt2_Hypersomnia gt2_Hypertension gt2_Hyperthermia gt2_Hyperthyroidism gt2_Hypertriglyceridemia gt2_Hypertrophy gt2_Hyperventilation gt2_Hypocalcaemia gt2_Hypochondriasis gt2_Hypoglycemia gt2_Hypogonadism gt2_Hypokalemia gt2_Hypomania gt2_Hyponatremia gt2_Hypotension gt2_Hypothyroidism gt2_Hypoxemia gt2_Hypoxia gt2_Impetigo gt2_Implantation bleeding gt2_Impulsivity gt2_Indigestion gt2_Infection gt2_Inflammation gt2_Inflammatory bowel disease gt2_Ingrown hair gt2_Insomnia gt2_Insulin resistance gt2_Intermenstrual bleeding gt2_Intracranial pressure gt2_Iron deficiency gt2_Irregular menstruation gt2_Itch gt2_Jaundice gt2_Kidney failure gt2_Kidney stone gt2_Knee Pain gt2_Kyphosis gt2_Lactose intolerance gt2_Laryngitis gt2_Leg cramps gt2_Lesion gt2_Leukorrhea gt2_Lightheadedness gt2_Low back pain gt2_Low-grade fever gt2_Lymphedema gt2_Major depressive disorder gt2_Malabsorption gt2_Male infertility gt2_Manic Disorder gt2_Melasma gt2_Melena gt2_Meningitis gt2_Menorrhagia gt2_Middle back pain gt2_Migraine gt2_Milium gt2_Mitral insufficiency gt2_Mood disorder gt2_Mood swing gt2_Morning sickness gt2_Motion sickness gt2_Mouth ulcer gt2_Muscle atrophy gt2_Muscle weakness gt2_Myalgia gt2_Mydriasis gt2_Myocardial infarction gt2_Myoclonus gt2_Nasal congestion gt2_Nasal polyp gt2_Nausea gt2_Neck mass gt2_Neck pain gt2_Neonatal jaundice gt2_Nerve injury gt2_Neuralgia gt2_Neutropenia gt2_Night sweats gt2_Night terror gt2_Nocturnal enuresis gt2_Nodule gt2_Nosebleed gt2_Nystagmus gt2_Obesity gt2_Onychorrhexis gt2_Oral candidiasis gt2_Orthostatic hypotension gt2_Osteopenia gt2_Osteophyte gt2_Osteoporosis gt2_Otitis gt2_Otitis externa gt2_Otitis media gt2_Pain gt2_Palpitations gt2_Panic attack gt2_Papule gt2_Paranoia gt2_Paresthesia gt2_Pelvic inflammatory disease gt2_Pericarditis gt2_Periodontal disease gt2_Periorbital puffiness gt2_Peripheral neuropathy gt2_Perspiration gt2_Petechia gt2_Phlegm gt2_Photodermatitis gt2_Photophobia gt2_Photopsia gt2_Pleural effusion gt2_Pleurisy gt2_Pneumonia gt2_Podalgia gt2_Polycythemia gt2_Polydipsia gt2_Polyneuropathy gt2_Polyuria gt2_Poor posture gt2_Post-nasal drip gt2_Postural orthostatic tachycardia syndrome gt2_Prediabetes gt2_Proteinuria gt2_Pruritus ani gt2_Psychosis gt2_Ptosis gt2_Pulmonary edema gt2_Pulmonary hypertension gt2_Purpura gt2_Pus gt2_Pyelonephritis gt2_Radiculopathy gt2_Rectal pain gt2_Rectal prolapse gt2_Red eye gt2_Renal colic gt2_Restless legs syndrome gt2_Rheum gt2_Rhinitis gt2_Rhinorrhea gt2_Rosacea gt2_Round ligament pain gt2_Rumination gt2_Scar gt2_Sciatica gt2_Scoliosis gt2_Seborrheic dermatitis gt2_Self-harm gt2_Sensitivity to sound gt2_Sexual dysfunction gt2_Shallow breathing gt2_Sharp pain gt2_Shivering gt2_Shortness of breath gt2_Shyness gt2_Sinusitis gt2_Skin condition gt2_Skin rash gt2_Skin tag gt2_Skin ulcer gt2_Sleep apnea gt2_Sleep deprivation gt2_Sleep disorder gt2_Snoring gt2_Sore throat gt2_Spasticity gt2_Splenomegaly gt2_Sputum gt2_Stomach rumble gt2_Strabismus gt2_Stretch marks gt2_Stridor gt2_Stroke gt2_Stuttering gt2_Subdural hematoma gt2_Suicidal ideation gt2_Swelling gt2_Swollen feet gt2_Swollen lymph nodes gt2_Syncope gt2_Tachycardia gt2_Tachypnea gt2_Telangiectasia gt2_Tenderness gt2_Testicular pain gt2_Throat irritation gt2_Thrombocytopenia gt2_Thyroid nodule gt2_Tic gt2_Tinnitus gt2_Tonsillitis gt2_Toothache gt2_Tremor gt2_Trichoptilosis gt2_Tumor gt2_Type 2 diabetes gt2_Unconsciousness gt2_Underweight gt2_Upper respiratory tract infection gt2_Urethritis gt2_Urinary incontinence gt2_Urinary tract infection gt2_Urinary urgency gt2_Uterine contraction gt2_Vaginal bleeding gt2_Vaginal discharge gt2_Vaginitis gt2_Varicose veins gt2_Vasculitis gt2_Ventricular fibrillation gt2_Ventricular tachycardia gt2_Vertigo gt2_Viral pneumonia gt2_Visual acuity gt2_Vomiting gt2_Wart gt2_Water retention gt2_Weakness gt2_Weight gain gt2_Wheeze gt2_Xeroderma gt2_Xerostomia gt2_Yawn gt2_hyperhidrosis gt2_pancreatitis neighbor_Alabama neighbor_Alaska neighbor_Arizona neighbor_Arkansas neighbor_California neighbor_Colorado neighbor_Connecticut neighbor_Delaware neighbor_Florida neighbor_Georgia neighbor_Hawaii neighbor_Idaho neighbor_Illinois neighbor_Indiana neighbor_Iowa neighbor_Kansas neighbor_Kentucky neighbor_Louisiana neighbor_Maine neighbor_Maryland neighbor_Massachusetts neighbor_Michigan neighbor_Minnesota neighbor_Missouri neighbor_Montana neighbor_Nebraska neighbor_Nevada neighbor_New Hampshire neighbor_New Jersey neighbor_New Mexico neighbor_New York neighbor_North Carolina neighbor_North Dakota neighbor_Ohio neighbor_Oklahoma neighbor_Oregon neighbor_Pennsylvania neighbor_Rhode Island neighbor_South Carolina neighbor_South Dakota neighbor_Tennessee neighbor_Texas neighbor_Utah neighbor_Vermont neighbor_Virginia neighbor_Washington neighbor_West Virginia neighbor_Wisconsin neighbor_Wyoming neighbor_Mississippi State_Name_Alabama State_Name_Alaska State_Name_Arizona State_Name_Arkansas State_Name_California State_Name_Colorado State_Name_Connecticut State_Name_Delaware State_Name_Florida State_Name_Georgia State_Name_Hawaii State_Name_Idaho State_Name_Illinois State_Name_Indiana State_Name_Iowa State_Name_Kansas State_Name_Kentucky State_Name_Louisiana State_Name_Maine State_Name_Maryland State_Name_Massachusetts State_Name_Michigan State_Name_Minnesota State_Name_Mississippi State_Name_Missouri State_Name_Montana State_Name_Nebraska State_Name_Nevada State_Name_New Hampshire State_Name_New Jersey State_Name_New Mexico State_Name_New York State_Name_North Carolina State_Name_North Dakota State_Name_Ohio State_Name_Oklahoma State_Name_Oregon State_Name_Pennsylvania State_Name_Rhode Island State_Name_South Carolina State_Name_South Dakota State_Name_Tennessee State_Name_Texas State_Name_Utah State_Name_Vermont State_Name_Virginia State_Name_Washington State_Name_West Virginia State_Name_Wisconsin State_Name_Wyoming
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
In [ ]:
##General variables
# List variables
var_inc=list(df_f.columns)
del var_inc[0] #delete date as variable

# List state names
state_names=["State_Name_"+s for s in file_names]

# Sort file names
#file_names.sort()


##Function for non-normalized dataframe of the selected state

def sel_state(state):    
    # Index of selected state
    sel_state=file_names.index(state)

    # Dataset selected state
    df_nnormal=df_f.copy()
    df_state=df_nnormal[df_nnormal[state_names[sel_state]]==1]
     
    return sel_state, df_state

## Run function for getting datasets of the selected state
info_sel_state = sel_state("New York")

## Print shape of file
print("Shape of non-normalized file of selected state:", info_sel_state[1].shape)
Shape of non-normalized file of selected state: (744, 550)

4.4. Train-test split and normalization (per State) ¶

Return to contents

Before starting the modelling, we created a function to obtain the data per State. This allowed us to easily run models for one State (New York) and then switch to other State (California) without unnecesarily repeating chunks of code.

We also created a function for:

  • splitting the data in train and test (as it is time series data, we considered the first 75% of the data ordered by date as train data, and the last 25% as test data),
  • normalizing the train and test data for the predictors.

This avoided repeating this chunck of code before each model.

We did not include in the function the normalization of y, as we needed to apply invert_transform to return the y data to its original state for the graphs.

In [ ]:
def train_test(state):    
    # Dataset selected state
    info_sel_state=sel_state(state)
    df_state=info_sel_state[1]
    

    # Exclude state name and date columns
    df_state.drop(state_names, axis=1, inplace=True)
    df_date=df_state["date"]
    df_state.drop("date", axis=1, inplace=True)
    col_names=list(df_state.columns)
    del col_names[0]

    #Exclude y
    df_y=df_state[["JHU_cases"]]
    df_state.drop("JHU_cases", axis=1, inplace=True)

    #Create train data (75%) and test data (25%)
    data_75=len(df_state)*3//4
    xtrain=df_state[:data_75]
    ytrain=df_y[:data_75]

    xtest=df_state[data_75:]
    ytest=df_y[data_75:]

    # Normalize xtrain dataframe
    scaler_x = MinMaxScaler(feature_range=(0, 1))
    scaled_xtrain = scaler_x.fit_transform(xtrain)
    scaled_df_xtrain = pd.DataFrame(scaled_xtrain)
    scaled_df_xtrain.columns=col_names

    #Normalized xtest dataframe
    scaled_xtest=scaler_x.transform(xtest)
    scaled_df_xtest = pd.DataFrame(scaled_xtest)
    scaled_df_xtest.columns=col_names

    # Add date column to normalized datasets
    df_date_xtrain=df_date[:data_75]
    df_normal_xtrain=pd.merge(scaled_df_xtrain.reset_index(drop=True), df_date_xtrain.reset_index(drop=True),left_index=True, right_index=True)
    
    df_date_xtest=df_date[data_75:]
    df_normal_xtest=pd.merge(scaled_df_xtest.reset_index(drop=True), df_date_xtest.reset_index(drop=True),left_index=True, right_index=True)

    # Add y column to normalized datasets
    df_normal_xtrain=pd.merge(ytrain.reset_index(drop=True),df_normal_xtrain.reset_index(drop=True), left_index=True, right_index=True)
    df_normal_xtrain['date'] = pd.to_datetime(df_normal_xtrain['date'], dayfirst=True)

    df_normal_xtest=pd.merge(ytest.reset_index(drop=True),df_normal_xtest.reset_index(drop=True), left_index=True, right_index=True)
    df_normal_xtest['date'] = pd.to_datetime(df_normal_xtest['date'], dayfirst=True)

    #Reshape x data
    x_train=np.array(scaled_df_xtrain)
    x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))

    x_test=np.array(scaled_df_xtest)
    x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))

    return x_train, ytrain["JHU_cases"], x_test, ytest["JHU_cases"], df_normal_xtrain, df_normal_xtest

## Run function for getting datasets of the selected state
info_train_test = train_test("New York")

## Print shape of files
print("Example of dataset created for one selected state (New York):\n")
print("Shape of x_train of selected state:", info_train_test[0].shape)
print("Shape of y_train of selected state:", info_train_test[1].shape)
print("Shape of x_test of selected state:", info_train_test[2].shape)
print("Shape of y_test of selected state:", info_train_test[3].shape)
Example of dataset created for one selected state (New York):

Shape of x_train of selected state: (558, 498, 1)
Shape of y_train of selected state: (558,)
Shape of x_test of selected state: (186, 498, 1)
Shape of y_test of selected state: (186,)

4.5. Time series baseline models¶

Return to contents

4.5.1 New York¶

Return to contents

a) Simple Moving Average Models¶

Return to contents

Below is the Simple Moving Average smoothing for the NY State as a baseline model. Simple Moving Average (SMA) uses a sliding window to take the average over a set number of time periods. It is an equally weighted mean of the previous n data [1]

In [ ]:
#Selected state NY and call variables
info_sel_state = sel_state("New York")
info_train_test = train_test("New York")
ny_df=info_train_test[4] # This is a DF of xtrain with the information pertaining to only the state of NY
ytrain=info_train_test[1]

#Simple moving average for JHU Cases (index 0) for a size of 3 and create a new column to store it
ny_df['pandas_SMA_3'] = ny_df.iloc[:,0].rolling(window=3).mean() #Note 0 is the column for JHU_cases    

#Simple moving average for a size of 20 and create a new column to store it
#Using the clean implementation
ny_df['pandas_SMA_20'] = ny_df.iloc[:,0].rolling(window=20).mean()

#Doing the plots
plt.figure(figsize=(12,6))
plt.title("Simple Moving Averages to predict JHU Covid Cases in New York")
plt.plot(ny_df['date'],ny_df['JHU_cases'],label='Covid Cases')
plt.plot(ny_df['date'],ny_df['pandas_SMA_3'],label='SMA 3 Days')
plt.plot(ny_df['date'],ny_df['pandas_SMA_20'],label='SMA 20 Days')
plt.xlabel("Date")
plt.ylabel("JHU cases")
plt.legend(loc=2)
plt.show()

Another approach is Cumulative Moving Average (CMA): unlike simple moving average which drops the oldest observation as the new one gets added, cumulative moving average considers all prior observations [2].

In [ ]:
# Cumulative Moving Average for JHU Cases (index 0) for 4 min periods and create a new column to store it
ny_df['CMA_4'] = ny_df['JHU_cases'].expanding(min_periods=4).mean()

plt.figure(figsize=(12,6))
plt.title("Cumulative Moving Average to predict JHU Covid Cases in New York")
plt.plot(ny_df['date'],ny_df['JHU_cases'],label='Covid Cases')
plt.plot(ny_df['date'],ny_df['CMA_4'],label='CMA with 4 periods')
plt.xlabel("Date")
plt.ylabel("JHU cases")
plt.legend(loc=2)
plt.show()

Exponential Moving Average (EMA): unlike SMA and CMA, exponential moving average gives more weight to the recent values and as a result of which, it can be a better model or better capture the movement of trends in a faster way. EMA's reaction is directly proportional to the pattern of the data. Since EMAs give a higher weight on recent data than on older data, they are more responsive to the latest changes as compared to SMAs, which makes the results from EMAs more timely and hence EMA is preferred over other techniques [3].

In [ ]:
ny_df['EMA'] = ny_df.iloc[:,0].ewm(span=40,adjust=False).mean()

plt.figure(figsize=(12,6))
plt.title("Exponential Moving Average to predict JHU Covid Cases in New York")
plt.plot(ny_df['date'],ny_df['JHU_cases'],label='Covid Cases')
plt.plot(ny_df['date'],ny_df['EMA'],label='EMA')
plt.xlabel("Date")
plt.ylabel("JHU cases")
plt.legend(loc=2)
plt.show()

The way to determine the best moving average model is empirical and no formal method is available. In our case, the Moving Average models offer little prediction ability and do not take advantage of any predictors.

We will use MAPE and MSE to compare the different models for the State of New York.

1) Mean Absolute Percentage Error (MAPE): expresses accuracy as a percentage of the error. Because this number is a percentage, it can be easier to understand than the other statistics. For example, if the MAPE is 5, on average, the forecast is off by 5%.

2) Mean Squared Error (MSE): measures the average of error squares i.e. the average squared difference between the estimated values and true value.

These statistics are not very informative by themselves, but we can use them to compare the fits obtained by using different methods. For both measures, smaller values usually indicate a better fitting model [4].

In [ ]:
def return_metrics(input_df):
    input_df = input_df.fillna(0) #replace nan that appeared in the first poisitions of SMA
    
    MAPE_SMA3=mean_absolute_error(input_df['JHU_cases'].tolist(), input_df['pandas_SMA_3'].tolist())
    MAPE_SMA20=mean_absolute_error(input_df['JHU_cases'].tolist(), input_df['pandas_SMA_20'].tolist())
    MAPE_CMA4=mean_absolute_error(input_df['JHU_cases'].tolist(), input_df['CMA_4'].tolist())
    MAPE_EMA=mean_absolute_error(input_df['JHU_cases'].tolist(), input_df['EMA'].tolist())
    print("-------------------------------------")
    print("Mean Absolute Error (MAE)")
    print("For Simple Moving Average (SMA) size of 3:",MAPE_SMA3)
    print("For Simple Moving Average (SMA) size of 20:",MAPE_SMA20)
    print("For Cumulative Moving Average (CMA) with 4 minimum periods:",MAPE_CMA4)
    print("For Exponential Moving Average (EMA):",MAPE_EMA)
        
    MSE_SMA3=mean_squared_error(input_df['JHU_cases'].tolist(), input_df['pandas_SMA_3'].tolist())
    MSE_SMA20=mean_squared_error(input_df['JHU_cases'].tolist(), input_df['pandas_SMA_20'].tolist())
    MSE_CMA4=mean_squared_error(input_df['JHU_cases'].tolist(), input_df['CMA_4'].tolist())
    MSE_EMA=mean_squared_error(input_df['JHU_cases'].tolist(), input_df['EMA'].tolist())
    print("-------------------------------------")
    print("Mean Squared Error (MSE)")
    print("For Simple Moving Average (SMA) size of 3:",MSE_SMA3)
    print("For Simple Moving Average (SMA) size of 20:",MSE_SMA20)
    print("For Cumulative Moving Average (CMA) with 4 minimum periods:",MSE_CMA4)
    print("For Exponential Moving Average (EMA):",MSE_EMA)
    print("-------------------------------------")
        
    return MAPE_SMA3, MAPE_SMA20, MAPE_CMA4, MAPE_EMA

print("Calculating performance of Moving Averages for Covid Cases in New York")
metrics_ny=return_metrics(ny_df)
# Note that according to https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_absolute_percentage_error.html
# the value of MAPE when some element of the y_true is zero is arbitrarily high because
# of the division by epsilon
Calculating performance of Moving Averages for Covid Cases in New York
-------------------------------------
Mean Absolute Error (MAE)
For Simple Moving Average (SMA) size of 3: 456.59259259259255
For Simple Moving Average (SMA) size of 20: 1035.6623655913977
For Cumulative Moving Average (CMA) with 4 minimum periods: 2907.3488355640297
For Exponential Moving Average (EMA): 1430.9678843660217
-------------------------------------
Mean Squared Error (MSE)
For Simple Moving Average (SMA) size of 3: 1126065.548387097
For Simple Moving Average (SMA) size of 20: 3538284.6018458786
For Cumulative Moving Average (CMA) with 4 minimum periods: 19145351.075307716
For Exponential Moving Average (EMA): 5111197.247293777
-------------------------------------

According to MSE, SMA with a size of 3 samples offers the best MSE.

b) Auto Regressive Moving Average (ARMA) Model¶

Return to contents

In [ ]:
# Augmented Dickey Fuller test (ADF Test) is a common statistical test 
# used to test whether a given Time series is stationary or not. 
# Please refer to reference (5) in the references section for further detail

# Using info New York
info_sel_state = sel_state("New York")
info_train_test=train_test("New York")

# scaler
scaler_y = MinMaxScaler(feature_range=(0, 1))

# normalizing ytrain
ytrain = info_train_test[1].to_numpy().reshape(-1,1)
ytrain = scaler_y.fit_transform(ytrain)
ytrain = ytrain.reshape(1,-1)
ytrain = pd.Series(ytrain[0])


ADF = adfuller(ytrain)   
print(f'ADF Statistic: {round(ADF[0],3)}') 
print(f'p-value: {ADF[1]:.3e}')
ADF Statistic: -2.414
p-value: 1.379e-01

"If the ADF statistic is a large negative number and the p-value is smaller than 0.05, then our series is stationary."

In our case the p-value is greater than 0.05 therefore the data is not stationary and we need to apply transformation.

In order to convert the data to stationary we need to apply transformation, we will be using numpy and will try different differentiations until we get a stationary data.

In [ ]:
# Differentiating ytrain to the first order by using np.diff function
ytrain_diff = np.diff(ytrain, n=1)

ADF_diff = adfuller(ytrain_diff)   
print(f'ADF Statistic: {round(ADF_diff[0],3)}') 
print(f'p-value: {ADF_diff[1]:.3e}')
ADF Statistic: -6.031
p-value: 1.418e-07

Based on the results above we are seeing ADF p-value very close to 0. Therefore we can say that the data is stationary.

In [ ]:
plot_acf(ytrain_diff, lags=40);
plt.tight_layout()
plt.xlabel("Lags")
plt.ylabel("Autocorrelation")
Out[ ]:
Text(18.75, 0.5, 'Autocorrelation')

The graph above shows that lag 1 is the most significant and lags become insignificant compared to lag 1 thereafter. However, we see that certain lags are significant. We understand that detection and infection of covid have a certain lag (multiple days must pass to confirm that a patient has been infected after exposure). Confirmed infection rates on day 1 will be correlated with confirmed infection rates multiple days later.

We can treat the lags as hyperparameters and we will be modeling different orders of ARMA models to see which hyperparameter is performing the best.

In [ ]:
#############
# ARMA VALIDATION PART
#############
# Please refer to references (6) and (7) in the reference section
# for further details regarding the code used in this section


###
#xtrain
###
xtrain=info_train_test[4]
xtrain_copy=xtrain.loc[:, ~xtrain.columns.isin(['date', 'JHU_cases'])]

model_ma_results=[]
models_ma=[]
rmse=[]
# we are specifying the hyper parameters of the MA model below
# we will use 1st, 3rd and 7th lag in the model
rmse_list = [1,3,7]

# Time series cross-validation 

tss = TimeSeriesSplit(n_splits = 4, gap=0)
# loop for hyperparameters
for i in rmse_list:
  # loop for cross validation
  for train_index, val_index in tss.split(ytrain):
      train_y, val_y = ytrain.iloc[train_index], ytrain.iloc[val_index]
      
      # we will also be using the first order AR in this model 
      model_ma = ARMA(train_y, order=(1, i)).fit(disp=False)
      predictions = model_ma.predict(val_y.index.values[0], val_y.index.values[-1])
      true_values = val_y.values
      # appending the rmse to the list
      # we will pick the lag with the smallest rmse 
      rmse.append(math.sqrt(mean_squared_error(true_values, predictions)))

  print(f"RMSE for order {i} is : ", round(np.mean(rmse),4))
RMSE for order 1 is :  0.1367
RMSE for order 3 is :  0.1331
RMSE for order 7 is :  0.1325

We do not observe a significant difference in the RMSE with different orders in the moving average. With test dataset below we decided to pick the order (7) with the smallest RMSE

In [ ]:
# ARMA TEST PART

# normalizing ytest
ytest  = info_train_test[3].to_numpy().reshape(-1,1)
ytest  = scaler_y.fit_transform(ytest)
ytest = ytest.reshape(1,-1)
ytest = pd.Series(ytest[0])

# using the 1st order AR and 7th order MA for test
model_ma = ARMA(train_y, order=(1, 7)).fit(disp=False)

ytest = ytest.reset_index(drop=True)
predictions = model_ma.predict(ytest.index.values[0], ytest.index.values[-1])

true_values = ytest.values
print("The MSE is : " , round(math.sqrt(mean_squared_error(true_values, predictions)),4))
MAE_0_ny=mean_absolute_error(true_values, predictions)
print("The MAE is : " , round(MAE_0_ny,4))
The MSE is :  0.2144
The MAE is :  0.1295
In [ ]:
# The predictions below are the predictions for order 7
# We reshape the predictions then 
# inverse transform and plot the predictions vs. actuals 
predictions = predictions.values
predictions = predictions.reshape(-1, 1)
list_ARMA = scaler_y.inverse_transform(predictions).reshape(1,-1)[0]
#list_ARMA = predictions.tolist()

ytest  = info_train_test[3]

#Transform date to string    
xtest_df=info_train_test[5]
dates=xtest_df["date"]
list_dates=[]
for i in dates:
    date=i.strftime("%d/%m/%Y")
    list_dates.append(date)

plt.figure(figsize=(12,6))
plt.plot(range(len(ytest)), list_ARMA, label="Predicted")
plt.plot(range(len(ytest)),list(ytest), label="Actual")
plt.title(f"AR (1) MA (7) Predictions JHU Covid Cases for {file_names[info_sel_state[0]]}")
plt.xticks(range(len(ytest))[::4], list_dates[::4], rotation ="vertical", fontsize= 8)
plt.xlabel("Date")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()

Baseline model for New York fails to predict the Covid cases accurately. We will be moving onto RMSE in the next sections and we will be comparing the other models with this baseline model.

4.5.2 California¶

Return to contents

a) Simple Moving Average Models¶

Return to contents

In [ ]:
#Selected state California and call variables
info_sel_state = sel_state("California")
info_train_test = train_test("California")
ca_df=info_train_test[4] # This is a DF of xtrain with the information pertaining to only the state of NY
ytrain=info_train_test[1]

#Simple moving average for JHU Cases (index 0) for a size of 3 and create a new column to store it
ca_df['pandas_SMA_3'] = ca_df.iloc[:,0].rolling(window=3).mean() #Note 0 is the column for JHU_cases    

#Simple moving average for a size of 20 and create a new column to store it
#Using the clean implementation
ca_df['pandas_SMA_20'] = ca_df.iloc[:,0].rolling(window=20).mean()

#Doing the plots
plt.figure(figsize=(12,6))
plt.title("Simple Moving Averages to predict JHU Covid Cases in California")
plt.plot(ca_df['date'],ca_df['JHU_cases'],label='Covid Cases')
plt.plot(ca_df['date'],ca_df['pandas_SMA_3'],label='SMA 3 Days')
plt.plot(ca_df['date'],ca_df['pandas_SMA_20'],label='SMA 20 Days')
plt.xlabel("Date")
plt.ylabel("JHU cases")
plt.legend(loc=2)
plt.show()
In [ ]:
# Cumulative Moving Average for JHU Cases (index 0) for 4 min periods and create a new column to store it
ca_df['CMA_4'] = ca_df['JHU_cases'].expanding(min_periods=4).mean()

plt.figure(figsize=(12,6))
plt.title("Cumulative Moving Average to predict JHU Covid Cases in California")
plt.plot(ca_df['date'],ca_df['JHU_cases'],label='Covid Cases')
plt.plot(ca_df['date'],ca_df['CMA_4'],label='CMA with 4 periods')
plt.xlabel("Date")
plt.ylabel("JHU cases")
plt.legend(loc=2)
plt.show()
In [ ]:
ca_df['EMA'] = ca_df.iloc[:,0].ewm(span=40,adjust=False).mean()

plt.figure(figsize=(12,6))
plt.title("Exponential Moving Average to predict JHU Covid Cases in California")
plt.plot(ca_df['date'],ca_df['JHU_cases'],label='Covid Cases')
plt.plot(ca_df['date'],ca_df['EMA'],label='EMA')
plt.xlabel("Date")
plt.ylabel("JHU cases")
plt.legend(loc=2)
plt.show()
In [ ]:
print("Calculating performance of Moving Averages for Covid Cases in California")
metrics_cal=return_metrics(ca_df)

# Note that according to https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_absolute_percentage_error.html
# the value of MAPE when some element of the y_true is zero is arbitrarily high because
# of the division by epsilon
Calculating performance of Moving Averages for Covid Cases in California
-------------------------------------
Mean Absolute Error (MAE)
For Simple Moving Average (SMA) size of 3: 920.7909199522103
For Simple Moving Average (SMA) size of 20: 2080.134677419355
For Cumulative Moving Average (CMA) with 4 minimum periods: 6010.752394787415
For Exponential Moving Average (EMA): 3058.7360460414293
-------------------------------------
Mean Squared Error (MSE)
For Simple Moving Average (SMA) size of 3: 4123623.7256073286
For Simple Moving Average (SMA) size of 20: 17972695.588239245
For Cumulative Moving Average (CMA) with 4 minimum periods: 117571636.7037175
For Exponential Moving Average (EMA): 31312807.78724574
-------------------------------------

Again MSE for SMA size 3 offers the best values. Unfortunately, as described before, these models offer limited predicitive ability.

b) Auto Regressive Moving Average (ARMA) Model¶

Return to contents

In [ ]:
# Augmented Dickey Fuller test (ADF Test) is a common statistical test 
# used to test whether a given Time series is stationary or not. 
# Please refer to reference (5) in the references section for further detail

# ytrain

# Using info New York
info_sel_state = sel_state("California")
info_train_test=train_test("California")

#ytrain=info_train_test[1]

# scaler
scaler_y = MinMaxScaler(feature_range=(0, 1))

# normalizing ytrain
ytrain = info_train_test[1].to_numpy().reshape(-1,1)
ytrain = scaler_y.fit_transform(ytrain)
ytrain = ytrain.reshape(1,-1)
ytrain = pd.Series(ytrain[0])

ADF = adfuller(ytrain)   
print(f'ADF Statistic: {round(ADF[0],3)}') 
print(f'p-value: {ADF[1]:.3e}')
ADF Statistic: -2.732
p-value: 6.863e-02

"If the ADF statistic is a large negative number and the p-value is smaller than 0.05, then our series is stationary."

In our case the p-value is greater than 0.05 therefore the data is not stationary and we need to apply transformation.

In order to convert the data to stationary we need to apply transformation, we will be using numpy and will try different differentiations until we get a stationary data.

In [ ]:
# Differentiating ytrain to the first order by using np.diff function
ytrain_diff = np.diff(ytrain, n=1)

ADF_diff = adfuller(ytrain_diff)   
print(f'ADF Statistic: {round(ADF_diff[0],3)}') 
print(f'p-value: {ADF_diff[1]:.3e}')
ADF Statistic: -4.343
p-value: 3.741e-04

Based on the results above we are seeing ADF p-value very close to 0. Therefore we can say that the data is stationary.

In [ ]:
from statsmodels.graphics.tsaplots import plot_acf
plot_acf(ytrain_diff, lags=40);
plt.tight_layout()
plt.xlabel("Lags")
plt.ylabel("Autocorrelation")
Out[ ]:
Text(18.75, 0.5, 'Autocorrelation')

The graph above shows that lag 1 is the most significant and lags become insignificant compared to lag 1 thereafter. However, we see that certain lags are significant. We understand that detection and infection of covid have a certain lag (multiple days must pass to confirm that a patient has been infected after exposure). Confirmed infection rates on day 1 will be correlated with confirmed infection rates multiple days later.

We can treat the lags as hyperparameters and we will be modeling different orders of ARMA models to see which hyperparameter is performing the best.

In [ ]:
#############
# ARMA VALIDATION PART
#############
# Please refer to references (6) and (7) in the reference section
# for further details regarding the code used in this section


###
#xtrain
###
xtrain=info_train_test[4]
xtrain_copy=xtrain.loc[:, ~xtrain.columns.isin(['date', 'JHU_cases'])]

from statsmodels.tsa.arima_model import ARMA
from sklearn.metrics import mean_squared_error
import math

model_ma_results=[]
models_ma=[]
rmse=[]
# we are specifying the hyper parameters of the MA model below
# we will use 1st, 3rd and 7th lag in the model
rmse_list = [1,4,5,6,7]

# Time series cross-validation 

tss = TimeSeriesSplit(n_splits = 3, gap=0)
# loop for hyperparameters
for i in rmse_list:
  # loop for cross validation
  for train_index, val_index in tss.split(ytrain):
      train_y, val_y = ytrain.iloc[train_index], ytrain.iloc[val_index]
      
      # we will be using AR(0) in this model 
      model_ma = ARMA(train_y, order=(0, i)).fit(disp=False)
      predictions = model_ma.predict(val_y.index.values[0], val_y.index.values[-1])
      true_values = val_y.values
      # appending the rmse to the list
      # we will pick the lag with the smallest rmse 
      rmse.append(math.sqrt(mean_squared_error(true_values, predictions)))

  print(f"RMSE for order {i} is : ", round(np.mean(rmse),4))
RMSE for order 1 is :  0.188
RMSE for order 4 is :  0.1879
RMSE for order 5 is :  0.1878
RMSE for order 6 is :  0.1878
RMSE for order 7 is :  0.1877

We do not observe a significant difference in the RMSE with different orders in the moving average. With test dataset below we decided to pick the order (7) with the smallest RMSE

In [ ]:
# ARMA TEST PART

# normalizing ytest
ytest  = info_train_test[3].to_numpy().reshape(-1,1)
ytest  = scaler_y.fit_transform(ytest)
ytest = ytest.reshape(1,-1)
ytest = pd.Series(ytest[0])

# using the 0th order AR and 7th order MA for test
model_ma = ARMA(train_y, order=(0, 7)).fit(disp=False)

ytest = ytest.reset_index(drop=True)
predictions = model_ma.predict(ytest.index.values[0], ytest.index.values[-1])

true_values = ytest.values
print("The MSE is : " , round(math.sqrt(mean_squared_error(true_values, predictions)),4))
MAE_0_cal=mean_absolute_error(true_values, predictions)
print("The MAE is : " , round(MAE_0_cal,4))
The MSE is :  0.1111
The MAE is :  0.0658
In [ ]:
# The predictions below are the predictions for order 7
# We reshape the predictions then 
# inverse transform and plot the predictions vs. actuals 
predictions = predictions.values
predictions = predictions.reshape(-1, 1)
list_ARMA = scaler_y.inverse_transform(predictions).reshape(1,-1)[0]
#list_ARMA = predictions.tolist()

ytest  = info_train_test[3]

#Transform date to string    
xtest_df=info_train_test[5]
dates=xtest_df["date"]
list_dates=[]
for i in dates:
    date=i.strftime("%d/%m/%Y")
    list_dates.append(date)

plt.figure(figsize=(12,6))
plt.plot(range(len(ytest)), list_ARMA, label="Predicted")
plt.plot(range(len(ytest)),list(ytest), label="Actual")
plt.title(f"AR (0) MA (7) Predictions JHU Covid Cases for {file_names[info_sel_state[0]]}")
plt.xticks(range(len(ytest))[::4], list_dates[::4], rotation ="vertical", fontsize= 8)
plt.xlabel("Date")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()

Baseline model for New York fails to predict the Covid cases accurately. We will be moving onto RMSE in the next sections and we will be comparing the other models with this baseline model.

4.6. RNN models¶

Return to contents

4.6.1 New York¶

Return to contents

a) One-layer RNN model¶

Return to contents

We will be modeling a RNN model for the Covid cases. In the first part of the code we will scale the y-variable (cases) to fit in range 0 and 1.

Then we will be using Keras built in function Timeseriesgenerator to create our train and test datasets.

Our assumption is that Covid cases have a 14 day lag. The code below will predict as such, if x is from 0 to 14 days it will predict the cases on day 15 (y variable). If x is from 1 to 15 days it will predict the cases on day 16.

In [ ]:
tf.keras.backend.clear_session()

# Using info New York
info_sel_state = sel_state("New York")
info_train_test=train_test("New York")

# Scaler
scaler_y = MinMaxScaler(feature_range=(0, 1))

# minmax scaler for y, scaling between 0 and 1
ytrain = info_train_test[1].to_numpy().reshape(-1,1)
ytrain = scaler_y.fit_transform(ytrain)
ytest  = info_train_test[3].to_numpy().reshape(-1,1)
ytest  = scaler_y.fit_transform(ytest)

#Call x variables
xtrain=info_train_test[4]
xtrain.drop(["date", "JHU_cases"], axis=1, inplace=True)
xtest=info_train_test[5].copy()
xtest.drop(["date", "JHU_cases"], axis=1, inplace=True)

# time series generator uses 14 day block to predict the next day 
train_gen = TimeseriesGenerator(xtrain.to_numpy(), ytrain,
                               length=14, sampling_rate=1,  
                                batch_size = 558)

test_gen = TimeseriesGenerator(xtest.to_numpy(), ytest,
                                length=14, sampling_rate=1, 
                               batch_size = 186)

# Below we are creating x and y train and test variable from the generators
x_train, y_train = train_gen[0]
# x_val, y_val = val_gen[0]
x_test, y_test = test_gen[0]

print('x train shape is :', x_train.shape)
print('y train shape is :', y_train.shape)
print('x test shape is :' , x_test.shape)
print('y test shape is :' , y_test.shape)
x train shape is : (544, 14, 498)
y train shape is : (544, 1)
x test shape is : (172, 14, 498)
y test shape is : (172, 1)

The x train shape above is 544, 14, 498. 544 is the number of batches that we can get if we use a 14 day window. And 498 is the number of predictive features (columns) in the original dataset.

Following below we are building the RNN model. The input to our RNN model is the moving window days (14) and the features (498).

In [ ]:
# input dimension below
input_dim = x_train.shape[1:]
n_units = 100

#Create model
model_rnn_input = tf.keras.Input(shape=input_dim)
model_rnn_hidden = tf.keras.layers.SimpleRNN(units=n_units, return_sequences=False)(model_rnn_input)
model_rnn_output = tf.keras.layers.Dense(units=1, activation='linear')(model_rnn_hidden)
model_rnn = tf.keras.Model(inputs=model_rnn_input, outputs=model_rnn_output, name="model_rnn")

#Print the model architecture
print(model_rnn.summary())

#Compile model
model_rnn.compile(optimizer=tf.keras.optimizers.Adam(), loss = 'mse', metrics=['mae'])

#Convert x to tensor
x_train_tf = tf.convert_to_tensor(x_train, np.float32)

#Fit model
history = model_rnn.fit(x_train_tf, y_train, epochs = 50, validation_split=0.1, batch_size = 1, verbose=1)
Model: "model_rnn"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 14, 498)]         0         
                                                                 
 simple_rnn (SimpleRNN)      (None, 100)               59900     
                                                                 
 dense (Dense)               (None, 1)                 101       
                                                                 
=================================================================
Total params: 60,001
Trainable params: 60,001
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/50
489/489 [==============================] - 4s 5ms/step - loss: 0.0375 - mae: 0.1197 - val_loss: 0.0363 - val_mae: 0.1828
Epoch 2/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0071 - mae: 0.0597 - val_loss: 0.0198 - val_mae: 0.1352
Epoch 3/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0070 - mae: 0.0598 - val_loss: 0.0097 - val_mae: 0.0935
Epoch 4/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0053 - mae: 0.0484 - val_loss: 0.0155 - val_mae: 0.1207
Epoch 5/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0038 - mae: 0.0426 - val_loss: 0.0185 - val_mae: 0.1322
Epoch 6/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0045 - mae: 0.0480 - val_loss: 0.0104 - val_mae: 0.0971
Epoch 7/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0040 - mae: 0.0443 - val_loss: 0.0117 - val_mae: 0.1046
Epoch 8/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0050 - mae: 0.0499 - val_loss: 0.0018 - val_mae: 0.0352
Epoch 9/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0036 - mae: 0.0423 - val_loss: 0.0019 - val_mae: 0.0366
Epoch 10/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0035 - mae: 0.0435 - val_loss: 0.0199 - val_mae: 0.1389
Epoch 11/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0053 - mae: 0.0491 - val_loss: 0.0018 - val_mae: 0.0378
Epoch 12/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0038 - mae: 0.0442 - val_loss: 0.0050 - val_mae: 0.0681
Epoch 13/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0034 - mae: 0.0449 - val_loss: 9.0919e-04 - val_mae: 0.0241
Epoch 14/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0040 - mae: 0.0457 - val_loss: 0.0097 - val_mae: 0.0945
Epoch 15/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0037 - mae: 0.0435 - val_loss: 8.1231e-04 - val_mae: 0.0241
Epoch 16/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0024 - mae: 0.0344 - val_loss: 0.0048 - val_mae: 0.0648
Epoch 17/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0047 - mae: 0.0522 - val_loss: 0.0063 - val_mae: 0.0755
Epoch 18/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0023 - mae: 0.0327 - val_loss: 0.0011 - val_mae: 0.0262
Epoch 19/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0027 - mae: 0.0386 - val_loss: 0.0019 - val_mae: 0.0400
Epoch 20/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0024 - mae: 0.0356 - val_loss: 0.0015 - val_mae: 0.0337
Epoch 21/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0021 - mae: 0.0333 - val_loss: 9.5373e-04 - val_mae: 0.0254
Epoch 22/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0027 - mae: 0.0390 - val_loss: 8.8964e-04 - val_mae: 0.0239
Epoch 23/50
489/489 [==============================] - 3s 7ms/step - loss: 0.0019 - mae: 0.0320 - val_loss: 5.5364e-04 - val_mae: 0.0193
Epoch 24/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0030 - mae: 0.0407 - val_loss: 0.0043 - val_mae: 0.0634
Epoch 25/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0023 - mae: 0.0363 - val_loss: 6.7585e-04 - val_mae: 0.0204
Epoch 26/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0033 - mae: 0.0430 - val_loss: 0.0077 - val_mae: 0.0850
Epoch 27/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0028 - mae: 0.0374 - val_loss: 0.0112 - val_mae: 0.1024
Epoch 28/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0038 - mae: 0.0457 - val_loss: 0.0012 - val_mae: 0.0267
Epoch 29/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0034 - mae: 0.0393 - val_loss: 0.0186 - val_mae: 0.1323
Epoch 30/50
489/489 [==============================] - 3s 6ms/step - loss: 0.0029 - mae: 0.0377 - val_loss: 0.0032 - val_mae: 0.0463
Epoch 31/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0033 - mae: 0.0399 - val_loss: 7.6156e-04 - val_mae: 0.0210
Epoch 32/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0015 - mae: 0.0278 - val_loss: 0.0037 - val_mae: 0.0555
Epoch 33/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0020 - mae: 0.0325 - val_loss: 0.0025 - val_mae: 0.0424
Epoch 34/50
489/489 [==============================] - 3s 6ms/step - loss: 0.0020 - mae: 0.0334 - val_loss: 0.0012 - val_mae: 0.0261
Epoch 35/50
489/489 [==============================] - 3s 6ms/step - loss: 0.0025 - mae: 0.0381 - val_loss: 0.0014 - val_mae: 0.0345
Epoch 36/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0027 - mae: 0.0379 - val_loss: 3.4524e-04 - val_mae: 0.0144
Epoch 37/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0018 - mae: 0.0307 - val_loss: 0.0016 - val_mae: 0.0342
Epoch 38/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0017 - mae: 0.0282 - val_loss: 0.0014 - val_mae: 0.0344
Epoch 39/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0033 - mae: 0.0404 - val_loss: 0.0156 - val_mae: 0.1237
Epoch 40/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0017 - mae: 0.0308 - val_loss: 5.9440e-04 - val_mae: 0.0201
Epoch 41/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0019 - mae: 0.0316 - val_loss: 0.0087 - val_mae: 0.0925
Epoch 42/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0021 - mae: 0.0344 - val_loss: 0.0032 - val_mae: 0.0478
Epoch 43/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0012 - mae: 0.0265 - val_loss: 0.0015 - val_mae: 0.0325
Epoch 44/50
489/489 [==============================] - 3s 6ms/step - loss: 0.0031 - mae: 0.0404 - val_loss: 0.0043 - val_mae: 0.0576
Epoch 45/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0023 - mae: 0.0356 - val_loss: 4.1848e-04 - val_mae: 0.0153
Epoch 46/50
489/489 [==============================] - 3s 6ms/step - loss: 0.0016 - mae: 0.0283 - val_loss: 0.0087 - val_mae: 0.0860
Epoch 47/50
489/489 [==============================] - 3s 6ms/step - loss: 0.0024 - mae: 0.0375 - val_loss: 0.0014 - val_mae: 0.0356
Epoch 48/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0024 - mae: 0.0369 - val_loss: 8.4325e-04 - val_mae: 0.0257
Epoch 49/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0018 - mae: 0.0307 - val_loss: 0.0056 - val_mae: 0.0734
Epoch 50/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0027 - mae: 0.0351 - val_loss: 0.0012 - val_mae: 0.0271
In [ ]:
# Plotting the loss and MAE graphs
# Plot training results
epochs=50
fig = plt.figure(figsize=(20,5))
axs = fig.add_subplot(1,3,1)
axs.set_title(f'Loss of one-layer RNN model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["loss","val_loss"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('Loss')
    axs.set_xlabel('epochs')
axs.legend()

axs = fig.add_subplot(1,3,2)
axs.set_title(f'MAE of one-layer RNN model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["mae","val_mae"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('MAE')
    axs.set_xlabel('epochs')
axs.legend()

plt.show()
In [ ]:
# Predicting the y variable

# we are using inverse transform below to convert the minmax scaled y to the original cases
rnn_prediction = scaler_y.inverse_transform(model_rnn.predict(x_test))


#Transform date to string    
xtest_df=info_train_test[5]
dates=xtest_df["date"]
list_dates=[]
for i in dates:
    date=i.strftime("%d/%m/%Y")
    list_dates.append(date)
    
#Plot
plt.figure(figsize=(12,6))
plt.plot(range(len(y_test)), rnn_prediction, label="Predicted")
plt.plot(range(len(y_test)),list(scaler_y.inverse_transform(y_test)), label="Actual")
plt.title(f"Predicted versus actual JHU_cases (one-layer RNN model) for {file_names[info_sel_state[0]]} using test data")
plt.xticks(range(len(ytest))[::4], list_dates[::4], rotation ="vertical", fontsize= 8)
plt.xlabel("Date")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()
In [ ]:
#Calculate mean MAE
mean_MAE=[]
for metric in ["mae","val_mae"]:
    mean_MAE.append(np.mean(history.history[metric]))

MAE_1=round(mean_MAE[0],4)
MAE_val_1=round(mean_MAE[1],4)

print("The MAE of the model is", MAE_1)
print("The validation MAE of the model is", MAE_val_1)
The MAE of the model is 0.0408
The validation MAE of the model is 0.0602

b) Multi-layer RNN model¶

Below we are creating a naive multi layer RNN model.

In [ ]:
# input dimension below
input_dim = x_train.shape[1:]
n_units = 128
n_units2 = 64

#Create model
model_rnn_input = tf.keras.Input(shape=input_dim)
model_rnn_hidden = tf.keras.layers.SimpleRNN(units=n_units, return_sequences=True, dropout=0.2 )(model_rnn_input)
model_rnn_hidden = tf.keras.layers.SimpleRNN(units=n_units2, return_sequences=False, dropout=0.2)(model_rnn_hidden)
model_rnn_output = tf.keras.layers.Dense(units=1, activation='linear')(model_rnn_hidden)
model_rnn = tf.keras.Model(inputs=model_rnn_input, outputs=model_rnn_output, name="model_rnn")


#Print the model architecture
print(model_rnn.summary())

#Compile model
model_rnn.compile(optimizer=tf.keras.optimizers.Adam(), loss = 'mse', metrics=['mae'])

#Convert x to tensor
x_train_tf = tf.convert_to_tensor(x_train, np.float32)

#Fit model
history = model_rnn.fit(x_train_tf, y_train, epochs = 50, validation_split=0.1, batch_size = 1, verbose=1)
Model: "model_rnn"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_2 (InputLayer)        [(None, 14, 498)]         0         
                                                                 
 simple_rnn_1 (SimpleRNN)    (None, 14, 128)           80256     
                                                                 
 simple_rnn_2 (SimpleRNN)    (None, 64)                12352     
                                                                 
 dense_1 (Dense)             (None, 1)                 65        
                                                                 
=================================================================
Total params: 92,673
Trainable params: 92,673
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/50
489/489 [==============================] - 6s 9ms/step - loss: 0.0792 - mae: 0.1850 - val_loss: 0.1080 - val_mae: 0.3281
Epoch 2/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0166 - mae: 0.0988 - val_loss: 0.0021 - val_mae: 0.0421
Epoch 3/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0100 - mae: 0.0734 - val_loss: 0.0062 - val_mae: 0.0768
Epoch 4/50
489/489 [==============================] - 5s 10ms/step - loss: 0.0103 - mae: 0.0730 - val_loss: 0.0494 - val_mae: 0.2205
Epoch 5/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0081 - mae: 0.0649 - val_loss: 0.0239 - val_mae: 0.1532
Epoch 6/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0091 - mae: 0.0678 - val_loss: 0.0047 - val_mae: 0.0670
Epoch 7/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0072 - mae: 0.0590 - val_loss: 8.1806e-04 - val_mae: 0.0247
Epoch 8/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0077 - mae: 0.0638 - val_loss: 0.0134 - val_mae: 0.1129
Epoch 9/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0079 - mae: 0.0601 - val_loss: 0.0292 - val_mae: 0.1700
Epoch 10/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0076 - mae: 0.0623 - val_loss: 0.0045 - val_mae: 0.0576
Epoch 11/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0060 - mae: 0.0534 - val_loss: 0.0175 - val_mae: 0.1221
Epoch 12/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0070 - mae: 0.0587 - val_loss: 0.0068 - val_mae: 0.0813
Epoch 13/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0062 - mae: 0.0584 - val_loss: 2.2526e-04 - val_mae: 0.0096
Epoch 14/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0091 - mae: 0.0657 - val_loss: 5.1515e-04 - val_mae: 0.0193
Epoch 15/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0084 - mae: 0.0658 - val_loss: 0.0035 - val_mae: 0.0556
Epoch 16/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0076 - mae: 0.0598 - val_loss: 0.0081 - val_mae: 0.0848
Epoch 17/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0060 - mae: 0.0520 - val_loss: 0.0073 - val_mae: 0.0842
Epoch 18/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0064 - mae: 0.0568 - val_loss: 5.8328e-04 - val_mae: 0.0223
Epoch 19/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0073 - mae: 0.0611 - val_loss: 0.0086 - val_mae: 0.0844
Epoch 20/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0068 - mae: 0.0575 - val_loss: 0.0072 - val_mae: 0.0811
Epoch 21/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0073 - mae: 0.0588 - val_loss: 0.0112 - val_mae: 0.0989
Epoch 22/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0068 - mae: 0.0592 - val_loss: 0.0113 - val_mae: 0.1016
Epoch 23/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0058 - mae: 0.0532 - val_loss: 0.0017 - val_mae: 0.0359
Epoch 24/50
489/489 [==============================] - 4s 9ms/step - loss: 0.0066 - mae: 0.0584 - val_loss: 0.0031 - val_mae: 0.0445
Epoch 25/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0053 - mae: 0.0502 - val_loss: 0.0127 - val_mae: 0.1117
Epoch 26/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0058 - mae: 0.0550 - val_loss: 0.0188 - val_mae: 0.1363
Epoch 27/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0072 - mae: 0.0600 - val_loss: 0.0114 - val_mae: 0.1016
Epoch 28/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0071 - mae: 0.0598 - val_loss: 0.0035 - val_mae: 0.0478
Epoch 29/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0064 - mae: 0.0557 - val_loss: 8.8842e-04 - val_mae: 0.0244
Epoch 30/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0052 - mae: 0.0502 - val_loss: 0.0082 - val_mae: 0.0889
Epoch 31/50
489/489 [==============================] - 5s 10ms/step - loss: 0.0074 - mae: 0.0605 - val_loss: 0.0064 - val_mae: 0.0723
Epoch 32/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0064 - mae: 0.0530 - val_loss: 0.0021 - val_mae: 0.0402
Epoch 33/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0047 - mae: 0.0458 - val_loss: 0.0022 - val_mae: 0.0431
Epoch 34/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0052 - mae: 0.0497 - val_loss: 0.0027 - val_mae: 0.0472
Epoch 35/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0056 - mae: 0.0525 - val_loss: 0.0118 - val_mae: 0.1080
Epoch 36/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0058 - mae: 0.0557 - val_loss: 0.0036 - val_mae: 0.0545
Epoch 37/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0056 - mae: 0.0503 - val_loss: 0.0073 - val_mae: 0.0801
Epoch 38/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0055 - mae: 0.0527 - val_loss: 0.0020 - val_mae: 0.0428
Epoch 39/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0048 - mae: 0.0485 - val_loss: 0.0105 - val_mae: 0.1017
Epoch 40/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0052 - mae: 0.0478 - val_loss: 0.0016 - val_mae: 0.0360
Epoch 41/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0056 - mae: 0.0518 - val_loss: 0.0080 - val_mae: 0.0875
Epoch 42/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0050 - mae: 0.0484 - val_loss: 0.0361 - val_mae: 0.1644
Epoch 43/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0077 - mae: 0.0603 - val_loss: 2.7831e-04 - val_mae: 0.0140
Epoch 44/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0047 - mae: 0.0462 - val_loss: 0.0087 - val_mae: 0.0879
Epoch 45/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0066 - mae: 0.0558 - val_loss: 0.0159 - val_mae: 0.1234
Epoch 46/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0054 - mae: 0.0493 - val_loss: 5.1348e-04 - val_mae: 0.0191
Epoch 47/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0049 - mae: 0.0482 - val_loss: 0.0012 - val_mae: 0.0284
Epoch 48/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0055 - mae: 0.0522 - val_loss: 0.0025 - val_mae: 0.0464
Epoch 49/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0068 - mae: 0.0593 - val_loss: 0.0308 - val_mae: 0.1746
Epoch 50/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0067 - mae: 0.0610 - val_loss: 9.1908e-04 - val_mae: 0.0254
In [ ]:
# Plotting the loss and MAE graphs
# Plot training results
epochs=50
fig = plt.figure(figsize=(20,5))
axs = fig.add_subplot(1,3,1)
axs.set_title(f'Loss of multi-layer RNN model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["loss","val_loss"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('Loss')
    axs.set_xlabel('epochs')
axs.legend()

axs = fig.add_subplot(1,3,2)
axs.set_title(f'MAE of multi-layer RNN model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["mae","val_mae"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('MAE')
    axs.set_xlabel('epochs')
axs.legend()

plt.show()
In [ ]:
# Predicting the y variable

# we are using inverse transform below to convert the minmax scaled y to the original cases
rnn_prediction = scaler_y.inverse_transform(model_rnn.predict(x_test))

#Transform date to string    
xtest_df=info_train_test[5]
dates=xtest_df["date"]
list_dates=[]
for i in dates:
    date=i.strftime("%d/%m/%Y")
    list_dates.append(date)
    
#Plot
plt.figure(figsize=(12,6))
plt.plot(range(len(y_test)), rnn_prediction, label="Predicted")
plt.plot(range(len(y_test)),list(scaler_y.inverse_transform(y_test)), label="Actual")
plt.title(f"Predicted versus actual JHU_cases (multi-layer RNN model) for {file_names[info_sel_state[0]]} using test data")
plt.xticks(range(len(ytest))[::4], list_dates[::4], rotation ="vertical", fontsize= 8)
plt.xlabel("Date")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()
In [ ]:
#Calculate mean MAE
mean_MAE=[]
for metric in ["mae","val_mae"]:
    mean_MAE.append(np.mean(history.history[metric]))

MAE_2=round(mean_MAE[0],4)
MAE_val_2=round(mean_MAE[1],4)

print("The MAE of the model is", MAE_2)
print("The validation MAE of the model is", MAE_val_2)
The MAE of the model is 0.0601
The validation MAE of the model is 0.0817

4.6.2 California¶

Return to contents

a) One-layer RNN model¶

Return to contents

Below we replicated the steps above for the state of California.

In [ ]:
tf.keras.backend.clear_session()

# Using info New York
info_sel_state = sel_state("California")
info_train_test=train_test("California")

# Scaler
scaler_y = MinMaxScaler(feature_range=(0, 1))

# minmax scaler for y, scaling between 0 and 1
ytrain = info_train_test[1].to_numpy().reshape(-1,1)
ytrain = scaler_y.fit_transform(ytrain)
ytest  = info_train_test[3].to_numpy().reshape(-1,1)
ytest  = scaler_y.fit_transform(ytest)

#Call x variables
xtrain=info_train_test[4]
xtrain.drop(["date", "JHU_cases"], axis=1, inplace=True)
xtest=info_train_test[5].copy()
xtest.drop(["date", "JHU_cases"], axis=1, inplace=True)

# time series generator uses 14 day block to predict the next day 
train_gen = TimeseriesGenerator(xtrain.to_numpy(), ytrain,
                               length=14, sampling_rate=1,  
                                batch_size = 558)

test_gen = TimeseriesGenerator(xtest.to_numpy(), ytest,
                                length=14, sampling_rate=1, 
                               batch_size = 186)

# Below we are creating x and y train and test variable from the generators
x_train, y_train = train_gen[0]
# x_val, y_val = val_gen[0]
x_test, y_test = test_gen[0]

print('x train shape is :', x_train.shape)
print('y train shape is :', y_train.shape)
print('x test shape is :' , x_test.shape)
print('y test shape is :' , y_test.shape)
x train shape is : (544, 14, 498)
y train shape is : (544, 1)
x test shape is : (172, 14, 498)
y test shape is : (172, 1)
In [ ]:
# input dimension below
input_dim = x_train.shape[1:]
n_units = 100

#Create model
model_rnn_input = tf.keras.Input(shape=input_dim)
model_rnn_hidden = tf.keras.layers.SimpleRNN(units=n_units, return_sequences=False)(model_rnn_input)
model_rnn_output = tf.keras.layers.Dense(units=1, activation='linear')(model_rnn_hidden)
model_rnn = tf.keras.Model(inputs=model_rnn_input, outputs=model_rnn_output, name="model_rnn")

#Print the model architecture
print(model_rnn.summary())

#Compile model
model_rnn.compile(optimizer=tf.keras.optimizers.Adam(), loss = 'mse', metrics=['mae'])

#Convert x to tensor
x_train_tf = tf.convert_to_tensor(x_train, np.float32)

#Fit model
history = model_rnn.fit(x_train_tf, y_train, epochs = 50, validation_split=0.1, batch_size = 1, verbose=1)
Model: "model_rnn"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 14, 498)]         0         
                                                                 
 simple_rnn (SimpleRNN)      (None, 100)               59900     
                                                                 
 dense (Dense)               (None, 1)                 101       
                                                                 
=================================================================
Total params: 60,001
Trainable params: 60,001
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0399 - mae: 0.1217 - val_loss: 0.0082 - val_mae: 0.0801
Epoch 2/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0092 - mae: 0.0701 - val_loss: 0.0023 - val_mae: 0.0368
Epoch 3/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0065 - mae: 0.0560 - val_loss: 0.0091 - val_mae: 0.0864
Epoch 4/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0043 - mae: 0.0456 - val_loss: 0.0016 - val_mae: 0.0299
Epoch 5/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0049 - mae: 0.0528 - val_loss: 0.0011 - val_mae: 0.0264
Epoch 6/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0048 - mae: 0.0484 - val_loss: 0.0160 - val_mae: 0.1221
Epoch 7/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0055 - mae: 0.0512 - val_loss: 0.0246 - val_mae: 0.1550
Epoch 8/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0042 - mae: 0.0468 - val_loss: 0.0015 - val_mae: 0.0287
Epoch 9/50
489/489 [==============================] - 2s 4ms/step - loss: 0.0057 - mae: 0.0569 - val_loss: 0.0125 - val_mae: 0.1101
Epoch 10/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0064 - mae: 0.0569 - val_loss: 7.1169e-04 - val_mae: 0.0208
Epoch 11/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0045 - mae: 0.0463 - val_loss: 5.9596e-04 - val_mae: 0.0153
Epoch 12/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0051 - mae: 0.0503 - val_loss: 9.2196e-04 - val_mae: 0.0248
Epoch 13/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0062 - mae: 0.0560 - val_loss: 0.0011 - val_mae: 0.0258
Epoch 14/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0063 - mae: 0.0558 - val_loss: 0.0026 - val_mae: 0.0480
Epoch 15/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0054 - mae: 0.0536 - val_loss: 6.1158e-04 - val_mae: 0.0149
Epoch 16/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0044 - mae: 0.0477 - val_loss: 0.0060 - val_mae: 0.0749
Epoch 17/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0046 - mae: 0.0482 - val_loss: 0.0014 - val_mae: 0.0306
Epoch 18/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0038 - mae: 0.0435 - val_loss: 0.0018 - val_mae: 0.0391
Epoch 19/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0039 - mae: 0.0449 - val_loss: 9.7040e-04 - val_mae: 0.0267
Epoch 20/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0025 - mae: 0.0362 - val_loss: 0.0019 - val_mae: 0.0390
Epoch 21/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0041 - mae: 0.0423 - val_loss: 0.0032 - val_mae: 0.0535
Epoch 22/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0029 - mae: 0.0377 - val_loss: 6.6264e-04 - val_mae: 0.0177
Epoch 23/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0036 - mae: 0.0428 - val_loss: 6.9806e-04 - val_mae: 0.0194
Epoch 24/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0031 - mae: 0.0395 - val_loss: 0.0011 - val_mae: 0.0292
Epoch 25/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0032 - mae: 0.0421 - val_loss: 9.7163e-04 - val_mae: 0.0241
Epoch 26/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0021 - mae: 0.0331 - val_loss: 0.0020 - val_mae: 0.0404
Epoch 27/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0032 - mae: 0.0417 - val_loss: 0.0019 - val_mae: 0.0383
Epoch 28/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0028 - mae: 0.0388 - val_loss: 0.0018 - val_mae: 0.0368
Epoch 29/50
489/489 [==============================] - 3s 5ms/step - loss: 0.0035 - mae: 0.0429 - val_loss: 0.0021 - val_mae: 0.0436
Epoch 30/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0038 - mae: 0.0459 - val_loss: 0.0055 - val_mae: 0.0690
Epoch 31/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0024 - mae: 0.0339 - val_loss: 0.0013 - val_mae: 0.0292
Epoch 32/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0022 - mae: 0.0343 - val_loss: 0.0027 - val_mae: 0.0489
Epoch 33/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0018 - mae: 0.0297 - val_loss: 8.4147e-04 - val_mae: 0.0216
Epoch 34/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0026 - mae: 0.0364 - val_loss: 6.7294e-04 - val_mae: 0.0187
Epoch 35/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0022 - mae: 0.0359 - val_loss: 4.6460e-04 - val_mae: 0.0114
Epoch 36/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0048 - mae: 0.0496 - val_loss: 0.0026 - val_mae: 0.0480
Epoch 37/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0023 - mae: 0.0337 - val_loss: 5.8362e-04 - val_mae: 0.0157
Epoch 38/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0027 - mae: 0.0377 - val_loss: 0.0016 - val_mae: 0.0349
Epoch 39/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0019 - mae: 0.0316 - val_loss: 0.0011 - val_mae: 0.0254
Epoch 40/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0057 - mae: 0.0528 - val_loss: 0.0235 - val_mae: 0.1512
Epoch 41/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0032 - mae: 0.0426 - val_loss: 0.0018 - val_mae: 0.0377
Epoch 42/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0018 - mae: 0.0319 - val_loss: 0.0014 - val_mae: 0.0325
Epoch 43/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0018 - mae: 0.0315 - val_loss: 6.1982e-04 - val_mae: 0.0171
Epoch 44/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0025 - mae: 0.0392 - val_loss: 0.0015 - val_mae: 0.0353
Epoch 45/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0017 - mae: 0.0307 - val_loss: 0.0076 - val_mae: 0.0838
Epoch 46/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0029 - mae: 0.0410 - val_loss: 7.9019e-04 - val_mae: 0.0200
Epoch 47/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0021 - mae: 0.0340 - val_loss: 6.0138e-04 - val_mae: 0.0147
Epoch 48/50
489/489 [==============================] - 2s 4ms/step - loss: 0.0025 - mae: 0.0381 - val_loss: 6.9023e-04 - val_mae: 0.0197
Epoch 49/50
489/489 [==============================] - 2s 5ms/step - loss: 0.0021 - mae: 0.0351 - val_loss: 6.9399e-04 - val_mae: 0.0180
Epoch 50/50
489/489 [==============================] - 2s 4ms/step - loss: 0.0019 - mae: 0.0331 - val_loss: 0.0037 - val_mae: 0.0577
In [ ]:
# Plotting the loss and MAE graphs
# Plot training results
epochs=50
fig = plt.figure(figsize=(20,5))
axs = fig.add_subplot(1,3,1)
axs.set_title(f'Loss of one-layer RNN model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["loss","val_loss"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('Loss')
    axs.set_xlabel('epochs')
axs.legend()

axs = fig.add_subplot(1,3,2)
axs.set_title(f'MAE of one-layer RNN model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["mae","val_mae"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('MAE')
    axs.set_xlabel('epochs')
axs.legend()

plt.show()
In [ ]:
# Predicting the y variable

# we are using inverse transform below to convert the minmax scaled y to the original cases
rnn_prediction = scaler_y.inverse_transform(model_rnn.predict(x_test))


#Transform date to string    
xtest_df=info_train_test[5]
dates=xtest_df["date"]
list_dates=[]
for i in dates:
    date=i.strftime("%d/%m/%Y")
    list_dates.append(date)
    
#Plot
plt.figure(figsize=(12,6))
plt.plot(range(len(y_test)), rnn_prediction, label="Predicted")
plt.plot(range(len(y_test)),list(scaler_y.inverse_transform(y_test)), label="Actual")
plt.title(f"Predicted versus actual JHU_cases (one-layer RNN model) for {file_names[info_sel_state[0]]} using test data")
plt.xticks(range(len(ytest))[::4], list_dates[::4], rotation ="vertical", fontsize= 8)
plt.xlabel("Date")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()
In [ ]:
#Calculate mean MAE
mean_MAE=[]
for metric in ["mae","val_mae"]:
    mean_MAE.append(np.mean(history.history[metric]))

MAE_3=round(mean_MAE[0],4)
MAE_val_3=round(mean_MAE[1],4)

print("The MAE of the model is", MAE_3)
print("The validation MAE of the model is", MAE_val_3)
The MAE of the model is 0.0446
The validation MAE of the model is 0.043

b) Multi-layer RNN model¶

In [ ]:
# input dimension below
input_dim = x_train.shape[1:]
n_units = 128
n_units2 = 64

#Create model
model_rnn_input = tf.keras.Input(shape=input_dim)
model_rnn_hidden = tf.keras.layers.SimpleRNN(units=n_units, return_sequences=True, dropout=0.2 )(model_rnn_input)
model_rnn_hidden = tf.keras.layers.SimpleRNN(units=n_units2, return_sequences=False, dropout=0.2)(model_rnn_hidden)
model_rnn_output = tf.keras.layers.Dense(units=1, activation='linear')(model_rnn_hidden)
model_rnn = tf.keras.Model(inputs=model_rnn_input, outputs=model_rnn_output, name="model_rnn")

#Print the model architecture
print(model_rnn.summary())

#Compile model
model_rnn.compile(optimizer=tf.keras.optimizers.Adam(), loss = 'mse', metrics=['mae'])

#Convert x to tensor
x_train_tf = tf.convert_to_tensor(x_train, np.float32)

#Fit model
history = model_rnn.fit(x_train_tf, y_train, epochs = 50, validation_split=0.1, batch_size = 1, verbose=1)
Model: "model_rnn"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_2 (InputLayer)        [(None, 14, 498)]         0         
                                                                 
 simple_rnn_1 (SimpleRNN)    (None, 14, 128)           80256     
                                                                 
 simple_rnn_2 (SimpleRNN)    (None, 64)                12352     
                                                                 
 dense_1 (Dense)             (None, 1)                 65        
                                                                 
=================================================================
Total params: 92,673
Trainable params: 92,673
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/50
489/489 [==============================] - 6s 9ms/step - loss: 0.0828 - mae: 0.1981 - val_loss: 0.0043 - val_mae: 0.0634
Epoch 2/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0158 - mae: 0.0874 - val_loss: 0.0022 - val_mae: 0.0445
Epoch 3/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0105 - mae: 0.0731 - val_loss: 0.0025 - val_mae: 0.0471
Epoch 4/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0088 - mae: 0.0646 - val_loss: 9.3596e-04 - val_mae: 0.0270
Epoch 5/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0075 - mae: 0.0586 - val_loss: 0.0018 - val_mae: 0.0395
Epoch 6/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0109 - mae: 0.0737 - val_loss: 0.0031 - val_mae: 0.0533
Epoch 7/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0102 - mae: 0.0683 - val_loss: 0.0050 - val_mae: 0.0681
Epoch 8/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0104 - mae: 0.0716 - val_loss: 0.0094 - val_mae: 0.0949
Epoch 9/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0191 - mae: 0.0878 - val_loss: 0.1451 - val_mae: 0.3804
Epoch 10/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0160 - mae: 0.0839 - val_loss: 0.0012 - val_mae: 0.0325
Epoch 11/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0108 - mae: 0.0713 - val_loss: 0.0013 - val_mae: 0.0318
Epoch 12/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0081 - mae: 0.0627 - val_loss: 0.0011 - val_mae: 0.0300
Epoch 13/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0079 - mae: 0.0634 - val_loss: 0.0064 - val_mae: 0.0773
Epoch 14/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0096 - mae: 0.0687 - val_loss: 5.1149e-04 - val_mae: 0.0129
Epoch 15/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0069 - mae: 0.0568 - val_loss: 0.0016 - val_mae: 0.0364
Epoch 16/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0089 - mae: 0.0653 - val_loss: 0.0055 - val_mae: 0.0679
Epoch 17/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0068 - mae: 0.0588 - val_loss: 7.2237e-04 - val_mae: 0.0226
Epoch 18/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0083 - mae: 0.0617 - val_loss: 0.0012 - val_mae: 0.0308
Epoch 19/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0096 - mae: 0.0646 - val_loss: 0.0083 - val_mae: 0.0890
Epoch 20/50
489/489 [==============================] - 4s 9ms/step - loss: 0.0083 - mae: 0.0624 - val_loss: 7.3189e-04 - val_mae: 0.0224
Epoch 21/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0067 - mae: 0.0559 - val_loss: 6.5925e-04 - val_mae: 0.0210
Epoch 22/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0112 - mae: 0.0740 - val_loss: 0.0032 - val_mae: 0.0541
Epoch 23/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0097 - mae: 0.0696 - val_loss: 0.0073 - val_mae: 0.0832
Epoch 24/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0094 - mae: 0.0698 - val_loss: 0.0026 - val_mae: 0.0486
Epoch 25/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0084 - mae: 0.0648 - val_loss: 0.0038 - val_mae: 0.0592
Epoch 26/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0108 - mae: 0.0661 - val_loss: 0.0049 - val_mae: 0.0675
Epoch 27/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0181 - mae: 0.0907 - val_loss: 0.0088 - val_mae: 0.0916
Epoch 28/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0088 - mae: 0.0685 - val_loss: 0.0014 - val_mae: 0.0355
Epoch 29/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0096 - mae: 0.0681 - val_loss: 4.8963e-04 - val_mae: 0.0155
Epoch 30/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0067 - mae: 0.0591 - val_loss: 4.3153e-04 - val_mae: 0.0128
Epoch 31/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0072 - mae: 0.0631 - val_loss: 0.0011 - val_mae: 0.0299
Epoch 32/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0076 - mae: 0.0631 - val_loss: 0.0082 - val_mae: 0.0880
Epoch 33/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0098 - mae: 0.0680 - val_loss: 9.0314e-04 - val_mae: 0.0266
Epoch 34/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0107 - mae: 0.0728 - val_loss: 6.7268e-04 - val_mae: 0.0207
Epoch 35/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0102 - mae: 0.0713 - val_loss: 4.1960e-04 - val_mae: 0.0102
Epoch 36/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0072 - mae: 0.0611 - val_loss: 0.0024 - val_mae: 0.0464
Epoch 37/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0118 - mae: 0.0756 - val_loss: 0.0049 - val_mae: 0.0673
Epoch 38/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0121 - mae: 0.0803 - val_loss: 0.0031 - val_mae: 0.0533
Epoch 39/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0121 - mae: 0.0815 - val_loss: 0.0250 - val_mae: 0.1569
Epoch 40/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0105 - mae: 0.0741 - val_loss: 0.0017 - val_mae: 0.0383
Epoch 41/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0089 - mae: 0.0676 - val_loss: 6.2606e-04 - val_mae: 0.0174
Epoch 42/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0117 - mae: 0.0762 - val_loss: 4.3084e-04 - val_mae: 0.0104
Epoch 43/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0129 - mae: 0.0760 - val_loss: 7.5056e-04 - val_mae: 0.0233
Epoch 44/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0067 - mae: 0.0595 - val_loss: 0.0069 - val_mae: 0.0805
Epoch 45/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0083 - mae: 0.0657 - val_loss: 7.2243e-04 - val_mae: 0.0212
Epoch 46/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0093 - mae: 0.0657 - val_loss: 0.0018 - val_mae: 0.0372
Epoch 47/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0062 - mae: 0.0554 - val_loss: 0.0016 - val_mae: 0.0371
Epoch 48/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0129 - mae: 0.0760 - val_loss: 0.0062 - val_mae: 0.0763
Epoch 49/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0109 - mae: 0.0720 - val_loss: 0.0020 - val_mae: 0.0421
Epoch 50/50
489/489 [==============================] - 4s 8ms/step - loss: 0.0066 - mae: 0.0600 - val_loss: 0.0059 - val_mae: 0.0742
In [ ]:
# Plotting the loss and MAE graphs
# Plot training results
epochs=50
fig = plt.figure(figsize=(20,5))
axs = fig.add_subplot(1,3,1)
axs.set_title(f'Loss of multi-layer RNN model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["loss","val_loss"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('Loss')
    axs.set_xlabel('epochs')
axs.legend()

axs = fig.add_subplot(1,3,2)
axs.set_title(f'MAE of multi-layer RNN model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["mae","val_mae"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('MAE')
    axs.set_xlabel('epochs')
axs.legend()

plt.show()
In [ ]:
# Predicting the y variable

# we are using inverse transform below to convert the minmax scaled y to the original cases
rnn_prediction = scaler_y.inverse_transform(model_rnn.predict(x_test))


#Transform date to string    
xtest_df=info_train_test[5]
dates=xtest_df["date"]
list_dates=[]
for i in dates:
    date=i.strftime("%d/%m/%Y")
    list_dates.append(date)
    
#Plot
plt.figure(figsize=(12,6))
plt.plot(range(len(y_test)), rnn_prediction, label="Predicted")
plt.plot(range(len(y_test)),list(scaler_y.inverse_transform(y_test)), label="Actual")
plt.title(f"Predicted versus actual JHU_cases (multi-layer RNN model) for {file_names[info_sel_state[0]]} using test data")
plt.xticks(range(len(ytest))[::4], list_dates[::4], rotation ="vertical", fontsize= 8)
plt.xlabel("Date")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()
In [ ]:
#Calculate mean MAE
mean_MAE=[]
for metric in ["mae","val_mae"]:
    mean_MAE.append(np.mean(history.history[metric]))

MAE_4=round(mean_MAE[0],4)
MAE_val_4=round(mean_MAE[1],4)

print("The MAE of the model is", MAE_4)
print("The validation MAE of the model is", MAE_val_4)
The MAE of the model is 0.0715
The validation MAE of the model is 0.0544

4.7. LSTM models¶

Return to contents

The next natural step was to start experimenting with LSTM models, which have proven to be very effective in predicting time series.

4.7.1 New York¶

Return to contents

a) One-layer LSTM model¶

Return to contents

We started with a one-layer model, applying our train and test datasets as time series. Below is the naive LSTM model of the NY State, using one layer as a baseline model.

In [ ]:
tf.keras.backend.clear_session()

# Using info New York
info_sel_state = sel_state("New York")
info_train_test=train_test("New York")

# Scaler
scaler_y = MinMaxScaler(feature_range=(0, 1))

# minmax scaler for y, scaling between 0 and 1
ytrain = info_train_test[1].to_numpy().reshape(-1,1)
ytrain = scaler_y.fit_transform(ytrain)
ytest  = info_train_test[3].to_numpy().reshape(-1,1)
ytest  = scaler_y.fit_transform(ytest)

#Call x variables
xtrain=info_train_test[4]
xtrain.drop(["date", "JHU_cases"], axis=1, inplace=True)
xtest=info_train_test[5].copy()
xtest.drop(["date", "JHU_cases"], axis=1, inplace=True)

# time series generator uses 14 day block to predict the next day 
train_gen = TimeseriesGenerator(xtrain.to_numpy(), ytrain,
                               length=14, sampling_rate=1,  
                                batch_size = 558)

test_gen = TimeseriesGenerator(xtest.to_numpy(), ytest,
                                length=14, sampling_rate=1, 
                               batch_size = 186)

# Below we are creating x and y train and test variable from the generators
x_train, y_train = train_gen[0]
# x_val, y_val = val_gen[0]
x_test, y_test = test_gen[0]

print('x train shape is :', x_train.shape)
print('y train shape is :', y_train.shape)
print('x test shape is :' , x_test.shape)
print('y test shape is :' , y_test.shape)
x train shape is : (544, 14, 498)
y train shape is : (544, 1)
x test shape is : (172, 14, 498)
y test shape is : (172, 1)
In [ ]:
# input dimension below
input_dim = x_train.shape[1:]
n_units = 100

#Create model
model_lstm_input = tf.keras.Input(shape=input_dim)
model_hidden=tf.keras.layers.LSTM(units = n_units)(model_lstm_input)
model_lstm_output=tf.keras.layers.Dense(units = 1, activation="linear")(model_hidden)
model_lstm = tf.keras.Model(inputs=model_lstm_input, outputs=model_lstm_output, name="model_lstm")


#Print the model architecture
print(model_lstm.summary())

#Compile model
model_lstm.compile(optimizer=tf.keras.optimizers.Adam(), loss = 'mse', metrics=['mae'])

#Convert x to tensor
x_train_tf = tf.convert_to_tensor(x_train, np.float32)

#Fit model
history = model_lstm.fit(x_train_tf, y_train, epochs = 50, validation_split=0.1, batch_size = 1, verbose=1)
Model: "model_lstm"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 14, 498)]         0         
                                                                 
 lstm (LSTM)                 (None, 100)               239600    
                                                                 
 dense (Dense)               (None, 1)                 101       
                                                                 
=================================================================
Total params: 239,701
Trainable params: 239,701
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/50
489/489 [==============================] - 9s 14ms/step - loss: 0.0278 - mae: 0.0851 - val_loss: 0.0233 - val_mae: 0.1460
Epoch 2/50
489/489 [==============================] - 6s 13ms/step - loss: 0.0044 - mae: 0.0436 - val_loss: 0.0032 - val_mae: 0.0474
Epoch 3/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0034 - mae: 0.0374 - val_loss: 0.0062 - val_mae: 0.0737
Epoch 4/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0033 - mae: 0.0383 - val_loss: 0.0031 - val_mae: 0.0500
Epoch 5/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0037 - mae: 0.0403 - val_loss: 0.0065 - val_mae: 0.0765
Epoch 6/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0025 - mae: 0.0299 - val_loss: 0.0156 - val_mae: 0.1212
Epoch 7/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0024 - mae: 0.0319 - val_loss: 0.0015 - val_mae: 0.0301
Epoch 8/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0022 - mae: 0.0307 - val_loss: 0.0042 - val_mae: 0.0584
Epoch 9/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0021 - mae: 0.0323 - val_loss: 0.0032 - val_mae: 0.0497
Epoch 10/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0020 - mae: 0.0303 - val_loss: 8.7140e-04 - val_mae: 0.0221
Epoch 11/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0020 - mae: 0.0294 - val_loss: 0.0041 - val_mae: 0.0585
Epoch 12/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0018 - mae: 0.0269 - val_loss: 0.0058 - val_mae: 0.0718
Epoch 13/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0017 - mae: 0.0282 - val_loss: 0.0015 - val_mae: 0.0307
Epoch 14/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0014 - mae: 0.0262 - val_loss: 0.0026 - val_mae: 0.0443
Epoch 15/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0014 - mae: 0.0249 - val_loss: 0.0031 - val_mae: 0.0480
Epoch 16/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0012 - mae: 0.0252 - val_loss: 0.0042 - val_mae: 0.0593
Epoch 17/50
489/489 [==============================] - 6s 12ms/step - loss: 9.7890e-04 - mae: 0.0233 - val_loss: 0.0029 - val_mae: 0.0458
Epoch 18/50
489/489 [==============================] - 6s 12ms/step - loss: 8.3854e-04 - mae: 0.0211 - val_loss: 0.0032 - val_mae: 0.0519
Epoch 19/50
489/489 [==============================] - 6s 12ms/step - loss: 9.5015e-04 - mae: 0.0224 - val_loss: 0.0014 - val_mae: 0.0297
Epoch 20/50
489/489 [==============================] - 6s 12ms/step - loss: 8.4273e-04 - mae: 0.0210 - val_loss: 0.0026 - val_mae: 0.0440
Epoch 21/50
489/489 [==============================] - 6s 12ms/step - loss: 4.6586e-04 - mae: 0.0165 - val_loss: 0.0013 - val_mae: 0.0285
Epoch 22/50
489/489 [==============================] - 6s 12ms/step - loss: 9.0573e-04 - mae: 0.0210 - val_loss: 0.0040 - val_mae: 0.0592
Epoch 23/50
489/489 [==============================] - 6s 12ms/step - loss: 8.1101e-04 - mae: 0.0209 - val_loss: 0.0034 - val_mae: 0.0540
Epoch 24/50
489/489 [==============================] - 6s 13ms/step - loss: 5.1372e-04 - mae: 0.0164 - val_loss: 0.0038 - val_mae: 0.0582
Epoch 25/50
489/489 [==============================] - 6s 13ms/step - loss: 6.1445e-04 - mae: 0.0182 - val_loss: 0.0035 - val_mae: 0.0547
Epoch 26/50
489/489 [==============================] - 6s 12ms/step - loss: 4.7087e-04 - mae: 0.0163 - val_loss: 0.0021 - val_mae: 0.0407
Epoch 27/50
489/489 [==============================] - 6s 12ms/step - loss: 7.0810e-04 - mae: 0.0189 - val_loss: 0.0015 - val_mae: 0.0321
Epoch 28/50
489/489 [==============================] - 6s 12ms/step - loss: 3.4888e-04 - mae: 0.0141 - val_loss: 0.0018 - val_mae: 0.0372
Epoch 29/50
489/489 [==============================] - 6s 13ms/step - loss: 3.6345e-04 - mae: 0.0146 - val_loss: 0.0032 - val_mae: 0.0524
Epoch 30/50
489/489 [==============================] - 6s 13ms/step - loss: 3.9726e-04 - mae: 0.0147 - val_loss: 0.0038 - val_mae: 0.0581
Epoch 31/50
489/489 [==============================] - 6s 13ms/step - loss: 5.6201e-04 - mae: 0.0168 - val_loss: 0.0041 - val_mae: 0.0605
Epoch 32/50
489/489 [==============================] - 6s 13ms/step - loss: 4.3212e-04 - mae: 0.0152 - val_loss: 5.6658e-04 - val_mae: 0.0180
Epoch 33/50
489/489 [==============================] - 6s 13ms/step - loss: 4.8134e-04 - mae: 0.0158 - val_loss: 0.0021 - val_mae: 0.0427
Epoch 34/50
489/489 [==============================] - 6s 13ms/step - loss: 3.9820e-04 - mae: 0.0145 - val_loss: 0.0019 - val_mae: 0.0389
Epoch 35/50
489/489 [==============================] - 6s 13ms/step - loss: 2.9571e-04 - mae: 0.0128 - val_loss: 0.0013 - val_mae: 0.0314
Epoch 36/50
489/489 [==============================] - 6s 12ms/step - loss: 3.5831e-04 - mae: 0.0140 - val_loss: 0.0012 - val_mae: 0.0296
Epoch 37/50
489/489 [==============================] - 6s 12ms/step - loss: 2.4895e-04 - mae: 0.0119 - val_loss: 0.0011 - val_mae: 0.0282
Epoch 38/50
489/489 [==============================] - 6s 13ms/step - loss: 5.5265e-04 - mae: 0.0162 - val_loss: 9.3092e-04 - val_mae: 0.0261
Epoch 39/50
489/489 [==============================] - 6s 13ms/step - loss: 3.0301e-04 - mae: 0.0128 - val_loss: 0.0037 - val_mae: 0.0577
Epoch 40/50
489/489 [==============================] - 6s 12ms/step - loss: 2.4097e-04 - mae: 0.0115 - val_loss: 0.0031 - val_mae: 0.0523
Epoch 41/50
489/489 [==============================] - 6s 13ms/step - loss: 3.1027e-04 - mae: 0.0121 - val_loss: 0.0029 - val_mae: 0.0505
Epoch 42/50
489/489 [==============================] - 6s 13ms/step - loss: 4.0832e-04 - mae: 0.0138 - val_loss: 0.0013 - val_mae: 0.0319
Epoch 43/50
489/489 [==============================] - 6s 12ms/step - loss: 1.7997e-04 - mae: 0.0099 - val_loss: 0.0029 - val_mae: 0.0510
Epoch 44/50
489/489 [==============================] - 6s 12ms/step - loss: 1.7004e-04 - mae: 0.0098 - val_loss: 0.0022 - val_mae: 0.0435
Epoch 45/50
489/489 [==============================] - 6s 12ms/step - loss: 2.7693e-04 - mae: 0.0122 - val_loss: 0.0020 - val_mae: 0.0403
Epoch 46/50
489/489 [==============================] - 6s 12ms/step - loss: 3.3876e-04 - mae: 0.0132 - val_loss: 0.0042 - val_mae: 0.0623
Epoch 47/50
489/489 [==============================] - 6s 12ms/step - loss: 2.1596e-04 - mae: 0.0107 - val_loss: 0.0019 - val_mae: 0.0404
Epoch 48/50
489/489 [==============================] - 6s 13ms/step - loss: 3.8830e-04 - mae: 0.0131 - val_loss: 0.0024 - val_mae: 0.0453
Epoch 49/50
489/489 [==============================] - 6s 13ms/step - loss: 2.6645e-04 - mae: 0.0115 - val_loss: 0.0055 - val_mae: 0.0711
Epoch 50/50
489/489 [==============================] - 6s 12ms/step - loss: 4.7244e-04 - mae: 0.0143 - val_loss: 0.0016 - val_mae: 0.0362
In [ ]:
# Plotting the loss and MAE graphs
# Plot training results
epochs=50
fig = plt.figure(figsize=(20,5))
axs = fig.add_subplot(1,3,1)
axs.set_title(f'Loss of one-layer LSTM model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["loss","val_loss"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('Loss')
    axs.set_xlabel('epochs')
axs.legend()

axs = fig.add_subplot(1,3,2)
axs.set_title(f'MAE of one-layer LSTM model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["mae","val_mae"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('MAE')
    axs.set_xlabel('epochs')
axs.legend()

plt.show()
In [ ]:
# Predicting the y variable

# we are using inverse transform below to convert the minmax scaled y to the original cases
lstm_prediction = scaler_y.inverse_transform(model_lstm.predict(x_test))


#Transform date to string    
xtest_df=info_train_test[5]
dates=xtest_df["date"]
list_dates=[]
for i in dates:
    date=i.strftime("%d/%m/%Y")
    list_dates.append(date)
    
#Plot
plt.figure(figsize=(12,6))
plt.plot(range(len(y_test)), lstm_prediction, label="Predicted")
plt.plot(range(len(y_test)),list(scaler_y.inverse_transform(y_test)), label="Actual")
plt.title(f"Predicted versus actual JHU_cases (one-layer LSTM model) for {file_names[info_sel_state[0]]} using test data")
plt.xticks(range(len(ytest))[::4], list_dates[::4], rotation ="vertical", fontsize= 8)
plt.xlabel("Date")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()

The result of this baseline, one-layer model looks promising but not good enough.

In [ ]:
#Calculate mean MAE
mean_MAE=[]
for metric in ["mae","val_mae"]:
    mean_MAE.append(np.mean(history.history[metric]))

MAE_5=round(mean_MAE[0],4)
MAE_val_5=round(mean_MAE[1],4)

print("The MAE of the model is", MAE_5)
print("The validation MAE of the model is", MAE_val_5)
The MAE of the model is 0.0216
The validation MAE of the model is 0.0498

b) Multi-layer LSTM model¶

Return to contents

Then we tested adding more layers. Below is the multi-layer LSTM model of the NY State, using time series.

In [ ]:
### Multilayer LSTM model
#Design following: Iberoamerican Journal of Medicine
tf.keras.backend.clear_session()

# input dimension below
input_dim = x_train.shape[1:]
n_units= 50

#Create model
model_lstm_input = tf.keras.Input(shape=input_dim)
model_hidden1=tf.keras.layers.LSTM(units = n_units, return_sequences=True)(model_lstm_input)
model_dropout1=tf.keras.layers.Dropout(rate=0.2)(model_hidden1)
model_hidden2=tf.keras.layers.LSTM(units = n_units, return_sequences=True)(model_dropout1)
model_dropout2=tf.keras.layers.Dropout(rate=0.2)(model_hidden2)
model_hidden3=tf.keras.layers.LSTM(units = n_units, return_sequences=False)(model_dropout2)
model_dropout3=tf.keras.layers.Dropout(rate=0.2)(model_hidden3)
model_lstm_output=tf.keras.layers.Dense(units = 1, activation="linear")(model_dropout3)
model_lstm = tf.keras.Model(inputs=model_lstm_input, outputs=model_lstm_output, name="model_lstm")

#Print the model architecture
print(model_lstm.summary())

#Compile model
model_lstm.compile(optimizer = 'adam', loss = 'mean_squared_error', metrics=['mae'])

#Convert x to tensor
x_train_tf = tf.convert_to_tensor(x_train, np.float32)

#Fit model
history = model_lstm.fit(x_train_tf, y_train, epochs = 50, validation_split=0.1, batch_size = 1, verbose=1)
Model: "model_lstm"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 14, 498)]         0         
                                                                 
 lstm (LSTM)                 (None, 14, 50)            109800    
                                                                 
 dropout (Dropout)           (None, 14, 50)            0         
                                                                 
 lstm_1 (LSTM)               (None, 14, 50)            20200     
                                                                 
 dropout_1 (Dropout)         (None, 14, 50)            0         
                                                                 
 lstm_2 (LSTM)               (None, 50)                20200     
                                                                 
 dropout_2 (Dropout)         (None, 50)                0         
                                                                 
 dense (Dense)               (None, 1)                 51        
                                                                 
=================================================================
Total params: 150,251
Trainable params: 150,251
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/50
489/489 [==============================] - 16s 22ms/step - loss: 0.0129 - mae: 0.0810 - val_loss: 0.0074 - val_mae: 0.0856
Epoch 2/50
489/489 [==============================] - 9s 17ms/step - loss: 0.0074 - mae: 0.0566 - val_loss: 0.0071 - val_mae: 0.0838
Epoch 3/50
489/489 [==============================] - 9s 17ms/step - loss: 0.0056 - mae: 0.0477 - val_loss: 0.0208 - val_mae: 0.1440
Epoch 4/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0052 - mae: 0.0450 - val_loss: 0.0050 - val_mae: 0.0705
Epoch 5/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0049 - mae: 0.0446 - val_loss: 0.0061 - val_mae: 0.0779
Epoch 6/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0044 - mae: 0.0407 - val_loss: 0.0078 - val_mae: 0.0879
Epoch 7/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0046 - mae: 0.0426 - val_loss: 0.0117 - val_mae: 0.1074
Epoch 8/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0039 - mae: 0.0377 - val_loss: 0.0025 - val_mae: 0.0492
Epoch 9/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0045 - mae: 0.0426 - val_loss: 0.0052 - val_mae: 0.0716
Epoch 10/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0039 - mae: 0.0373 - val_loss: 0.0023 - val_mae: 0.0472
Epoch 11/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0042 - mae: 0.0394 - val_loss: 7.9464e-05 - val_mae: 0.0072
Epoch 12/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0042 - mae: 0.0401 - val_loss: 0.0011 - val_mae: 0.0320
Epoch 13/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0034 - mae: 0.0368 - val_loss: 0.0074 - val_mae: 0.0856
Epoch 14/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0039 - mae: 0.0378 - val_loss: 0.0019 - val_mae: 0.0422
Epoch 15/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0031 - mae: 0.0324 - val_loss: 2.5212e-04 - val_mae: 0.0141
Epoch 16/50
489/489 [==============================] - 9s 17ms/step - loss: 0.0037 - mae: 0.0387 - val_loss: 0.0024 - val_mae: 0.0483
Epoch 17/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0038 - mae: 0.0387 - val_loss: 0.0046 - val_mae: 0.0670
Epoch 18/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0029 - mae: 0.0328 - val_loss: 0.0063 - val_mae: 0.0785
Epoch 19/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0032 - mae: 0.0349 - val_loss: 0.0068 - val_mae: 0.0815
Epoch 20/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0033 - mae: 0.0362 - val_loss: 0.0088 - val_mae: 0.0921
Epoch 21/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0030 - mae: 0.0332 - val_loss: 0.0034 - val_mae: 0.0566
Epoch 22/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0029 - mae: 0.0319 - val_loss: 0.0018 - val_mae: 0.0407
Epoch 23/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0030 - mae: 0.0330 - val_loss: 0.0211 - val_mae: 0.1433
Epoch 24/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0029 - mae: 0.0334 - val_loss: 5.1619e-04 - val_mae: 0.0207
Epoch 25/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0033 - mae: 0.0352 - val_loss: 2.2611e-04 - val_mae: 0.0133
Epoch 26/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0029 - mae: 0.0334 - val_loss: 7.4860e-04 - val_mae: 0.0255
Epoch 27/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0028 - mae: 0.0325 - val_loss: 0.0040 - val_mae: 0.0619
Epoch 28/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0024 - mae: 0.0288 - val_loss: 7.3414e-04 - val_mae: 0.0254
Epoch 29/50
489/489 [==============================] - 9s 17ms/step - loss: 0.0031 - mae: 0.0350 - val_loss: 4.8799e-04 - val_mae: 0.0203
Epoch 30/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0028 - mae: 0.0321 - val_loss: 5.7452e-04 - val_mae: 0.0221
Epoch 31/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0027 - mae: 0.0314 - val_loss: 0.0024 - val_mae: 0.0473
Epoch 32/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0027 - mae: 0.0306 - val_loss: 0.0019 - val_mae: 0.0418
Epoch 33/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0023 - mae: 0.0283 - val_loss: 0.0021 - val_mae: 0.0437
Epoch 34/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0023 - mae: 0.0301 - val_loss: 0.0025 - val_mae: 0.0491
Epoch 35/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0027 - mae: 0.0321 - val_loss: 0.0015 - val_mae: 0.0377
Epoch 36/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0027 - mae: 0.0321 - val_loss: 1.5012e-04 - val_mae: 0.0081
Epoch 37/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0027 - mae: 0.0324 - val_loss: 0.0012 - val_mae: 0.0330
Epoch 38/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0022 - mae: 0.0281 - val_loss: 6.7038e-04 - val_mae: 0.0247
Epoch 39/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0030 - mae: 0.0328 - val_loss: 6.1319e-04 - val_mae: 0.0233
Epoch 40/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0025 - mae: 0.0302 - val_loss: 0.0021 - val_mae: 0.0446
Epoch 41/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0024 - mae: 0.0295 - val_loss: 0.0015 - val_mae: 0.0384
Epoch 42/50
489/489 [==============================] - 9s 17ms/step - loss: 0.0025 - mae: 0.0284 - val_loss: 7.0564e-04 - val_mae: 0.0249
Epoch 43/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0024 - mae: 0.0284 - val_loss: 5.2689e-04 - val_mae: 0.0213
Epoch 44/50
489/489 [==============================] - 9s 18ms/step - loss: 0.0026 - mae: 0.0315 - val_loss: 0.0041 - val_mae: 0.0626
Epoch 45/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0024 - mae: 0.0301 - val_loss: 7.1488e-04 - val_mae: 0.0252
Epoch 46/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0023 - mae: 0.0290 - val_loss: 4.5437e-04 - val_mae: 0.0195
Epoch 47/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0020 - mae: 0.0258 - val_loss: 0.0010 - val_mae: 0.0301
Epoch 48/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0015 - mae: 0.0258 - val_loss: 8.8815e-04 - val_mae: 0.0282
Epoch 49/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0025 - mae: 0.0307 - val_loss: 0.0016 - val_mae: 0.0381
Epoch 50/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0019 - mae: 0.0266 - val_loss: 5.8761e-04 - val_mae: 0.0222
In [ ]:
# Plotting the loss and MAE graphs
# Plot training results
epochs=50
fig = plt.figure(figsize=(20,5))
axs = fig.add_subplot(1,3,1)
axs.set_title(f'Loss of multi-layer LSTM model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["loss","val_loss"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('Loss')
    axs.set_xlabel('epochs')
axs.legend()

axs = fig.add_subplot(1,3,2)
axs.set_title(f'MAE of multi-layer LSTM model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["mae","val_mae"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('MAE')
    axs.set_xlabel('epochs')
axs.legend()

plt.show()
In [ ]:
# Predicting the y variable

# we are using inverse transform below to convert the minmax scaled y to the original cases
lstm_prediction = scaler_y.inverse_transform(model_lstm.predict(x_test))


#Transform date to string    
xtest_df=info_train_test[5]
dates=xtest_df["date"]
list_dates=[]
for i in dates:
    date=i.strftime("%d/%m/%Y")
    list_dates.append(date)
    
#Plot
plt.figure(figsize=(12,6))
plt.plot(range(len(y_test)), lstm_prediction, label="Predicted")
plt.plot(range(len(y_test)),list(scaler_y.inverse_transform(y_test)), label="Actual")
plt.title(f"Predicted versus actual JHU_cases (multi-layer LSTM model) for {file_names[info_sel_state[0]]} using test data")
plt.xticks(range(len(ytest))[::4], list_dates[::4], rotation ="vertical", fontsize= 8)
plt.xlabel("Date")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()

The prediction with the multi-layer model does not show much improvement over the previous one-layer model. We tried different values of the hyperparameters: number of units, optimizer, and longer training cycle. We also applied cross-validation without success.

In [ ]:
#Calculate mean MAE
mean_MAE=[]
for metric in ["mae","val_mae"]:
    mean_MAE.append(np.mean(history.history[metric]))

MAE_6=round(mean_MAE[0],4)
MAE_val_6=round(mean_MAE[1],4)

print("The MAE of the model is", MAE_6)
print("The validation MAE of the model is", MAE_val_6)
The MAE of the model is 0.0353
The validation MAE of the model is 0.0493

c) Two-layer LSTM model Using Time Steps¶

Return to contents \

For the following models, we used time steps applying the use of a sliding window to the data. To feed the LSTM models, we pre-processed our train and test datasets as time step series with a look-back of 14 days and a horizon of one day in the future. Below is a two-layer LSTM using time steps applied to the NY state data. [8]

In [ ]:
# Function to create datasets for LSTM as time steps
def format_series(n_future,n_past,df):
  # Creates series of multi-step on the dataset based on n_past periods
  # and adds the corresponding predicted values based on n_future periods
    trainX=[]
    trainY=[]
    for i in range(n_past,len(df)-n_future+1):
        trainX.append(df.iloc[i-n_past:i,1:df.shape[1]])#All features except y
        trainY.append(df.iloc[i+n_future-1:i+n_future,0])#y

    return np.array(trainX),np.array(trainY)  
In [ ]:
#Using info New York
info_sel_state = sel_state("New York")
info_train_test=train_test("New York")

#x and y train and test variables
x_train=info_train_test[4]
x_train.drop(["date"], axis=1, inplace=True)
y_train=info_train_test[1]
x_test=info_train_test[5]
x_test.drop(["date"], axis=1, inplace=True)
y_test=info_train_test[3]

# Datasets shape
print("\nThe shape of xtrain is", x_train.shape)
print("The shape of ytrain is", y_train.shape)

print("\nThe shape of xtest is", x_test.shape)
print("The shape of ytest is", y_test.shape)
The shape of xtrain is (558, 499)
The shape of ytrain is (558,)

The shape of xtest is (186, 499)
The shape of ytest is (186,)
In [ ]:
# Generating Time Step Series
xtrain,ytrain = format_series(1,14,x_train) 
xtest,ytest = format_series(1,14,x_test) 

# Checking the test and train data shapes
print("\nThe shape of xtrain is", xtrain.shape)
print("The shape of ytrain is", ytrain.shape)

print("\nThe shape of xtest is", xtest.shape)
print("The shape of ytest is", ytest.shape)
The shape of xtrain is (544, 14, 498)
The shape of ytrain is (544, 1)

The shape of xtest is (172, 14, 498)
The shape of ytest is (172, 1)
In [ ]:
tf.keras.backend.clear_session()

n_units=64

# Model architecture with 2 LSTM layers
model_lstm_input = tf.keras.Input(shape=(xtrain.shape[1], xtrain.shape[2]))
model_hidden=tf.keras.layers.LSTM(units = n_units,    activation='relu', return_sequences=True)(model_lstm_input)
model_hidden=tf.keras.layers.LSTM(units = n_units//2, activation='relu', return_sequences=False)(model_hidden)
model_hidden=tf.keras.layers.Dropout(0.2)(model_hidden)
model_lstm_output=tf.keras.layers.Dense(units = 1)(model_hidden)

model_lstm_shift = tf.keras.Model(inputs=model_lstm_input, outputs=model_lstm_output, name="model_lstm_time_steps")

#Print the model architecture
print(model_lstm_shift.summary())

#Compile model
model_lstm_shift.compile(optimizer = 'adam', loss = 'mean_squared_error', metrics=['mae'])

#Fit model
model_lstm_shift_results=model_lstm_shift.fit(xtrain, ytrain, 
                                               validation_split=0.2, 
                                               epochs = 20,
                                               batch_size = 32)

#Predict with model
model_lstm_shift_predict=model_lstm_shift.predict(xtest)
Model: "model_lstm_time_steps"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 14, 498)]         0         
                                                                 
 lstm (LSTM)                 (None, 14, 64)            144128    
                                                                 
 lstm_1 (LSTM)               (None, 32)                12416     
                                                                 
 dropout (Dropout)           (None, 32)                0         
                                                                 
 dense (Dense)               (None, 1)                 33        
                                                                 
=================================================================
Total params: 156,577
Trainable params: 156,577
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/20
14/14 [==============================] - 3s 65ms/step - loss: 38032584.0000 - mae: 4083.8735 - val_loss: 11794229.0000 - val_mae: 2279.8066
Epoch 2/20
14/14 [==============================] - 0s 34ms/step - loss: 28557522.0000 - mae: 3573.7761 - val_loss: 12371511.0000 - val_mae: 3248.2581
Epoch 3/20
14/14 [==============================] - 0s 32ms/step - loss: 20461796.0000 - mae: 3581.7529 - val_loss: 16567672.0000 - val_mae: 3728.1582
Epoch 4/20
14/14 [==============================] - 0s 34ms/step - loss: 21745238.0000 - mae: 3834.3528 - val_loss: 24279820.0000 - val_mae: 4448.5068
Epoch 5/20
14/14 [==============================] - 0s 32ms/step - loss: 19951934.0000 - mae: 3684.2104 - val_loss: 28846816.0000 - val_mae: 4912.1665
Epoch 6/20
14/14 [==============================] - 0s 32ms/step - loss: 19773254.0000 - mae: 3657.8945 - val_loss: 13264450.0000 - val_mae: 3331.1821
Epoch 7/20
14/14 [==============================] - 0s 32ms/step - loss: 19561540.0000 - mae: 3442.4448 - val_loss: 22752744.0000 - val_mae: 4339.6514
Epoch 8/20
14/14 [==============================] - 0s 31ms/step - loss: 19618878.0000 - mae: 3611.3650 - val_loss: 16545329.0000 - val_mae: 3742.4451
Epoch 9/20
14/14 [==============================] - 0s 35ms/step - loss: 18294454.0000 - mae: 3458.4534 - val_loss: 14886128.0000 - val_mae: 3564.4922
Epoch 10/20
14/14 [==============================] - 0s 33ms/step - loss: 17201998.0000 - mae: 3161.6948 - val_loss: 9109334.0000 - val_mae: 2713.2698
Epoch 11/20
14/14 [==============================] - 0s 33ms/step - loss: 17597950.0000 - mae: 3079.6040 - val_loss: 26253044.0000 - val_mae: 4687.2319
Epoch 12/20
14/14 [==============================] - 0s 32ms/step - loss: 15511091.0000 - mae: 3206.5676 - val_loss: 16536207.0000 - val_mae: 3757.0591
Epoch 13/20
14/14 [==============================] - 0s 34ms/step - loss: 12338746.0000 - mae: 2518.1926 - val_loss: 5467000.0000 - val_mae: 2008.2953
Epoch 14/20
14/14 [==============================] - 0s 32ms/step - loss: 10940053.0000 - mae: 2020.4807 - val_loss: 7396951.5000 - val_mae: 2096.9988
Epoch 15/20
14/14 [==============================] - 0s 30ms/step - loss: 13998703.0000 - mae: 2606.4617 - val_loss: 7911093.5000 - val_mae: 2098.1912
Epoch 16/20
14/14 [==============================] - 0s 32ms/step - loss: 21096394.0000 - mae: 3092.3257 - val_loss: 11157249.0000 - val_mae: 3093.6624
Epoch 17/20
14/14 [==============================] - 0s 32ms/step - loss: 16157805.0000 - mae: 3178.9932 - val_loss: 18118036.0000 - val_mae: 3902.0889
Epoch 18/20
14/14 [==============================] - 0s 32ms/step - loss: 12782654.0000 - mae: 2657.1682 - val_loss: 10892875.0000 - val_mae: 3050.3984
Epoch 19/20
14/14 [==============================] - 0s 31ms/step - loss: 17097910.0000 - mae: 3306.7432 - val_loss: 8439723.0000 - val_mae: 2542.4348
Epoch 20/20
14/14 [==============================] - 0s 33ms/step - loss: 17280198.0000 - mae: 2764.5601 - val_loss: 8830552.0000 - val_mae: 2760.7988
In [ ]:
# Plot training results
epochs=20
fig = plt.figure(figsize=(20,5))
axs = fig.add_subplot(1,3,1)
axs.set_title(f'Loss of two-layer LSTM with time steps for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["loss","val_loss"]:
    axs.plot(np.arange(0, epochs), model_lstm_shift_results.history[metric], label=metric)
    axs.set_ylabel('Loss')
    axs.set_xlabel('epochs')
axs.legend()

axs = fig.add_subplot(1,3,2)
axs.set_title(f'MAE of two-layer LSTM with time steps for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["mae","val_mae"]:
    axs.plot(np.arange(0, epochs), model_lstm_shift_results.history[metric], label=metric)
    axs.set_ylabel('MAE')
    axs.set_xlabel('epochs')
axs.legend()

plt.show()
In [ ]:
#Plot actual vs. test and validation prediction
shift = 14
plt.figure(figsize=(12,6))
plt.plot(range(len(y_train)),y_train, label="Actual",color='orange',alpha=.8)
plt.plot(range(len(y_train),len(y_train)+len(y_test)),y_test,color='orange',alpha=.8)
plt.plot(range(shift,len(ytrain)+shift), model_lstm_shift.predict(xtrain), label="Validation",color='b',lw=2)
plt.plot(range(len(ytrain)+shift,len(ytrain)+len(ytest)+shift), model_lstm_shift.predict(xtest), label="Predicted",color='g',lw=2)
plt.title(f"Predicted versus actual JHU_cases (two-layer LSTM with time steps) for {file_names[info_sel_state[0]]}")
plt.xlabel("Time (in days)")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()

The prediction result of the two-layer model, using time steps, looks very good as can be seen in the above graph. This graph compares the actual NY state observations versus the predicted data obtained applying the model on train data, in blue, and never test train data, in green. The predicted graph closely follows the actual data and predicts with reasonable accuracy.

In [ ]:
# Forecasting 14 days beyond our dataset
n_future=14
forecast = model_lstm_shift.predict(xtrain[-n_future:])

Stretching the results obtained with the two-layer model, we predicted two weeks beyond the dataset. A two-week forecast period is well supported because we used a look-back period with a window of 14 days.

In [ ]:
#Plot actual vs. test and validation prediction
plt.figure(figsize=(12,6))
plt.plot(range(len(y_train)),y_train, label="Actual",color='orange',alpha=.8)
plt.plot(range(len(y_train),len(y_train)+len(y_test)),y_test,color='orange',alpha=.8)
plt.plot(range(shift,len(ytrain)+shift), model_lstm_shift.predict(xtrain), label="Validation",color='b',lw=2)
plt.plot(range(len(ytrain)+shift,len(ytrain)+len(ytest)+shift), model_lstm_shift.predict(xtest), label="Predicted",color='g',lw=2)
plt.plot(range(len(y_train)+len(y_test),len(y_train)+len(y_test)+len(forecast)),forecast,color='r',alpha=.8,lw=2,label="Forecast")
plt.title(f"Predicted versus actual JHU_cases (two-layer LSTM with time steps) for {file_names[info_sel_state[0]]}")
plt.xlabel("Time (in days)")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()

The above graph for New York data is similar to the previous actual vs. predicted. This time we included a two-week forecast, red color, that shows a reduction in the number of COVID cases. The forecasted data is very close to what actually happened in NY at that time.

In [ ]:
#Calculate mean MAE
mean_MAE=[]
for metric in ["mae","val_mae"]:
    mean_MAE.append(np.mean(history.history[metric]))

MAE_7=round(mean_MAE[0],4)
MAE_val_7=round(mean_MAE[1],4)

print("The MAE of the model is", MAE_7)
print("The validation MAE of the model is", MAE_val_7)
The MAE of the model is 0.0353
The validation MAE of the model is 0.0493

d) Multi-layer LSTM model Using Time Steps¶

Return to contents

As a logical forward step in trying to improve the previous results, we tested the same time steps approach on a multi-level LSTM model. [9]

In [ ]:
#Using info New York
info_sel_state = sel_state("New York")
info_train_test=train_test("New York")

#x and y train and test variables
x_train=info_train_test[4]
x_train.drop(["date"], axis=1, inplace=True)
y_train=info_train_test[1]
x_test=info_train_test[5]
x_test.drop(["date"], axis=1, inplace=True)
y_test=info_train_test[3]

# Datasets shape
print("\nThe shape of xtrain is", x_train.shape)
print("The shape of ytrain is", y_train.shape)

print("\nThe shape of xtest is", x_test.shape)
print("The shape of ytest is", y_test.shape)
The shape of xtrain is (558, 499)
The shape of ytrain is (558,)

The shape of xtest is (186, 499)
The shape of ytest is (186,)
In [ ]:
# Generating Time Step Series
xtrain,ytrain = format_series(1,14,x_train) 
xtest,ytest = format_series(1,14,x_test) 

# Checking the test and train data shapes
print("\nThe shape of xtrain is", xtrain.shape)
print("The shape of ytrain is", ytrain.shape)

print("\nThe shape of xtest is", xtest.shape)
print("The shape of ytest is", ytest.shape)
The shape of xtrain is (544, 14, 498)
The shape of ytrain is (544, 1)

The shape of xtest is (172, 14, 498)
The shape of ytest is (172, 1)
In [ ]:
tf.keras.backend.clear_session()

n_units=64

# Model architecture with 4 LSTM layers
model_lstm_input = tf.keras.Input(shape=(xtrain.shape[1], xtrain.shape[2]))
model_hidden=tf.keras.layers.LSTM(units = n_units,    activation='relu', return_sequences=True)(model_lstm_input)
model_hidden=tf.keras.layers.LSTM(units = n_units,    activation='relu', return_sequences=True)(model_hidden)
model_hidden=tf.keras.layers.LSTM(units = n_units,    activation='relu', return_sequences=True)(model_hidden)
model_hidden=tf.keras.layers.LSTM(units = n_units//2, activation='relu', return_sequences=False)(model_hidden)
model_hidden=tf.keras.layers.Dropout(0.2)(model_hidden)
model_lstm_output=tf.keras.layers.Dense(units = 1)(model_hidden)

model_lstm_shift = tf.keras.Model(inputs=model_lstm_input, outputs=model_lstm_output, name="model_lstm_time_steps")

#Print the model architecture
print(model_lstm_shift.summary())

#Compile model
model_lstm_shift.compile(optimizer = 'adam', loss = 'mean_squared_error', metrics=['mae'])

#Fit model
model_lstm_shift_results=model_lstm_shift.fit(xtrain, ytrain, 
                                               validation_split=0.2, 
                                               epochs = 50,
                                               batch_size = 32)

#Predict with model
model_lstm_shift_predict=model_lstm_shift.predict(xtest)
Model: "model_lstm_time_steps"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 14, 498)]         0         
                                                                 
 lstm (LSTM)                 (None, 14, 64)            144128    
                                                                 
 lstm_1 (LSTM)               (None, 14, 64)            33024     
                                                                 
 lstm_2 (LSTM)               (None, 14, 64)            33024     
                                                                 
 lstm_3 (LSTM)               (None, 32)                12416     
                                                                 
 dropout (Dropout)           (None, 32)                0         
                                                                 
 dense (Dense)               (None, 1)                 33        
                                                                 
=================================================================
Total params: 222,625
Trainable params: 222,625
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/50
14/14 [==============================] - 7s 97ms/step - loss: 38260176.0000 - mae: 4093.9268 - val_loss: 10088403.0000 - val_mae: 2235.2007
Epoch 2/50
14/14 [==============================] - 1s 47ms/step - loss: 26615236.0000 - mae: 3610.0017 - val_loss: 14852836.0000 - val_mae: 3550.7310
Epoch 3/50
14/14 [==============================] - 1s 51ms/step - loss: 20209262.0000 - mae: 3479.1252 - val_loss: 23370210.0000 - val_mae: 4385.4644
Epoch 4/50
14/14 [==============================] - 1s 47ms/step - loss: 18983208.0000 - mae: 3433.6384 - val_loss: 13503226.0000 - val_mae: 3404.5608
Epoch 5/50
14/14 [==============================] - 1s 48ms/step - loss: 17582030.0000 - mae: 3218.2012 - val_loss: 14992870.0000 - val_mae: 3569.0183
Epoch 6/50
14/14 [==============================] - 1s 48ms/step - loss: 17928366.0000 - mae: 3198.0312 - val_loss: 13957928.0000 - val_mae: 3469.9272
Epoch 7/50
14/14 [==============================] - 1s 49ms/step - loss: 10245148.0000 - mae: 2295.8137 - val_loss: 12289368.0000 - val_mae: 3287.1458
Epoch 8/50
14/14 [==============================] - 1s 48ms/step - loss: 6821453.0000 - mae: 1470.8976 - val_loss: 4150058.0000 - val_mae: 1617.7141
Epoch 9/50
14/14 [==============================] - 1s 46ms/step - loss: 6147211.0000 - mae: 1367.9806 - val_loss: 4609080.5000 - val_mae: 1901.9875
Epoch 10/50
14/14 [==============================] - 1s 48ms/step - loss: 5579849.0000 - mae: 1318.6981 - val_loss: 5509693.0000 - val_mae: 2163.4243
Epoch 11/50
14/14 [==============================] - 1s 50ms/step - loss: 5328524.5000 - mae: 1390.7135 - val_loss: 4634629.5000 - val_mae: 1943.6030
Epoch 12/50
14/14 [==============================] - 1s 49ms/step - loss: 4740514.5000 - mae: 1216.5298 - val_loss: 5096843.5000 - val_mae: 2029.6202
Epoch 13/50
14/14 [==============================] - 1s 50ms/step - loss: 7136721.0000 - mae: 1604.1820 - val_loss: 9786974.0000 - val_mae: 2874.0867
Epoch 14/50
14/14 [==============================] - 1s 47ms/step - loss: 12919749.0000 - mae: 2315.4888 - val_loss: 30196772.0000 - val_mae: 4763.6455
Epoch 15/50
14/14 [==============================] - 1s 46ms/step - loss: 14834348.0000 - mae: 2595.7129 - val_loss: 9373637.0000 - val_mae: 2838.5596
Epoch 16/50
14/14 [==============================] - 1s 49ms/step - loss: 19098640.0000 - mae: 2991.3137 - val_loss: 10141463.0000 - val_mae: 2948.0474
Epoch 17/50
14/14 [==============================] - 1s 48ms/step - loss: 17485984.0000 - mae: 3218.7532 - val_loss: 17431428.0000 - val_mae: 3843.1360
Epoch 18/50
14/14 [==============================] - 1s 46ms/step - loss: 14997593.0000 - mae: 3106.2969 - val_loss: 16050512.0000 - val_mae: 3698.2439
Epoch 19/50
14/14 [==============================] - 1s 48ms/step - loss: 15226133.0000 - mae: 2939.2229 - val_loss: 17732448.0000 - val_mae: 3887.6272
Epoch 20/50
14/14 [==============================] - 1s 48ms/step - loss: 13328094.0000 - mae: 2732.2529 - val_loss: 10496483.0000 - val_mae: 2178.5332
Epoch 21/50
14/14 [==============================] - 1s 47ms/step - loss: 11334106.0000 - mae: 2347.7437 - val_loss: 19704482.0000 - val_mae: 3587.8381
Epoch 22/50
14/14 [==============================] - 1s 49ms/step - loss: 8779215.0000 - mae: 1955.3296 - val_loss: 4453819.0000 - val_mae: 1782.2784
Epoch 23/50
14/14 [==============================] - 1s 48ms/step - loss: 9022253.0000 - mae: 1811.5559 - val_loss: 7766824.0000 - val_mae: 2069.0439
Epoch 24/50
14/14 [==============================] - 1s 47ms/step - loss: 18458076.0000 - mae: 2701.0137 - val_loss: 7999447.0000 - val_mae: 2228.7947
Epoch 25/50
14/14 [==============================] - 1s 48ms/step - loss: 13085408.0000 - mae: 2429.9890 - val_loss: 7980085.5000 - val_mae: 2627.2168
Epoch 26/50
14/14 [==============================] - 1s 47ms/step - loss: 10908625.0000 - mae: 2378.9932 - val_loss: 8816103.0000 - val_mae: 2794.2283
Epoch 27/50
14/14 [==============================] - 1s 47ms/step - loss: 9974600.0000 - mae: 2252.8386 - val_loss: 5404559.5000 - val_mae: 2084.3638
Epoch 28/50
14/14 [==============================] - 1s 50ms/step - loss: 6606811.5000 - mae: 1645.1766 - val_loss: 4724673.0000 - val_mae: 1978.2881
Epoch 29/50
14/14 [==============================] - 1s 48ms/step - loss: 7042011.0000 - mae: 1496.2184 - val_loss: 5747079.0000 - val_mae: 2234.5862
Epoch 30/50
14/14 [==============================] - 1s 50ms/step - loss: 6577869.0000 - mae: 1479.6544 - val_loss: 8019064.0000 - val_mae: 2548.5120
Epoch 31/50
14/14 [==============================] - 1s 48ms/step - loss: 6488319.0000 - mae: 1525.8337 - val_loss: 7231746.5000 - val_mae: 2348.6799
Epoch 32/50
14/14 [==============================] - 1s 49ms/step - loss: 6202960.5000 - mae: 1407.8066 - val_loss: 1694594.7500 - val_mae: 930.1036
Epoch 33/50
14/14 [==============================] - 1s 47ms/step - loss: 5828259.5000 - mae: 1390.6769 - val_loss: 8115519.0000 - val_mae: 2584.6951
Epoch 34/50
14/14 [==============================] - 1s 54ms/step - loss: 6215967.0000 - mae: 1363.8821 - val_loss: 7565009.0000 - val_mae: 2365.7307
Epoch 35/50
14/14 [==============================] - 1s 51ms/step - loss: 5783503.5000 - mae: 1347.8904 - val_loss: 10721657.0000 - val_mae: 3080.4885
Epoch 36/50
14/14 [==============================] - 1s 47ms/step - loss: 5836080.5000 - mae: 1466.5253 - val_loss: 2569827.0000 - val_mae: 1125.6656
Epoch 37/50
14/14 [==============================] - 1s 47ms/step - loss: 6593006.0000 - mae: 1447.4344 - val_loss: 3288794.5000 - val_mae: 1412.8909
Epoch 38/50
14/14 [==============================] - 1s 48ms/step - loss: 6870197.5000 - mae: 1331.1248 - val_loss: 7056457.5000 - val_mae: 2281.2322
Epoch 39/50
14/14 [==============================] - 1s 46ms/step - loss: 5138612.0000 - mae: 1335.5691 - val_loss: 4156475.5000 - val_mae: 1784.1935
Epoch 40/50
14/14 [==============================] - 1s 48ms/step - loss: 4612957.5000 - mae: 1270.2135 - val_loss: 2112119.2500 - val_mae: 1135.6956
Epoch 41/50
14/14 [==============================] - 1s 50ms/step - loss: 6296884.5000 - mae: 1379.5267 - val_loss: 6763894.0000 - val_mae: 2244.4678
Epoch 42/50
14/14 [==============================] - 1s 47ms/step - loss: 6525176.5000 - mae: 1470.7224 - val_loss: 14936518.0000 - val_mae: 3604.1646
Epoch 43/50
14/14 [==============================] - 1s 49ms/step - loss: 8922662.0000 - mae: 2276.5759 - val_loss: 6590064.5000 - val_mae: 2269.4119
Epoch 44/50
14/14 [==============================] - 1s 49ms/step - loss: 6505676.0000 - mae: 1682.4139 - val_loss: 5616199.5000 - val_mae: 1946.0612
Epoch 45/50
14/14 [==============================] - 1s 47ms/step - loss: 6353706.0000 - mae: 1590.4766 - val_loss: 4748698.0000 - val_mae: 2017.6055
Epoch 46/50
14/14 [==============================] - 1s 48ms/step - loss: 5611137.5000 - mae: 1374.6112 - val_loss: 6451386.0000 - val_mae: 2314.7810
Epoch 47/50
14/14 [==============================] - 1s 49ms/step - loss: 4910579.0000 - mae: 1291.1406 - val_loss: 4041683.0000 - val_mae: 1657.8103
Epoch 48/50
14/14 [==============================] - 1s 47ms/step - loss: 9604768.0000 - mae: 2004.9813 - val_loss: 7787944.0000 - val_mae: 2489.7725
Epoch 49/50
14/14 [==============================] - 1s 48ms/step - loss: 16501388.0000 - mae: 2753.7888 - val_loss: 10653388.0000 - val_mae: 3045.8367
Epoch 50/50
14/14 [==============================] - 1s 46ms/step - loss: 12748582.0000 - mae: 2423.3013 - val_loss: 8023697.5000 - val_mae: 2547.6824
In [ ]:
# Plot training results
epochs=50
fig = plt.figure(figsize=(20,5))
axs = fig.add_subplot(1,3,1)
axs.set_title(f'Loss of multi-layer LSTM with time steps for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["loss","val_loss"]:
    axs.plot(np.arange(0, epochs), model_lstm_shift_results.history[metric], label=metric)
    axs.set_ylabel('Loss')
    axs.set_xlabel('epochs')
axs.legend()

axs = fig.add_subplot(1,3,2)
axs.set_title(f'MAE of multi-layer LSTM with time steps for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["mae","val_mae"]:
    axs.plot(np.arange(0, epochs), model_lstm_shift_results.history[metric], label=metric)
    axs.set_ylabel('MAE')
    axs.set_xlabel('epochs')
axs.legend()

plt.show()
In [ ]:
#Plot actual vs. test and validation prediction
shift = 14
plt.figure(figsize=(12,6))
plt.plot(range(len(y_train)),y_train, label="Actual",color='orange',alpha=.8)
plt.plot(range(len(y_train),len(y_train)+len(y_test)),y_test,color='orange',alpha=.9)
plt.plot(range(shift,len(ytrain)+shift), model_lstm_shift.predict(xtrain), label="Validation",color='b',lw=2)
plt.plot(range(len(ytrain)+shift,len(ytrain)+len(ytest)+shift), model_lstm_shift.predict(xtest), label="Predicted",color='g',lw=2)
plt.title(f"Predicted versus actual JHU_cases (multi-layer LSTM with time steps) for {file_names[info_sel_state[0]]}")
plt.xlabel("Time (in days)")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()

The prediction result of the multi-layer model, using time steps, looks very good as can be seen in the above graph. This graph compares the actual NY state observations versus the predicted data obtained applying the model on train data, in blue, and never seen test data, in green. The predicted graph closely follows the actual data and predicts with reasonable accuracy.

In [ ]:
# Forecasting 14 days beyond our dataset
n_future=14
forecast = model_lstm_shift.predict(xtrain[-n_future:])

Stretching the results obtained with the two-layer model, we predicted two weeks beyond the dataset. A two-week forecast period is well supported because we used a look-back period with a window of 14 days.

In [ ]:
#Plot actual vs. test and validation prediction
plt.figure(figsize=(12,6))
plt.plot(range(len(y_train)),y_train, label="Actual",color='orange',alpha=.8)
plt.plot(range(len(y_train),len(y_train)+len(y_test)),y_test,color='orange',alpha=.8)
plt.plot(range(shift,len(ytrain)+shift), model_lstm_shift.predict(xtrain), label="Validation",color='b',lw=2)
plt.plot(range(len(ytrain)+shift,len(ytrain)+len(ytest)+shift), model_lstm_shift.predict(xtest), label="Predicted",color='g',lw=2)
plt.plot(range(len(y_train)+len(y_test),len(y_train)+len(y_test)+len(forecast)),forecast,color='r',alpha=.8,lw=2,label="Forecast")
plt.title(f"Predicted versus actual JHU_cases (multi-layer LSTM with time steps) for {file_names[info_sel_state[0]]}")
plt.xlabel("Time (in days)")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()

The above graph for New York data is similar to the previous actual vs. predicted. This time we included a two-week forecast, red color, that shows a reduction in the number of COVID cases. The forecasted data is very close to what actually happened in NY at that time.

In [ ]:
#Calculate mean MAE
mean_MAE=[]
for metric in ["mae","val_mae"]:
    mean_MAE.append(np.mean(history.history[metric]))

MAE_8=round(mean_MAE[0],4)
MAE_val_8=round(mean_MAE[1],4)

print("The MAE of the model is", MAE_8)
print("The validation MAE of the model is", MAE_val_8)
The MAE of the model is 0.0353
The validation MAE of the model is 0.0493

4.7.2 California¶

Return to contents

a) One layer LSTM model¶

Return to contents

As we did with NY data, for California, we started with a one-layer model, applying our train and test datasets as time series. Below is the naive LSTM model of the NY State, using one layer as a baseline model.

In [ ]:
tf.keras.backend.clear_session()

# Using info New York
info_sel_state = sel_state("California")
info_train_test=train_test("California")

# Scaler
scaler_y = MinMaxScaler(feature_range=(0, 1))

# minmax scaler for y, scaling between 0 and 1
ytrain = info_train_test[1].to_numpy().reshape(-1,1)
ytrain = scaler_y.fit_transform(ytrain)
ytest  = info_train_test[3].to_numpy().reshape(-1,1)
ytest  = scaler_y.fit_transform(ytest)

#Call x variables
xtrain=info_train_test[4]
xtrain.drop(["date", "JHU_cases"], axis=1, inplace=True)
xtest=info_train_test[5].copy()
xtest.drop(["date", "JHU_cases"], axis=1, inplace=True)

# time series generator uses 14 day block to predict the next day 
train_gen = TimeseriesGenerator(xtrain.to_numpy(), ytrain,
                               length=14, sampling_rate=1,  
                                batch_size = 558)

test_gen = TimeseriesGenerator(xtest.to_numpy(), ytest,
                                length=14, sampling_rate=1, 
                               batch_size = 186)

# Below we are creating x and y train and test variable from the generators
x_train, y_train = train_gen[0]
# x_val, y_val = val_gen[0]
x_test, y_test = test_gen[0]

print('x train shape is :', x_train.shape)
print('y train shape is :', y_train.shape)
print('x test shape is :' , x_test.shape)
print('y test shape is :' , y_test.shape)
x train shape is : (544, 14, 498)
y train shape is : (544, 1)
x test shape is : (172, 14, 498)
y test shape is : (172, 1)
In [ ]:
# input dimension below
input_dim = x_train.shape[1:]
n_units = 100

#Create model
model_lstm_input = tf.keras.Input(shape=input_dim)
model_hidden=tf.keras.layers.LSTM(units = n_units)(model_lstm_input)
model_lstm_output=tf.keras.layers.Dense(units = 1, activation="linear")(model_hidden)
model_lstm = tf.keras.Model(inputs=model_lstm_input, outputs=model_lstm_output, name="model_lstm")


#Print the model architecture
print(model_lstm.summary())

#Compile model
model_lstm.compile(optimizer=tf.keras.optimizers.Adam(), loss = 'mse', metrics=['mae'])

#Convert x to tensor
x_train_tf = tf.convert_to_tensor(x_train, np.float32)

#Fit model
history = model_lstm.fit(x_train_tf, y_train, epochs = 50, validation_split=0.1, batch_size = 1, verbose=1)
Model: "model_lstm"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 14, 498)]         0         
                                                                 
 lstm (LSTM)                 (None, 100)               239600    
                                                                 
 dense (Dense)               (None, 1)                 101       
                                                                 
=================================================================
Total params: 239,701
Trainable params: 239,701
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/50
489/489 [==============================] - 8s 13ms/step - loss: 0.0242 - mae: 0.0858 - val_loss: 0.0010 - val_mae: 0.0243
Epoch 2/50
489/489 [==============================] - 6s 13ms/step - loss: 0.0050 - mae: 0.0500 - val_loss: 0.0098 - val_mae: 0.0934
Epoch 3/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0049 - mae: 0.0460 - val_loss: 0.0027 - val_mae: 0.0447
Epoch 4/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0036 - mae: 0.0408 - val_loss: 0.0038 - val_mae: 0.0552
Epoch 5/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0031 - mae: 0.0382 - val_loss: 0.0018 - val_mae: 0.0356
Epoch 6/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0030 - mae: 0.0366 - val_loss: 0.0012 - val_mae: 0.0279
Epoch 7/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0034 - mae: 0.0385 - val_loss: 8.8367e-04 - val_mae: 0.0202
Epoch 8/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0028 - mae: 0.0348 - val_loss: 0.0064 - val_mae: 0.0759
Epoch 9/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0028 - mae: 0.0350 - val_loss: 0.0012 - val_mae: 0.0273
Epoch 10/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0024 - mae: 0.0326 - val_loss: 7.4313e-04 - val_mae: 0.0190
Epoch 11/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0022 - mae: 0.0310 - val_loss: 8.7245e-04 - val_mae: 0.0218
Epoch 12/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0022 - mae: 0.0310 - val_loss: 0.0029 - val_mae: 0.0498
Epoch 13/50
489/489 [==============================] - 7s 14ms/step - loss: 0.0029 - mae: 0.0348 - val_loss: 4.6095e-04 - val_mae: 0.0116
Epoch 14/50
489/489 [==============================] - 6s 13ms/step - loss: 0.0020 - mae: 0.0270 - val_loss: 6.3583e-04 - val_mae: 0.0155
Epoch 15/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0017 - mae: 0.0253 - val_loss: 6.7369e-04 - val_mae: 0.0188
Epoch 16/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0014 - mae: 0.0245 - val_loss: 5.4225e-04 - val_mae: 0.0137
Epoch 17/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0015 - mae: 0.0243 - val_loss: 5.4643e-04 - val_mae: 0.0142
Epoch 18/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0014 - mae: 0.0233 - val_loss: 4.9929e-04 - val_mae: 0.0124
Epoch 19/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0013 - mae: 0.0229 - val_loss: 5.1739e-04 - val_mae: 0.0126
Epoch 20/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0015 - mae: 0.0257 - val_loss: 5.7455e-04 - val_mae: 0.0144
Epoch 21/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0017 - mae: 0.0268 - val_loss: 4.9246e-04 - val_mae: 0.0123
Epoch 22/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0011 - mae: 0.0212 - val_loss: 9.1582e-04 - val_mae: 0.0242
Epoch 23/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0011 - mae: 0.0216 - val_loss: 9.7890e-04 - val_mae: 0.0259
Epoch 24/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0011 - mae: 0.0208 - val_loss: 0.0015 - val_mae: 0.0356
Epoch 25/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0013 - mae: 0.0235 - val_loss: 5.9623e-04 - val_mae: 0.0166
Epoch 26/50
489/489 [==============================] - 6s 12ms/step - loss: 9.6111e-04 - mae: 0.0200 - val_loss: 6.7288e-04 - val_mae: 0.0186
Epoch 27/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0012 - mae: 0.0222 - val_loss: 5.5359e-04 - val_mae: 0.0148
Epoch 28/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0011 - mae: 0.0202 - val_loss: 0.0013 - val_mae: 0.0319
Epoch 29/50
489/489 [==============================] - 6s 12ms/step - loss: 0.0012 - mae: 0.0238 - val_loss: 4.8396e-04 - val_mae: 0.0124
Epoch 30/50
489/489 [==============================] - 6s 12ms/step - loss: 9.4039e-04 - mae: 0.0200 - val_loss: 5.7284e-04 - val_mae: 0.0139
Epoch 31/50
489/489 [==============================] - 6s 13ms/step - loss: 9.5435e-04 - mae: 0.0199 - val_loss: 5.1477e-04 - val_mae: 0.0140
Epoch 32/50
489/489 [==============================] - 6s 12ms/step - loss: 8.7772e-04 - mae: 0.0183 - val_loss: 6.2283e-04 - val_mae: 0.0177
Epoch 33/50
489/489 [==============================] - 6s 13ms/step - loss: 7.9906e-04 - mae: 0.0179 - val_loss: 6.0410e-04 - val_mae: 0.0151
Epoch 34/50
489/489 [==============================] - 6s 13ms/step - loss: 8.1934e-04 - mae: 0.0190 - val_loss: 7.8557e-04 - val_mae: 0.0206
Epoch 35/50
489/489 [==============================] - 6s 12ms/step - loss: 7.7926e-04 - mae: 0.0190 - val_loss: 4.6500e-04 - val_mae: 0.0113
Epoch 36/50
489/489 [==============================] - 6s 12ms/step - loss: 9.5465e-04 - mae: 0.0197 - val_loss: 5.8315e-04 - val_mae: 0.0151
Epoch 37/50
489/489 [==============================] - 6s 12ms/step - loss: 6.1789e-04 - mae: 0.0167 - val_loss: 6.0421e-04 - val_mae: 0.0159
Epoch 38/50
489/489 [==============================] - 6s 12ms/step - loss: 9.1427e-04 - mae: 0.0203 - val_loss: 4.8548e-04 - val_mae: 0.0125
Epoch 39/50
489/489 [==============================] - 6s 13ms/step - loss: 6.2995e-04 - mae: 0.0165 - val_loss: 5.6468e-04 - val_mae: 0.0159
Epoch 40/50
489/489 [==============================] - 6s 12ms/step - loss: 7.2378e-04 - mae: 0.0175 - val_loss: 5.7095e-04 - val_mae: 0.0136
Epoch 41/50
489/489 [==============================] - 6s 13ms/step - loss: 5.9768e-04 - mae: 0.0166 - val_loss: 5.2024e-04 - val_mae: 0.0121
Epoch 42/50
489/489 [==============================] - 6s 13ms/step - loss: 8.5951e-04 - mae: 0.0194 - val_loss: 5.4544e-04 - val_mae: 0.0136
Epoch 43/50
489/489 [==============================] - 6s 13ms/step - loss: 5.4284e-04 - mae: 0.0152 - val_loss: 5.4606e-04 - val_mae: 0.0138
Epoch 44/50
489/489 [==============================] - 6s 13ms/step - loss: 7.0342e-04 - mae: 0.0172 - val_loss: 5.5591e-04 - val_mae: 0.0133
Epoch 45/50
489/489 [==============================] - 6s 13ms/step - loss: 6.6437e-04 - mae: 0.0174 - val_loss: 5.5653e-04 - val_mae: 0.0146
Epoch 46/50
489/489 [==============================] - 6s 13ms/step - loss: 5.7819e-04 - mae: 0.0161 - val_loss: 5.0407e-04 - val_mae: 0.0117
Epoch 47/50
489/489 [==============================] - 7s 14ms/step - loss: 5.2939e-04 - mae: 0.0152 - val_loss: 5.5468e-04 - val_mae: 0.0129
Epoch 48/50
489/489 [==============================] - 6s 12ms/step - loss: 6.8378e-04 - mae: 0.0172 - val_loss: 8.4082e-04 - val_mae: 0.0219
Epoch 49/50
489/489 [==============================] - 6s 13ms/step - loss: 4.9727e-04 - mae: 0.0155 - val_loss: 5.7204e-04 - val_mae: 0.0143
Epoch 50/50
489/489 [==============================] - 6s 13ms/step - loss: 4.0654e-04 - mae: 0.0136 - val_loss: 5.0659e-04 - val_mae: 0.0120
In [ ]:
# Plotting the MSE and MAE graphs
# Plot training results
epochs=50
fig = plt.figure(figsize=(20,5))
axs = fig.add_subplot(1,3,1)
axs.set_title(f'Loss of one-layer LSTM model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["loss","val_loss"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('Loss')
    axs.set_xlabel('epochs')
axs.legend()

axs = fig.add_subplot(1,3,2)
axs.set_title(f'MAE of one-layer LSTM model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["mae","val_mae"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('Loss')
    axs.set_xlabel('epochs')

axs.legend()

plt.show()
In [ ]:
# Predicting the y variable

# we are using inverse transform below to convert the minmax scaled y to the original cases
lstm_prediction = scaler_y.inverse_transform(model_lstm.predict(x_test))


#Transform date to string    
xtest_df=info_train_test[5]
dates=xtest_df["date"]
list_dates=[]
for i in dates:
    date=i.strftime("%d/%m/%Y")
    list_dates.append(date)
    
#Plot
plt.figure(figsize=(12,6))
plt.plot(range(len(y_test)), lstm_prediction, label="Predicted")
plt.plot(range(len(y_test)),list(scaler_y.inverse_transform(y_test)), label="Actual")
plt.title(f"Predicted versus actual JHU_cases (one-layer LSTM model) for {file_names[info_sel_state[0]]} using test data")
plt.xticks(range(len(ytest))[::4], list_dates[::4], rotation ="vertical", fontsize= 8)
plt.xlabel("Date")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()

The result of this baseline, one-layer model again looks promising but not good enough.

In [ ]:
#Calculate mean MAE
mean_MAE=[]
for metric in ["mae","val_mae"]:
    mean_MAE.append(np.mean(history.history[metric]))

MAE_9=round(mean_MAE[0],4)
MAE_val_9=round(mean_MAE[1],4)

print("The MAE of the model is", MAE_9)
print("The validation MAE of the model is", MAE_val_9)
The MAE of the model is 0.0255
The validation MAE of the model is 0.0221

b) Multi-layer LSTM model¶

Return to contents

Then we tested adding more layers. Below is the multi-layer LSTM model of the State of California, using time series.

In [ ]:
### Multilayer LSTM model
#Design following: Iberoamerican Journal of Medicine
tf.keras.backend.clear_session()

# input dimension below
input_dim = x_train.shape[1:]
n_units= 50

#Create model
model_lstm_input = tf.keras.Input(shape=input_dim)
model_hidden1=tf.keras.layers.LSTM(units = n_units, return_sequences=True)(model_lstm_input)
model_dropout1=tf.keras.layers.Dropout(rate=0.2)(model_hidden1)
model_hidden2=tf.keras.layers.LSTM(units = n_units, return_sequences=True)(model_dropout1)
model_dropout2=tf.keras.layers.Dropout(rate=0.2)(model_hidden2)
model_hidden3=tf.keras.layers.LSTM(units = n_units, return_sequences=False)(model_dropout2)
model_dropout3=tf.keras.layers.Dropout(rate=0.2)(model_hidden3)
model_lstm_output=tf.keras.layers.Dense(units = 1, activation="linear")(model_dropout3)
model_lstm = tf.keras.Model(inputs=model_lstm_input, outputs=model_lstm_output, name="model_lstm")

#Print the model architecture
print(model_lstm.summary())

#Compile model
model_lstm.compile(optimizer = 'adam', loss = 'mean_squared_error', metrics=['mae'])

#Convert x to tensor
x_train_tf = tf.convert_to_tensor(x_train, np.float32)

#Fit model
history = model_lstm.fit(x_train_tf, y_train, epochs = 50, validation_split=0.1, batch_size = 1, verbose=1)
Model: "model_lstm"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 14, 498)]         0         
                                                                 
 lstm (LSTM)                 (None, 14, 50)            109800    
                                                                 
 dropout (Dropout)           (None, 14, 50)            0         
                                                                 
 lstm_1 (LSTM)               (None, 14, 50)            20200     
                                                                 
 dropout_1 (Dropout)         (None, 14, 50)            0         
                                                                 
 lstm_2 (LSTM)               (None, 50)                20200     
                                                                 
 dropout_2 (Dropout)         (None, 50)                0         
                                                                 
 dense (Dense)               (None, 1)                 51        
                                                                 
=================================================================
Total params: 150,251
Trainable params: 150,251
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/50
489/489 [==============================] - 15s 19ms/step - loss: 0.0199 - mae: 0.0914 - val_loss: 6.4720e-04 - val_mae: 0.0210
Epoch 2/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0081 - mae: 0.0547 - val_loss: 0.0063 - val_mae: 0.0771
Epoch 3/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0052 - mae: 0.0450 - val_loss: 3.9298e-04 - val_mae: 0.0107
Epoch 4/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0064 - mae: 0.0493 - val_loss: 0.0034 - val_mae: 0.0558
Epoch 5/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0046 - mae: 0.0411 - val_loss: 0.0014 - val_mae: 0.0335
Epoch 6/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0045 - mae: 0.0416 - val_loss: 7.6339e-04 - val_mae: 0.0238
Epoch 7/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0039 - mae: 0.0345 - val_loss: 3.9490e-04 - val_mae: 0.0121
Epoch 8/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0054 - mae: 0.0439 - val_loss: 0.0015 - val_mae: 0.0365
Epoch 9/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0043 - mae: 0.0393 - val_loss: 0.0017 - val_mae: 0.0385
Epoch 10/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0037 - mae: 0.0362 - val_loss: 4.0640e-04 - val_mae: 0.0121
Epoch 11/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0036 - mae: 0.0348 - val_loss: 8.9209e-04 - val_mae: 0.0266
Epoch 12/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0038 - mae: 0.0366 - val_loss: 5.6357e-04 - val_mae: 0.0185
Epoch 13/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0039 - mae: 0.0359 - val_loss: 3.8485e-04 - val_mae: 0.0106
Epoch 14/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0036 - mae: 0.0349 - val_loss: 5.5303e-04 - val_mae: 0.0178
Epoch 15/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0040 - mae: 0.0361 - val_loss: 4.1175e-04 - val_mae: 0.0125
Epoch 16/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0033 - mae: 0.0339 - val_loss: 4.2312e-04 - val_mae: 0.0115
Epoch 17/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0034 - mae: 0.0318 - val_loss: 5.6858e-04 - val_mae: 0.0185
Epoch 18/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0036 - mae: 0.0349 - val_loss: 9.5436e-04 - val_mae: 0.0275
Epoch 19/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0039 - mae: 0.0389 - val_loss: 0.0014 - val_mae: 0.0349
Epoch 20/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0040 - mae: 0.0368 - val_loss: 0.0029 - val_mae: 0.0516
Epoch 21/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0032 - mae: 0.0325 - val_loss: 0.0038 - val_mae: 0.0589
Epoch 22/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0024 - mae: 0.0314 - val_loss: 5.1646e-04 - val_mae: 0.0168
Epoch 23/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0035 - mae: 0.0351 - val_loss: 3.9534e-04 - val_mae: 0.0106
Epoch 24/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0031 - mae: 0.0341 - val_loss: 6.8110e-04 - val_mae: 0.0219
Epoch 25/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0037 - mae: 0.0367 - val_loss: 4.9154e-04 - val_mae: 0.0154
Epoch 26/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0028 - mae: 0.0303 - val_loss: 0.0027 - val_mae: 0.0493
Epoch 27/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0029 - mae: 0.0320 - val_loss: 4.4094e-04 - val_mae: 0.0134
Epoch 28/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0030 - mae: 0.0331 - val_loss: 4.0304e-04 - val_mae: 0.0113
Epoch 29/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0024 - mae: 0.0285 - val_loss: 5.7885e-04 - val_mae: 0.0189
Epoch 30/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0032 - mae: 0.0340 - val_loss: 6.9082e-04 - val_mae: 0.0214
Epoch 31/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0024 - mae: 0.0291 - val_loss: 8.4675e-04 - val_mae: 0.0257
Epoch 32/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0026 - mae: 0.0308 - val_loss: 5.6727e-04 - val_mae: 0.0184
Epoch 33/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0032 - mae: 0.0320 - val_loss: 6.8826e-04 - val_mae: 0.0219
Epoch 34/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0028 - mae: 0.0312 - val_loss: 6.6016e-04 - val_mae: 0.0207
Epoch 35/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0026 - mae: 0.0299 - val_loss: 5.3233e-04 - val_mae: 0.0169
Epoch 36/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0025 - mae: 0.0288 - val_loss: 9.1059e-04 - val_mae: 0.0269
Epoch 37/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0029 - mae: 0.0289 - val_loss: 4.2710e-04 - val_mae: 0.0124
Epoch 38/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0023 - mae: 0.0280 - val_loss: 8.7043e-04 - val_mae: 0.0261
Epoch 39/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0027 - mae: 0.0301 - val_loss: 4.8235e-04 - val_mae: 0.0128
Epoch 40/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0028 - mae: 0.0300 - val_loss: 6.9948e-04 - val_mae: 0.0222
Epoch 41/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0028 - mae: 0.0303 - val_loss: 5.2715e-04 - val_mae: 0.0170
Epoch 42/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0020 - mae: 0.0283 - val_loss: 4.7490e-04 - val_mae: 0.0144
Epoch 43/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0027 - mae: 0.0296 - val_loss: 0.0012 - val_mae: 0.0325
Epoch 44/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0022 - mae: 0.0298 - val_loss: 4.2577e-04 - val_mae: 0.0107
Epoch 45/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0021 - mae: 0.0268 - val_loss: 7.0891e-04 - val_mae: 0.0200
Epoch 46/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0026 - mae: 0.0300 - val_loss: 5.5670e-04 - val_mae: 0.0180
Epoch 47/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0028 - mae: 0.0296 - val_loss: 0.0021 - val_mae: 0.0436
Epoch 48/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0022 - mae: 0.0287 - val_loss: 4.5472e-04 - val_mae: 0.0142
Epoch 49/50
489/489 [==============================] - 8s 16ms/step - loss: 0.0023 - mae: 0.0288 - val_loss: 4.2030e-04 - val_mae: 0.0104
Epoch 50/50
489/489 [==============================] - 8s 17ms/step - loss: 0.0020 - mae: 0.0264 - val_loss: 4.0863e-04 - val_mae: 0.0100
In [ ]:
# Plotting the MSE and MAE graphs
# Plot training results
epochs=50
fig = plt.figure(figsize=(20,5))
axs = fig.add_subplot(1,3,1)
axs.set_title(f'Loss of multi-layer LSTM model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["loss","val_loss"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('Loss')
    axs.set_xlabel('epochs')
axs.legend()

axs = fig.add_subplot(1,3,2)
axs.set_title(f'MAE of multi-layer LSTM model for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["mae","val_mae"]:
    axs.plot(np.arange(0, epochs), history.history[metric], label=metric)
    axs.set_ylabel('MAE')
    axs.set_xlabel('epochs')
axs.legend()

plt.show()
In [ ]:
# Predicting the y variable

# we are using inverse transform below to convert the minmax scaled y to the original cases
lstm_prediction = scaler_y.inverse_transform(model_lstm.predict(x_test))


#Transform date to string    
xtest_df=info_train_test[5]
dates=xtest_df["date"]
list_dates=[]
for i in dates:
    date=i.strftime("%d/%m/%Y")
    list_dates.append(date)
    
#Plot
plt.figure(figsize=(12,6))
plt.plot(range(len(y_test)), lstm_prediction, label="Predicted")
plt.plot(range(len(y_test)),list(scaler_y.inverse_transform(y_test)), label="Actual")
plt.title(f"Predicted versus actual JHU_cases (multi-layer LSTM model) for {file_names[info_sel_state[0]]} using test data")
plt.xticks(range(len(ytest))[::4], list_dates[::4], rotation ="vertical", fontsize= 8)
plt.xlabel("Date")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()

The prediction with the multi-layer model does not show much improvement over the previous one-layer model. We tried different values of the hyperparameters: number of units, optimizer, and longer training cycle. We also applied cross-validation without success.

In [ ]:
#Calculate mean MAE
mean_MAE=[]
for metric in ["mae","val_mae"]:
    mean_MAE.append(np.mean(history.history[metric]))

MAE_10=round(mean_MAE[0],4)
MAE_val_10=round(mean_MAE[1],4)

print("The MAE of the model is", MAE_10)
print("The validation MAE of the model is", MAE_val_10)
The MAE of the model is 0.0349
The validation MAE of the model is 0.0237

c) Two-layer LSTM model Using Time Steps¶

Return to contents \

For the following California models, we used time steps applying the use of a sliding window to the data, as we did with NY data. To feed the LSTM models, we pre-processed our train and test datasets as time step series with a look-back of 14 days and a horizon of one day in the future. Below is a two-layer LSTM using time steps applied to the State of California data. [8]

In [ ]:
#Using info New York
info_sel_state = sel_state("California")
info_train_test=train_test("California")

#x and y train and test variables
x_train=info_train_test[4]
x_train.drop(["date"], axis=1, inplace=True)
y_train=info_train_test[1]
x_test=info_train_test[5]
x_test.drop(["date"], axis=1, inplace=True)
y_test=info_train_test[3]

# Datasets shape
print("\nThe shape of xtrain is", x_train.shape)
print("The shape of ytrain is", y_train.shape)

print("\nThe shape of xtest is", x_test.shape)
print("The shape of ytest is", y_test.shape)
The shape of xtrain is (558, 499)
The shape of ytrain is (558,)

The shape of xtest is (186, 499)
The shape of ytest is (186,)
In [ ]:
# Function to create datasets for LSTM as time steps
def format_series(n_future,n_past,df):
  # Creates series of multi-step on the dataset based on n_past periods
  # and adds the corresponding predicted values based on n_future periods
    trainX=[]
    trainY=[]
    for i in range(n_past,len(df)-n_future+1):
        trainX.append(df.iloc[i-n_past:i,1:df.shape[1]])#All features except y
        trainY.append(df.iloc[i+n_future-1:i+n_future,0])#y

    return np.array(trainX),np.array(trainY)  
In [ ]:
# Generating Time Step Series
xtrain,ytrain = format_series(1,14,x_train) 
xtest,ytest = format_series(1,14,x_test) 

# Checking the test and train data shapes
print("\nThe shape of xtrain is", xtrain.shape)
print("The shape of ytrain is", ytrain.shape)

print("\nThe shape of xtest is", xtest.shape)
print("The shape of ytest is", ytest.shape)
The shape of xtrain is (544, 14, 498)
The shape of ytrain is (544, 1)

The shape of xtest is (172, 14, 498)
The shape of ytest is (172, 1)
In [ ]:
tf.keras.backend.clear_session()

n_units=64

# Model architecture with 2 LSTM layers
model_lstm_input = tf.keras.Input(shape=(xtrain.shape[1], xtrain.shape[2]))
model_hidden=tf.keras.layers.LSTM(units = n_units,    activation='relu', return_sequences=True)(model_lstm_input)
model_hidden=tf.keras.layers.LSTM(units = n_units//2, activation='relu', return_sequences=False)(model_hidden)
model_hidden=tf.keras.layers.Dropout(0.2)(model_hidden)
model_lstm_output=tf.keras.layers.Dense(units = 1)(model_hidden)

model_lstm_shift = tf.keras.Model(inputs=model_lstm_input, outputs=model_lstm_output, name="model_lstm_time_steps")

#Print the model architecture
print(model_lstm_shift.summary())

#Compile model
model_lstm_shift.compile(optimizer = 'adam', loss = 'mean_squared_error', metrics=['mae'])

#Fit model
model_lstm_shift_results=model_lstm_shift.fit(xtrain, ytrain, 
                                               validation_split=0.2, 
                                               epochs = 50,
                                               batch_size = 32)

#Predict with model
model_lstm_shift_predict=model_lstm_shift.predict(xtest)
Model: "model_lstm_time_steps"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 14, 498)]         0         
                                                                 
 lstm (LSTM)                 (None, 14, 64)            144128    
                                                                 
 lstm_1 (LSTM)               (None, 32)                12416     
                                                                 
 dropout (Dropout)           (None, 32)                0         
                                                                 
 dense (Dense)               (None, 1)                 33        
                                                                 
=================================================================
Total params: 156,577
Trainable params: 156,577
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/50
14/14 [==============================] - 4s 101ms/step - loss: 211965520.0000 - mae: 8498.7295 - val_loss: 3273468.5000 - val_mae: 1434.0105
Epoch 2/50
14/14 [==============================] - 0s 31ms/step - loss: 181791840.0000 - mae: 7388.6646 - val_loss: 30547176.0000 - val_mae: 5329.3003
Epoch 3/50
14/14 [==============================] - 0s 30ms/step - loss: 142289808.0000 - mae: 7866.9229 - val_loss: 46882304.0000 - val_mae: 6680.9351
Epoch 4/50
14/14 [==============================] - 0s 31ms/step - loss: 134673168.0000 - mae: 7133.2437 - val_loss: 52617780.0000 - val_mae: 7106.0850
Epoch 5/50
14/14 [==============================] - 0s 33ms/step - loss: 126756376.0000 - mae: 7416.6733 - val_loss: 65704912.0000 - val_mae: 7975.9336
Epoch 6/50
14/14 [==============================] - 0s 31ms/step - loss: 69502040.0000 - mae: 5036.1968 - val_loss: 15938108.0000 - val_mae: 3839.4763
Epoch 7/50
14/14 [==============================] - 0s 32ms/step - loss: 35582600.0000 - mae: 3916.6445 - val_loss: 3540142.5000 - val_mae: 1704.6936
Epoch 8/50
14/14 [==============================] - 0s 31ms/step - loss: 90303768.0000 - mae: 5564.8311 - val_loss: 1558467.2500 - val_mae: 782.2410
Epoch 9/50
14/14 [==============================] - 0s 32ms/step - loss: 86665392.0000 - mae: 5239.8696 - val_loss: 2893912.0000 - val_mae: 1113.5511
Epoch 10/50
14/14 [==============================] - 0s 33ms/step - loss: 63810940.0000 - mae: 4602.9082 - val_loss: 1358195.3750 - val_mae: 693.4846
Epoch 11/50
14/14 [==============================] - 0s 32ms/step - loss: 42374216.0000 - mae: 3933.2227 - val_loss: 3261521.5000 - val_mae: 1610.9432
Epoch 12/50
14/14 [==============================] - 0s 32ms/step - loss: 189152208.0000 - mae: 11158.7637 - val_loss: 53090464.0000 - val_mae: 7171.8066
Epoch 13/50
14/14 [==============================] - 0s 32ms/step - loss: 106676704.0000 - mae: 5913.0352 - val_loss: 1287265.5000 - val_mae: 740.5115
Epoch 14/50
14/14 [==============================] - 0s 32ms/step - loss: 120201680.0000 - mae: 6075.6958 - val_loss: 26331160.0000 - val_mae: 4986.2031
Epoch 15/50
14/14 [==============================] - 0s 32ms/step - loss: 106026856.0000 - mae: 6290.9160 - val_loss: 46089272.0000 - val_mae: 6662.6294
Epoch 16/50
14/14 [==============================] - 0s 32ms/step - loss: 100596432.0000 - mae: 6607.5190 - val_loss: 51968788.0000 - val_mae: 7091.3540
Epoch 17/50
14/14 [==============================] - 0s 29ms/step - loss: 75256664.0000 - mae: 5399.1514 - val_loss: 21754054.0000 - val_mae: 3889.6436
Epoch 18/50
14/14 [==============================] - 0s 33ms/step - loss: 71420848.0000 - mae: 5454.8418 - val_loss: 14686354.0000 - val_mae: 3266.7651
Epoch 19/50
14/14 [==============================] - 0s 31ms/step - loss: 56937136.0000 - mae: 3941.4146 - val_loss: 7949147.0000 - val_mae: 2160.6599
Epoch 20/50
14/14 [==============================] - 0s 34ms/step - loss: 44117388.0000 - mae: 3660.1829 - val_loss: 2335649.7500 - val_mae: 1077.7430
Epoch 21/50
14/14 [==============================] - 0s 33ms/step - loss: 23467996.0000 - mae: 2953.9502 - val_loss: 2447765.0000 - val_mae: 1100.4504
Epoch 22/50
14/14 [==============================] - 0s 31ms/step - loss: 30035370.0000 - mae: 3217.8740 - val_loss: 7498751.5000 - val_mae: 2234.8191
Epoch 23/50
14/14 [==============================] - 0s 32ms/step - loss: 27539032.0000 - mae: 2915.3477 - val_loss: 4217333.5000 - val_mae: 1643.8883
Epoch 24/50
14/14 [==============================] - 0s 32ms/step - loss: 25933814.0000 - mae: 2969.0520 - val_loss: 2760289.0000 - val_mae: 1160.8108
Epoch 25/50
14/14 [==============================] - 0s 32ms/step - loss: 26840592.0000 - mae: 2941.1904 - val_loss: 4074686.2500 - val_mae: 1537.7198
Epoch 26/50
14/14 [==============================] - 0s 31ms/step - loss: 23761842.0000 - mae: 2786.1062 - val_loss: 3727128.0000 - val_mae: 1432.5961
Epoch 27/50
14/14 [==============================] - 0s 32ms/step - loss: 27881032.0000 - mae: 3652.7236 - val_loss: 13375986.0000 - val_mae: 3484.3940
Epoch 28/50
14/14 [==============================] - 0s 33ms/step - loss: 24536220.0000 - mae: 3172.0112 - val_loss: 2735400.0000 - val_mae: 1352.9434
Epoch 29/50
14/14 [==============================] - 0s 35ms/step - loss: 103220744.0000 - mae: 6031.9404 - val_loss: 3086850.2500 - val_mae: 1369.8469
Epoch 30/50
14/14 [==============================] - 1s 36ms/step - loss: 62254900.0000 - mae: 4905.7773 - val_loss: 4037229.7500 - val_mae: 1697.7083
Epoch 31/50
14/14 [==============================] - 0s 32ms/step - loss: 151210784.0000 - mae: 7002.9478 - val_loss: 6006857.0000 - val_mae: 2227.8179
Epoch 32/50
14/14 [==============================] - 0s 31ms/step - loss: 126095808.0000 - mae: 6573.0713 - val_loss: 66913372.0000 - val_mae: 8053.4209
Epoch 33/50
14/14 [==============================] - 0s 32ms/step - loss: 70847520.0000 - mae: 5824.7715 - val_loss: 36431924.0000 - val_mae: 5884.0815
Epoch 34/50
14/14 [==============================] - 0s 31ms/step - loss: 33594176.0000 - mae: 4221.6182 - val_loss: 3612692.5000 - val_mae: 1564.3646
Epoch 35/50
14/14 [==============================] - 0s 31ms/step - loss: 32311990.0000 - mae: 3849.3606 - val_loss: 1553731.8750 - val_mae: 875.6566
Epoch 36/50
14/14 [==============================] - 0s 31ms/step - loss: 26592736.0000 - mae: 3325.4312 - val_loss: 8159780.5000 - val_mae: 2691.9521
Epoch 37/50
14/14 [==============================] - 0s 32ms/step - loss: 24768300.0000 - mae: 3255.9922 - val_loss: 3874033.0000 - val_mae: 1696.5494
Epoch 38/50
14/14 [==============================] - 0s 32ms/step - loss: 24107874.0000 - mae: 3265.0774 - val_loss: 6569189.5000 - val_mae: 2373.1074
Epoch 39/50
14/14 [==============================] - 0s 33ms/step - loss: 24603514.0000 - mae: 3211.7834 - val_loss: 3945404.0000 - val_mae: 1747.2844
Epoch 40/50
14/14 [==============================] - 0s 32ms/step - loss: 21820846.0000 - mae: 3013.8655 - val_loss: 4104106.7500 - val_mae: 1806.1414
Epoch 41/50
14/14 [==============================] - 0s 32ms/step - loss: 20923104.0000 - mae: 2963.4680 - val_loss: 2539174.2500 - val_mae: 1290.9636
Epoch 42/50
14/14 [==============================] - 0s 33ms/step - loss: 33846612.0000 - mae: 4031.4465 - val_loss: 4425227.0000 - val_mae: 1849.7350
Epoch 43/50
14/14 [==============================] - 0s 35ms/step - loss: 45077320.0000 - mae: 4643.5820 - val_loss: 1929012.8750 - val_mae: 1058.1930
Epoch 44/50
14/14 [==============================] - 0s 31ms/step - loss: 31429124.0000 - mae: 3890.6455 - val_loss: 6678005.5000 - val_mae: 2278.4661
Epoch 45/50
14/14 [==============================] - 0s 31ms/step - loss: 25508872.0000 - mae: 3629.1709 - val_loss: 7739110.0000 - val_mae: 2525.3257
Epoch 46/50
14/14 [==============================] - 0s 31ms/step - loss: 27984904.0000 - mae: 3352.6294 - val_loss: 3625606.7500 - val_mae: 1648.5385
Epoch 47/50
14/14 [==============================] - 0s 33ms/step - loss: 25264998.0000 - mae: 3408.8630 - val_loss: 5586283.0000 - val_mae: 2151.0068
Epoch 48/50
14/14 [==============================] - 0s 32ms/step - loss: 26763844.0000 - mae: 3248.0813 - val_loss: 1291578.8750 - val_mae: 802.8881
Epoch 49/50
14/14 [==============================] - 0s 32ms/step - loss: 22768166.0000 - mae: 3051.8604 - val_loss: 2150007.0000 - val_mae: 1233.3048
Epoch 50/50
14/14 [==============================] - 0s 33ms/step - loss: 20158450.0000 - mae: 2774.0388 - val_loss: 3659237.5000 - val_mae: 1691.2209
In [ ]:
# Plot training results
epochs=50
fig = plt.figure(figsize=(20,5))
axs = fig.add_subplot(1,3,1)
axs.set_title(f'Loss of two-layer LSTM with time steps for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["loss","val_loss"]:
    axs.plot(np.arange(0, epochs), model_lstm_shift_results.history[metric], label=metric)
    axs.set_ylabel('Loss')
    axs.set_xlabel('epochs')
axs.legend()

axs = fig.add_subplot(1,3,2)
axs.set_title(f'MAE of two-layer LSTM with time steps for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["mae","val_mae"]:
    axs.plot(np.arange(0, epochs), model_lstm_shift_results.history[metric], label=metric)
    axs.set_ylabel('MAE')
    axs.set_xlabel('epochs')
axs.legend()

plt.show()
In [ ]:
#Plot actual vs. test and validation prediction
shift = 14
plt.figure(figsize=(12,6))
plt.plot(range(len(y_train)),y_train, label="Actual",color='orange',alpha=.8)
plt.plot(range(len(y_train),len(y_train)+len(y_test)),y_test,color='orange',alpha=.8)
plt.plot(range(shift,len(ytrain)+shift), model_lstm_shift.predict(xtrain), label="Validation",color='b',lw=2)
plt.plot(range(len(ytrain)+shift,len(ytrain)+len(ytest)+shift), model_lstm_shift.predict(xtest), label="Predicted",color='g',lw=2)
plt.title(f"Predicted versus actual JHU_cases (two-layer LSTM with time steps) for {file_names[info_sel_state[0]]}")
plt.xlabel("Time (in days)")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()

The prediction result of the two-layer model, using time steps, looks very good as can be seen in the above graph. This graph compares the actual State of California observations versus the predicted data obtained applying the model on train data, in blue, and never seen test data, in green. The predicted graph closely follows the actual data and predicts with reasonable accuracy.

In [ ]:
# Forecasting 14 days beyond our dataset
n_future=14
forecast = model_lstm_shift.predict(xtrain[-n_future:])

Stretching the results obtained with the two-layer model, we predicted two weeks beyond our dataset. A two-week forecast period is well supported because we used a look-back period with a window of 14 days.

In [ ]:
#Plot actual vs. test and validation prediction
plt.figure(figsize=(12,6))
plt.plot(range(len(y_train)),y_train, label="Actual",color='orange',alpha=.8)
plt.plot(range(len(y_train),len(y_train)+len(y_test)),y_test,color='orange',alpha=.8)
plt.plot(range(shift,len(ytrain)+shift), model_lstm_shift.predict(xtrain), label="Validation",color='b',lw=2)
plt.plot(range(len(ytrain)+shift,len(ytrain)+len(ytest)+shift), model_lstm_shift.predict(xtest), label="Predicted",color='g',lw=2)
plt.plot(range(len(y_train)+len(y_test),len(y_train)+len(y_test)+len(forecast)),forecast,color='r',alpha=.8,lw=2,label="Forecast")
plt.title(f"Predicted versus actual JHU_cases (two-layer LSTM with time steps) for {file_names[info_sel_state[0]]}")
plt.xlabel("Time (in days)")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()

The above graph for California data is similar to the previous actual vs. predicted. This time we included a two-week forecast, red color, that shows a reduction in the COVID cases number. The forecasted data is very close to what actually happened in California at that time.

In [ ]:
#Calculate mean MAE
mean_MAE=[]
for metric in ["mae","val_mae"]:
    mean_MAE.append(np.mean(history.history[metric]))

MAE_11=round(mean_MAE[0],4)
MAE_val_11=round(mean_MAE[1],4)

print("The MAE of the model is", MAE_11)
print("The validation MAE of the model is", MAE_val_11)
The MAE of the model is 0.0349
The validation MAE of the model is 0.0237

d) Multi-layer LSTM model Using Time Steps¶

Return to contents

As a logical forward step in trying to improve the previous results, we tested the same time steps approach on a multi-level LSTM model. [9]

In [ ]:
#Using info New York
info_sel_state = sel_state("California")
info_train_test=train_test("California")

#x and y train and test variables
x_train=info_train_test[4]
x_train.drop(["date"], axis=1, inplace=True)
y_train=info_train_test[1]
x_test=info_train_test[5]
x_test.drop(["date"], axis=1, inplace=True)
y_test=info_train_test[3]

# Datasets shape
print("\nThe shape of xtrain is", x_train.shape)
print("The shape of ytrain is", y_train.shape)

print("\nThe shape of xtest is", x_test.shape)
print("The shape of ytest is", y_test.shape)
The shape of xtrain is (558, 499)
The shape of ytrain is (558,)

The shape of xtest is (186, 499)
The shape of ytest is (186,)
In [ ]:
# Generating Time Step Series
xtrain,ytrain = format_series(1,14,x_train) 
xtest,ytest = format_series(1,14,x_test) 

# Checking the test and train data shapes
print("\nThe shape of xtrain is", xtrain.shape)
print("The shape of ytrain is", ytrain.shape)

print("\nThe shape of xtest is", xtest.shape)
print("The shape of ytest is", ytest.shape)
The shape of xtrain is (544, 14, 498)
The shape of ytrain is (544, 1)

The shape of xtest is (172, 14, 498)
The shape of ytest is (172, 1)
In [ ]:
tf.keras.backend.clear_session()

n_units=64

# Model architecture with 4 LSTM layers
model_lstm_input = tf.keras.Input(shape=(xtrain.shape[1], xtrain.shape[2]))
model_hidden=tf.keras.layers.LSTM(units = n_units,    activation='relu', return_sequences=True)(model_lstm_input)
model_hidden=tf.keras.layers.LSTM(units = n_units,    activation='relu', return_sequences=True)(model_hidden)
model_hidden=tf.keras.layers.LSTM(units = n_units,    activation='relu', return_sequences=True)(model_hidden)
model_hidden=tf.keras.layers.LSTM(units = n_units//2, activation='relu', return_sequences=False)(model_hidden)
model_hidden=tf.keras.layers.Dropout(0.2)(model_hidden)
model_lstm_output=tf.keras.layers.Dense(units = 1)(model_hidden)

model_lstm_shift = tf.keras.Model(inputs=model_lstm_input, outputs=model_lstm_output, name="model_lstm_time_steps")

#Print the model architecture
print(model_lstm_shift.summary())

#Compile model
model_lstm_shift.compile(optimizer = 'adam', loss = 'mean_squared_error', metrics=['mae'])

#Fit model
model_lstm_shift_results=model_lstm_shift.fit(xtrain, ytrain, 
                                               validation_split=0.2, 
                                               epochs = 20,
                                               batch_size = 32)

#Predict with model
model_lstm_shift_predict=model_lstm_shift.predict(xtest)
Model: "model_lstm_time_steps"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 14, 498)]         0         
                                                                 
 lstm (LSTM)                 (None, 14, 64)            144128    
                                                                 
 lstm_1 (LSTM)               (None, 14, 64)            33024     
                                                                 
 lstm_2 (LSTM)               (None, 14, 64)            33024     
                                                                 
 lstm_3 (LSTM)               (None, 32)                12416     
                                                                 
 dropout (Dropout)           (None, 32)                0         
                                                                 
 dense (Dense)               (None, 1)                 33        
                                                                 
=================================================================
Total params: 222,625
Trainable params: 222,625
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/20
14/14 [==============================] - 6s 99ms/step - loss: 210854784.0000 - mae: 8456.4004 - val_loss: 1626989.7500 - val_mae: 824.3406
Epoch 2/20
14/14 [==============================] - 1s 47ms/step - loss: 166517856.0000 - mae: 7246.9727 - val_loss: 40408248.0000 - val_mae: 6209.0830
Epoch 3/20
14/14 [==============================] - 1s 49ms/step - loss: 136117760.0000 - mae: 7788.8120 - val_loss: 91118264.0000 - val_mae: 9415.2812
Epoch 4/20
14/14 [==============================] - 1s 49ms/step - loss: 124378032.0000 - mae: 7829.9971 - val_loss: 69668576.0000 - val_mae: 8204.8271
Epoch 5/20
14/14 [==============================] - 1s 48ms/step - loss: 123590880.0000 - mae: 7596.1841 - val_loss: 136285008.0000 - val_mae: 11590.5186
Epoch 6/20
14/14 [==============================] - 1s 48ms/step - loss: 116589896.0000 - mae: 7600.4536 - val_loss: 70615160.0000 - val_mae: 8280.9229
Epoch 7/20
14/14 [==============================] - 1s 50ms/step - loss: 97223928.0000 - mae: 6433.7227 - val_loss: 59854392.0000 - val_mae: 7606.1362
Epoch 8/20
14/14 [==============================] - 1s 46ms/step - loss: 81823400.0000 - mae: 5415.7090 - val_loss: 62635788.0000 - val_mae: 7798.8047
Epoch 9/20
14/14 [==============================] - 1s 55ms/step - loss: 65767900.0000 - mae: 5412.9243 - val_loss: 41635440.0000 - val_mae: 6194.2935
Epoch 10/20
14/14 [==============================] - 1s 92ms/step - loss: 53672460.0000 - mae: 4855.0679 - val_loss: 34933376.0000 - val_mae: 5778.6099
Epoch 11/20
14/14 [==============================] - 1s 91ms/step - loss: 68724688.0000 - mae: 5481.1489 - val_loss: 3864430.0000 - val_mae: 1815.8246
Epoch 12/20
14/14 [==============================] - 1s 102ms/step - loss: 60192428.0000 - mae: 4439.5044 - val_loss: 29458972.0000 - val_mae: 5297.2212
Epoch 13/20
14/14 [==============================] - 1s 86ms/step - loss: 48400068.0000 - mae: 4248.7925 - val_loss: 4137835.5000 - val_mae: 1897.6862
Epoch 14/20
14/14 [==============================] - 1s 90ms/step - loss: 41948700.0000 - mae: 3560.7749 - val_loss: 5535162.5000 - val_mae: 2220.7200
Epoch 15/20
14/14 [==============================] - 1s 106ms/step - loss: 31208060.0000 - mae: 3005.7080 - val_loss: 2950685.7500 - val_mae: 1422.2139
Epoch 16/20
14/14 [==============================] - 1s 104ms/step - loss: 24787758.0000 - mae: 2893.1052 - val_loss: 1969769.5000 - val_mae: 1103.8224
Epoch 17/20
14/14 [==============================] - 2s 108ms/step - loss: 37147716.0000 - mae: 3289.9250 - val_loss: 1402637.3750 - val_mae: 795.3813
Epoch 18/20
14/14 [==============================] - 2s 111ms/step - loss: 35615892.0000 - mae: 3279.9409 - val_loss: 14235012.0000 - val_mae: 3608.4495
Epoch 19/20
14/14 [==============================] - 1s 94ms/step - loss: 31004794.0000 - mae: 3404.5710 - val_loss: 8967538.0000 - val_mae: 2756.7134
Epoch 20/20
14/14 [==============================] - 1s 100ms/step - loss: 44776784.0000 - mae: 3630.9194 - val_loss: 1412234.0000 - val_mae: 831.3953
In [ ]:
# Plot training results
epochs=20
fig = plt.figure(figsize=(20,5))
axs = fig.add_subplot(1,3,1)
axs.set_title(f'Loss of multi-layer LSTM with time steps for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["loss","val_loss"]:
    axs.plot(np.arange(0, epochs), model_lstm_shift_results.history[metric], label=metric)
    axs.set_ylabel('Loss')
    axs.set_xlabel('epochs')
axs.legend()

axs = fig.add_subplot(1,3,2)
axs.set_title(f'MAE of multi-layer LSTM with time steps for {file_names[info_sel_state[0]]}')
# Plot all metrics
for metric in ["mae","val_mae"]:
    axs.plot(np.arange(0, epochs), model_lstm_shift_results.history[metric], label=metric)
    axs.set_ylabel('MAE')
    axs.set_xlabel('epochs')

axs.legend()

plt.show()
In [ ]:
#Plot actual vs. test and validation prediction
shift = 14
plt.figure(figsize=(12,6))
plt.plot(range(len(y_train)),y_train, label="Actual",color='orange',alpha=.8)
plt.plot(range(len(y_train),len(y_train)+len(y_test)),y_test,color='orange',alpha=.9)
plt.plot(range(shift,len(ytrain)+shift), model_lstm_shift.predict(xtrain), label="Validation",color='b',lw=2)
plt.plot(range(len(ytrain)+shift,len(ytrain)+len(ytest)+shift), model_lstm_shift.predict(xtest), label="Predicted",color='g',lw=2)
plt.title(f"Predicted versus actual JHU_cases (multi-layer LSTM with time steps) for {file_names[info_sel_state[0]]}")
plt.xlabel("Time (in days)")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()

The prediction result of the multi-layer model, using time steps, looks very good as can be seen in the above graph. This graph compares the actual State of California observations versus the predicted data obtained applying the model on train data, in blue, and never seen test data, in green. The predicted graph closely follows the actual data and predicts with reasonable accuracy.

In [ ]:
# Forecasting 14 days beyond our dataset
n_future=14
forecast = model_lstm_shift.predict(xtrain[-n_future:])

Stretching the results obtained with the two-layer model, we predicted two weeks beyond the dataset. A two-week forecast period is well supported because we used a look-back period with a window of 14 days.

In [ ]:
#Plot actual vs. test and validation prediction
plt.figure(figsize=(12,6))
plt.plot(range(len(y_train)),y_train, label="Actual",color='orange',alpha=.8)
plt.plot(range(len(y_train),len(y_train)+len(y_test)),y_test,color='orange',alpha=.8)
plt.plot(range(shift,len(ytrain)+shift), model_lstm_shift.predict(xtrain), label="Validation",color='b',lw=2)
plt.plot(range(len(ytrain)+shift,len(ytrain)+len(ytest)+shift), model_lstm_shift.predict(xtest), label="Predicted",color='g',lw=2)
plt.plot(range(len(y_train)+len(y_test),len(y_train)+len(y_test)+len(forecast)),forecast,color='r',alpha=.8,lw=2,label="Forecast")
plt.title(f"Predicted versus actual JHU_cases (multi-layer LSTM with time steps) for {file_names[info_sel_state[0]]}")
plt.xlabel("Time (in days)")
plt.ylabel("JHU_cases")
plt.legend()
plt.show()

The above graph for California data is similar to the previous actual vs. predicted. This time we included a two-week forecast, red color, that shows a reduction in the number of COVID cases. The forecasted data is very close to what actually happened in California at that time.

In [ ]:
#Calculate mean MAE
mean_MAE=[]
for metric in ["mae","val_mae"]:
    mean_MAE.append(np.mean(history.history[metric]))

MAE_12=round(mean_MAE[0],4)
MAE_val_12=round(mean_MAE[1],4)

print("The MAE of the model is", MAE_12)
print("The validation MAE of the model is", MAE_val_12)
The MAE of the model is 0.0349
The validation MAE of the model is 0.0237

V. Conclusions and way forward¶

Return to contents

In [ ]:
#Summary table of performance of models

#For validation data
pd.set_option('display.float_format', lambda x: '%.4f' % x)

table_names=["ARMA", "One-layer RNN", "Multi-layer RNN", "One-layer LSTM", "Multi-layer LSTM",
             "Two-layer LSTM w/ time steps", "Multi-layer LSTM w/ time steps"]
table_values_ny=[MAE_0_ny, MAE_val_1,MAE_val_2,MAE_val_5,MAE_val_6,MAE_val_7,MAE_val_8]
table_values_cal=[MAE_0_cal, MAE_val_3, MAE_val_4,MAE_val_9,MAE_val_10,MAE_val_11,MAE_val_12]

table_mae = pd.DataFrame(list(zip(table_values_ny, table_values_cal)),
               columns =['New York', 'California'], index=table_names)

table_mae.index.name = 'MAE values'

print("The validation mean absolute error (MAE) of the different models included in this notebook are below:\n")
print(table_mae)

#For train data

table_values_ny=[MAE_0_ny, MAE_1,MAE_2,MAE_5,MAE_6,MAE_7,MAE_8]
table_values_cal=[MAE_0_cal, MAE_3, MAE_4,MAE_9,MAE_10,MAE_11,MAE_12]

table_mae = pd.DataFrame(list(zip(table_values_ny, table_values_cal)),
               columns =['New York', 'California'], index=table_names)

table_mae.index.name = 'MAE values'

print("\n\nThe train mean absolute error (MAE) of the different models included in this notebook are below:\n")
print(table_mae)
The validation mean absolute error (MAE) of the different models included in this notebook are below:

                                New York  California
MAE values                                          
ARMA                              0.1295      0.0658
One-layer RNN                     0.0602      0.0430
Multi-layer RNN                   0.0817      0.0544
One-layer LSTM                    0.0498      0.0221
Multi-layer LSTM                  0.0493      0.0237
Two-layer LSTM w/ time steps      0.0493      0.0237
Multi-layer LSTM w/ time steps    0.0493      0.0237


The train mean absolute error (MAE) of the different models included in this notebook are below:

                                New York  California
MAE values                                          
ARMA                              0.1295      0.0658
One-layer RNN                     0.0408      0.0446
Multi-layer RNN                   0.0601      0.0715
One-layer LSTM                    0.0216      0.0255
Multi-layer LSTM                  0.0353      0.0349
Two-layer LSTM w/ time steps      0.0353      0.0349
Multi-layer LSTM w/ time steps    0.0353      0.0349

The LSTM models performed better than the RNN models and the baseline ARMA models. However, the MAE results are similar for the LSTM and RNN models. The reason why is that the MAE averages all the errors. So, the LSTM models were close to the actual values, but largely missed the mark at the end tail. Conversely, the RNN models did not perform well in any segment of the time series but did not largely missed any mark. As the MAE averages all errors, then the MAE values are similar for the LSTM and RNN models. Future work should involve using additional metrics to measure performance (for example, root mean squared errors or RMSE, which uses the standard deviation and measures how spread are the residuals) and further tune the models to improve their prediction.

VI. References¶

Return to contents

References used in the background and introduction (Section I):

(1) Lopreite et al. 2021. “Early warnings of COVID-19 outbreaks across Europe from social media”. Scientific Reports 11 (2147).
(2) Kogan et al. 2021. “An early warning approach to monitor COVID-19 activity with multiple digital traces in near real time”. Science Advances: 7.

References used in the introduction of the model results and discussion (Section IV):
(1) Wei et al. 2020. "The role of absolute humidity in the transmission of COVID-19" SPH Scholarly Articles. Harvard University.
(2) Committee to Unleash Prosperity. 2020. "Grading our governors. A report card on reopening states' economies".

References used in the read files (Sub-section 4.2):
(1) https://www.geeksforgeeks.org/how-to-read-all-csv-files-in-a-folder-in-pandas/

References used in the times series baseline model (Sub-section 4.5):

(1) https://www.datacamp.com/community/tutorials/moving-averages-in-pandas

(2) https://www.datacamp.com/community/tutorials/moving-averages-in-pandas

(3) https://www.datacamp.com/community/tutorials/moving-averages-in-pandas

(4) https://www.linkedin.com/pulse/what-mape-mad-msd-time-series-allameh-statistics/

(5) https://towardsdatascience.com/defining-the-moving-average-model-for-time-series-forecasting-in-python-626781db2502

(6) https://www.projectpro.io/recipesforecast-moving-averages-for-time-series

(7) https://www.mikulskibartosz.name/nested-cross-validation-in-time-series-forecasting-using-scikit-learn-and-statsmodels/

(8) https://machinelearningmastery.com/how-to-use-the-timeseriesgenerator-for-time-series-forecasting-in-keras/

(9) https://analyticsindiamag.com/how-to-do-multivariate-time-series-forecasting-using-lstm/