COVID 19 STUDY OF STATE OF MAHARASHTRA USING DATA SCIENCE as on 08-05-2020

We now study the COVID-19 data for the State of Maharashtra, where the crisis was worsening as of 08-05-2020.

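We run the following code to select the rows for Maharashtra:

# Boolean mask: True for rows whose STUT value is "Maharashtra"
is_subset1_Maharashtra = subset1.STUT == "Maharashtra"
subset1[is_subset1_Maharashtra]

This loads the dataframe for the State of Maharashtra.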

The dataframe has 61 rows and 5 columns, updated to 08-05-2020.
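The row selection above relies on a boolean mask. As a minimal sketch of the idea, assuming a small made-up table (the values below are hypothetical, and only the column names STUT, Confirmed, Cured and Deaths are taken from this post), the filter can be tried out like this:

```python
import pandas as pd

# Hypothetical toy data, NOT the real subset1 used in this post
subset1 = pd.DataFrame({
    "STUT": ["Maharashtra", "Kerala", "Maharashtra", "Delhi"],
    "Confirmed": [10, 5, 20, 7],
    "Cured": [1, 2, 3, 1],
    "Deaths": [0, 0, 1, 0],
})

# Boolean mask: True where the row belongs to Maharashtra
is_subset1_Maharashtra = subset1["STUT"] == "Maharashtra"

# Keep the matching rows and only the numeric columns,
# so that later regression and correlation calls see numbers only
dfMaharashtra = (
    subset1.loc[is_subset1_Maharashtra, ["Confirmed", "Cured", "Deaths"]]
    .reset_index(drop=True)
)

print(dfMaharashtra.shape)  # (rows, columns) of the filtered frame
```

Keeping only the numeric columns is a choice made for this sketch; the original dataframe keeps all 5 columns.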

We call this subset dfMaharashtra.

The linear regression of Confirmed vs Cured is obtained by running the following code:

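import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split

# Separate the target (Confirmed) from the remaining columns
X = dfMaharashtra.drop('Confirmed', axis=1)
y = dfMaharashtra[['Confirmed']]
seed = 10
test_data_size = 0.3
# Hold out 30% of the rows; the fixed seed makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_data_size, random_state=seed)
# Recombine features and target so both are available for plotting
train_data = pd.concat([X_train, y_train], axis=1)
test_data = pd.concat([X_test, y_test], axis=1)
fig, ax = plt.subplots(figsize=(12, 6))
# Regression of Cured on Confirmed, fitted on the training rows only
sns.regplot(x='Confirmed', y='Cured', ci=None, data=train_data, ax=ax, color='k', scatter_kws={"s": 20, "color": "royalblue", "alpha": 1})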
The regression line is as follows.
From the graph we can see that the blue dots lie well below the regression line, indicating fewer Cured cases relative to Confirmed cases.
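The visual reading of the graph can also be checked numerically. The following is a minimal sketch, assuming made-up cumulative counts (not the Maharashtra data) and using np.polyfit as a stand-in for the least-squares line that the regression plot draws. It fits the line and counts how many points fall below it.

```python
import numpy as np

# Hypothetical cumulative counts, NOT the Maharashtra data
confirmed = np.array([100, 200, 400, 800, 1600], dtype=float)
cured = np.array([5, 20, 30, 90, 150], dtype=float)

# Ordinary least-squares line: cured ~ slope * confirmed + intercept
slope, intercept = np.polyfit(confirmed, cured, deg=1)
fitted = slope * confirmed + intercept

# Residual < 0 means the observed point lies below the fitted line
residuals = cured - fitted
points_below = int((residuals < 0).sum())

print(f"slope={slope:.4f}, intercept={intercept:.2f}, points below line={points_below}")
```

Because least-squares residuals sum to zero when an intercept is fitted, points below the line are always balanced by points above it, so the size of the residuals matters as much as their count.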
The log plot for the data is obtained by running the following code:
import numpy as np

fig, ax = plt.subplots(figsize=(12, 6))
# Natural log of Confirmed (not finite if a row has zero confirmed cases)
y = np.log(train_data['Confirmed'])
# Regress log(Confirmed) on Cured and draw a 95% confidence band
sns.regplot(x='Cured', y=y, ci=95, data=train_data, ax=ax, color='k', scatter_kws={"s": 10, "color": "royalblue", "alpha": 1})
ax.set_ylabel('log of Confirmed', fontsize=15, fontname='DejaVu Sans')
ax.set_xlabel("Cured", fontsize=15, fontname='DejaVu Sans')
ax.set_xlim(left=None, right=None)
ax.set_ylim(bottom=None, top=None)
ax.tick_params(axis='both', which='major', labelsize=12)
fig.tight_layout()

The plot shows the log curve moving parallel to the Cured axis, which indicates that the number of Cured cases is stagnant or lower than the number of Confirmed cases.
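One practical point about the log transform: np.log returns -inf (with a runtime warning) for a count of zero, which would distort a regression fit. The following is a minimal sketch of one way to guard against that, assuming a made-up train_data frame (the values are hypothetical) and assuming that dropping zero-count rows is acceptable for the analysis.

```python
import numpy as np
import pandas as pd

# Hypothetical training rows, NOT the Maharashtra data
train_data = pd.DataFrame({
    "Confirmed": [0, 1, 10, 100],
    "Cured": [0, 0, 2, 15],
})

# Keep only rows where the log is defined and finite
positive = train_data[train_data["Confirmed"] > 0]

# Natural log of the remaining confirmed counts
log_confirmed = np.log(positive["Confirmed"])

print(log_confirmed.round(3).tolist())
```

Whether the early zero-count rows should be dropped or handled differently (for example with log(1 + x)) is a modelling decision that this sketch does not settle.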

The linear regression of Confirmed vs Deaths is obtained by running the same split and plotting Deaths against Confirmed:

X = dfMaharashtra.drop('Confirmed', axis=1)
y = dfMaharashtra[['Confirmed']]
seed = 10
test_data_size = 0.3
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_data_size, random_state=seed)
train_data = pd.concat([X_train, y_train], axis=1)
test_data = pd.concat([X_test, y_test], axis=1)
fig, ax = plt.subplots(figsize=(12, 6))
# Regression of Deaths on Confirmed, fitted on the training rows only
sns.regplot(x='Confirmed', y='Deaths', ci=None, data=train_data, ax=ax, color='k', scatter_kws={"s": 20, "color": "royalblue", "alpha": 1})

We get the following graph.

From the graph, we can see that the blue data points initially fall below the line, then above it, and finally, as the figures keep increasing, below it again, indicating an irregular Confirmed vs Deaths correlation.
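Both regression plots are drawn from the training part of a 70/30 split. As a rough illustration of what such a split does, here is a minimal sketch using only NumPy. It is not scikit-learn's implementation and will not select the same rows as train_test_split. Rounding the test size up is an assumption of this sketch.

```python
import math
import numpy as np

n_rows = 61            # row count reported for dfMaharashtra
test_data_size = 0.3   # same fraction as in the post
seed = 10              # same seed value, but a different random generator

# Shuffle the row positions reproducibly
rng = np.random.default_rng(seed)
shuffled = rng.permutation(n_rows)

# Assumption of this sketch: round the test size up, rest goes to training
n_test = math.ceil(test_data_size * n_rows)
test_idx = shuffled[:n_test]
train_idx = shuffled[n_test:]

print(len(train_idx), len(test_idx))
```

With 61 rows this leaves 42 rows for fitting and 19 held out, which is worth keeping in mind when reading the scatter plots: each one shows only the training rows.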

The log graph for the data is obtained by running the following code:

fig, ax = plt.subplots(figsize=(12, 6))
y = np.log(train_data['Confirmed'])
sns.regplot(x='Deaths', y=y, ci=95, data=train_data, ax=ax, color='k', scatter_kws={"s": 10, "color": "royalblue", "alpha": 1})
ax.set_ylabel('log of Confirmed', fontsize=15, fontname='DejaVu Sans')
ax.set_xlabel("Deaths", fontsize=15, fontname='DejaVu Sans')
ax.set_xlim(left=None, right=None)
ax.set_ylim(bottom=None, top=None)
ax.tick_params(axis='both', which='major', labelsize=12)
fig.tight_layout()

The blue dots do not fall along the line, which indicates high variation between Confirmed and Deaths, with the number of deaths being high.

The heatmap showing the Pearson correlation matrix for the State of Maharashtra is obtained by using the following code:

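# plot_corr is assumed here to be the statsmodels helper of that name
from statsmodels.graphics.correlation import plot_corr

corrMatrix = train_data.corr(method='pearson')
xnames = list(train_data.columns)
ynames = list(train_data.columns)
plot_corr(corrMatrix, xnames=xnames, ynames=ynames, title=None, normcolor=False, cmap='RdYlBu_r')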

Finally, the correlation coefficients between the variables in our data frame are obtained with

train_data.corr(method='pearson')

There is a high degree of correlation between the variables: 0.993 for (Confirmed, Cured) and 0.99 for (Confirmed, Deaths).
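To make clear what the Pearson coefficient measures, here is a minimal sketch that computes it from its definition and compares it with the pandas result. The numbers are made up for illustration and are not the Maharashtra data.

```python
import numpy as np
import pandas as pd

# Hypothetical cumulative counts, NOT the Maharashtra data
df = pd.DataFrame({
    "Confirmed": [10, 20, 40, 80, 160],
    "Cured": [1, 3, 6, 14, 30],
})

# Deviations of each variable from its own mean
x = df["Confirmed"] - df["Confirmed"].mean()
y = df["Cured"] - df["Cured"].mean()

# Pearson r: sum of cross-deviations divided by the product of their norms
r_manual = (x * y).sum() / np.sqrt((x ** 2).sum() * (y ** 2).sum())

# pandas computes the same quantity (Pearson is the default method)
r_pandas = df["Confirmed"].corr(df["Cured"])

print(round(r_manual, 4), round(r_pandas, 4))
```

Cumulative counts that both rise over time tend to give coefficients close to 1, so a high value on its own says little about how quickly one series follows the other.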
This helps us further our study by using advanced machine learning training models to bring out a favourable result.
