CLUSTERING STUDY OF COVID CASES FOR INDIA AND FOR MAHARASHTRA, GUJARAT, KERALA, KARNATAKA, UPDATED 09-05-2020

Clustering algorithms are widely used in data science for grouping similar data points and detecting outliers. A clustering analysis uncovers natural patterns in the data by placing similar data points in the same group. We will analyse the COVID19INDIA data using clustering techniques, specifically k-means. k-means is one of the most popular clustering algorithms (if not the most popular) among data scientists due to its simplicity and high performance. Its origins date back as early as 1956, when the mathematician Hugo Steinhaus laid its foundations, but it was about a decade later that another researcher, James MacQueen, named the approach k-means. The objective of k-means is to group similar data points (or observations) into clusters, which the algorithm forms automatically from the data. I have applied the k-means algorithm to find a cluster for India and then to a few stat...
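
To make the idea of "grouping similar data points" concrete, here is a minimal sketch of the standard k-means loop (Lloyd's algorithm) in plain Python. It is only an illustration, not the code used for this study: the function name `kmeans`, its parameters, and the `toy_points` values are all assumptions made up for the example, and the numbers are not real COVID-19 figures.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Illustrative Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Start from k observations picked at random as the initial centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # centroids stopped moving, so the clustering has converged
        centroids = new_centroids
    return labels, centroids


# Made-up toy values (NOT real COVID-19 data): (confirmed, recovered) per region.
toy_points = [(10, 2), (12, 3), (11, 1), (200, 50), (210, 55), (190, 45)]
labels, centroids = kmeans(toy_points, k=2)
print(labels)
print(centroids)
```

On this toy input the three small-count regions end up in one cluster and the three large-count regions in the other. In practice a library implementation would normally be used, and the number of clusters k has to be chosen by the analyst.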