Data Visualization of Sample Super Store Sales on Geographical Map Using Python

Visualizing Sales on Geographical Map Using Python

# Underlying Data

In this blog we will use python to plot and visualize total Sales on the geographical Map based upon the state. That is total sales for each state is plotted and visualized in Geographical Map.

For this use case we will use SampleSuperStore Data and Geo Spatial Data for US States. The Geo Spatial Data can be downloaded from the below link :

https://github.com/python-visualization/folium/tree/master/examples/data

We will use python library “folium” to plot the desired details on Geographical map using Geo Spatial Data.

To install a new python library please refer to the below link :

https://instrovate.com/2019/06/04/image-processing-1-getting-started-with-opencv-for-python/

The first step is to declare the paths with filename of the above two requried files mentioned above.

We will use Pandas (A python library) to read the Sample Super Store Data stored in Excel file

Code is as below

In [1]:
# -*- coding: utf-8 -*-
"""
Created on Fri Jun 28 13:32:00 2019

@author: Instrovate
"""
import folium
import pandas as pd
import os
 
spaitial_data_file_name = 'us-states.json'
 
sample_super_store = 'SampleSuper.xls'

SampleSuperdata = pd.read_excel(sample_super_store)
# Selecting Only Useful Columns

Now we will select only State and Sales as we will be visualizing total sales for each state

In [2]:
usefull_columns = SampleSuperdata[["State","Sales"]]
usefull_columns
Out[2]:
State Sales
0 Kentucky 261.9600
1 Kentucky 731.9400
2 California 14.6200
3 Florida 957.5775
4 Florida 22.3680
5 California 48.8600
6 California 7.2800
7 California 907.1520
8 California 18.5040
9 California 114.9000
10 California 1706.1840
11 California 911.4240
12 North Carolina 15.5520
13 Washington 407.9760
14 Texas 68.8100
15 Texas 2.5440
16 Wisconsin 665.8800
17 Utah 55.5000
18 California 8.5600
19 California 213.4800
20 California 22.7200
21 Nebraska 19.4600
22 Nebraska 60.3400
23 Pennsylvania 71.3720
24 Utah 1044.6300
25 California 11.6480
26 California 90.5700
27 Pennsylvania 3083.4300
28 Pennsylvania 9.6180
29 Pennsylvania 124.2000
9964 Delaware 13.4000
9965 Delaware 4.9800
9966 Delaware 109.6900
9967 New Jersey 40.2000
9968 New Jersey 735.9800
9969 New Jersey 22.7500
9970 Georgia 119.5600
9971 Georgia 140.7500
9972 Texas 99.5680
9973 California 271.9600
9974 California 18.6900
9975 California 13.3600
9976 California 249.5840
9977 California 13.8600
9978 California 13.3760
9979 California 437.4720
9980 Louisiana 85.9800
9981 Ohio 16.5200
9982 Michigan 35.5600
9983 Michigan 97.9800
9984 New York 31.5000
9985 New York 55.6000
9986 California 36.2400
9987 Georgia 79.9900
9988 Georgia 206.1000
9989 Florida 25.2480
9990 California 91.9600
9991 California 258.5760
9992 California 29.6000
9993 California 243.1600

9994 rows × 2 columns

# Sales Aggregation

Since, the above data depict the sales for ech each order occured. So, for each state it will be having multiple entries as there could be multiple order made for each state over perioid of time.

So, the next step is to aggregate the sales value based on state. We will take in consideration total sales value for each state. This can be done by Pandas Data frame Data manipulation function “groupby” as defined below:

In [3]:
state_data = usefull_columns.groupby(["State"],as_index=False).sum()
state_data
Out[3]:
State Sales
0 Alabama 19510.6400
1 Arizona 35282.0010
2 Arkansas 11678.1300
3 California 457687.6315
4 Colorado 32108.1180
5 Connecticut 13384.3570
6 Delaware 27451.0690
7 District of Columbia 2865.0200
8 Florida 89473.7080
9 Georgia 49095.8400
10 Idaho 4382.4860
11 Illinois 80166.1010
12 Indiana 53555.3600
13 Iowa 4579.7600
14 Kansas 2914.3100
15 Kentucky 36591.7500
16 Louisiana 9217.0300
17 Maine 1270.5300
18 Maryland 23705.5230
19 Massachusetts 28634.4340
20 Michigan 76269.6140
21 Minnesota 29863.1500
22 Mississippi 10771.3400
23 Missouri 22205.1500
24 Montana 5589.3520
25 Nebraska 7464.9300
26 Nevada 16729.1020
27 New Hampshire 7292.5240
28 New Jersey 35764.3120
29 New Mexico 4783.5220
30 New York 310876.2710
31 North Carolina 55603.1640
32 North Dakota 919.9100
33 Ohio 78258.1360
34 Oklahoma 19683.3900
35 Oregon 17431.1500
36 Pennsylvania 116511.9140
37 Rhode Island 22627.9560
38 South Carolina 8481.7100
39 South Dakota 1315.5600
40 Tennessee 30661.8730
41 Texas 170188.0458
42 Utah 11220.0560
43 Vermont 8929.3700
44 Virginia 70636.7200
45 Washington 138641.2700
46 West Virginia 1209.8240
47 Wisconsin 32114.6100
48 Wyoming 1603.1360
# Visualizing Sales on Map

We will use python library “folium” to plot the desired details on Geographical map using Geo Spatial Data.

To install a new python library please refer to the below link :

https://instrovate.com/2019/06/04/image-processing-1-getting-started-with-opencv-for-python/

Below is the python code with output visualization on Map

In [5]:
state_sales_map = folium.Map(location=[37, -102], zoom_start=5)

 
# Set color for chloropleth:
state_sales_map.choropleth(
 geo_data=spaitial_data_file_name,
 name='choropleth',
 data=state_data,
 columns=['State', 'Sales'],
 key_on='properties.name',
 fill_color='PuBu',
 fill_opacity=0.7,
 line_opacity=0.2,
 legend_name='Sales'
)
folium.LayerControl().add_to(state_sales_map)
 
# Save the Map to a HTML File . 
state_sales_map.save('SampleSuperStoreSalesOnMap.html')
state_sales_map

# If your browser is working fine , you should be able to see the Geographical Map below . If there is some browser
issue and you are not able to see the Map , The other way to view the Map is save the file and open it through Chrome Browser .

Out[5]:

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top