In this blog post, I will explain Matplotlib, a popular visualization library in Python.
matplotlib.pyplot is a plotting module used for 2D graphics in Python. It can be used in Python scripts, web application servers and graphical user interface toolkits.
Matplotlib is the “grandfather” library of data visualization with Python. It was created by John Hunter to replicate the plotting capabilities of MATLAB (another programming language) in Python. So if you happen to be familiar with MATLAB, Matplotlib will feel natural to you.
Types of Matplotlib Plots
- Line Chart
- Bar Graph
- Scatter Plot
- Area Plot
- Pie Plot
Let me demonstrate each of them with an example program. Before we proceed, let us first see how to import the Matplotlib library:
from matplotlib import pyplot as plt
Let us also import the NumPy library to generate some data:
import numpy as np
# linearly spaced values from 0 to 10, 21 points in total
x_axis = np.linspace(0, 10, 21)
y_axis = x_axis ** 2
array([ 0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. , 6.5, 7. , 7.5, 8. , 8.5, 9. , 9.5, 10. ])
array([ 0. , 0.25, 1. , 2.25, 4. , 6.25, 9. , 12.25, 16. , 20.25, 25. , 30.25, 36. , 42.25, 49. , 56.25, 64. , 72.25, 81. , 90.25, 100. ])
Adding Title and Labels
Now let us see how to add a title and axis labels to a graph created with Matplotlib.
# Adding Title and Labels
plt.plot(x_axis, y_axis)
plt.title("Title")
plt.xlabel("X-Axis")
plt.ylabel("Y-Axis")
# display the figure (needed when running as a script)
plt.show()
We can also style the graph by changing the width or color of a line and by adding grid lines. Let me show you with an example program.
# lw sets the line width, color sets the line color
plt.plot(x_axis, y_axis, lw=5, color='red')
plt.title("Title")
plt.xlabel("X-Axis")
plt.ylabel("Y-Axis")
# draw blue grid lines behind the curve
plt.grid(True, color='blue')
plt.show()
First, let us understand why we need a bar graph.
- A bar graph uses bars to compare data among different categories.
- It is well suited when you want to measure the changes over a period of time.
- The longer the bar, the greater the value.
- It can be represented horizontally or vertically.
plt.bar(x_axis, y_axis, lw=5, color='red')
plt.title("Title")
plt.xlabel("X-Axis")
plt.ylabel("Y-Axis")
plt.grid(True, color='blue')
plt.show()
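As noted in the list above, a bar graph can also be drawn horizontally. The following is a minimal sketch using plt.barh. The category names and values are made up for illustration and do not come from the earlier example.

```python
import matplotlib.pyplot as plt

# Hypothetical categories and values, chosen only for illustration
categories = ["A", "B", "C", "D"]
values = [3, 7, 5, 2]

# barh draws one horizontal bar per category; bar length encodes the value
bars = plt.barh(categories, values, color='red')
plt.title("Horizontal Bar Graph")
plt.xlabel("Value")
plt.ylabel("Category")
plt.show()
```

Note that the axis labels swap roles compared with plt.bar: the categories sit on the y-axis and the values run along the x-axis.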
What is the difference between a bar chart and a histogram?
|Bar charts are used to compare different entities|Histograms are used to show a distribution|
Histograms are useful when you have arrays or a very long list.
Let's take a use case:
Let's say we want to plot the ages of a population grouped into bins. A bin is a range of values: the full range of the data is divided into a series of consecutive intervals.
In the example below, I have created bins with a width of 10, so the first bin covers ages 0-9, the next covers 10-19, and so on. The last bin, 90-100, also includes its right edge, so an age of 100 is counted there.
population_age = [13,17,22,55,92,45,21,22,34,45,32,4,2,100,95,95,55,70,65,55,80,75,65,54,34,43,42,48]
bins = [0,10,20,30,40,50,60,70,80,90,100]
plt.hist(population_age, bins, histtype='bar', rwidth=0.8)
plt.xlabel('Age Groups')
plt.ylabel('Number of people')
plt.title('Histogram')
plt.show()
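To see what the bins actually contain, the counts can be computed without drawing anything. This is a small sketch using np.histogram, which plt.hist relies on to bin the data. The variable names counts and edges are my own.

```python
import numpy as np

population_age = [13,17,22,55,92,45,21,22,34,45,32,4,2,100,95,95,55,70,65,55,80,75,65,54,34,43,42,48]
bins = [0,10,20,30,40,50,60,70,80,90,100]

# counts[i] is the number of ages falling between edges[i] and edges[i+1]
counts, edges = np.histogram(population_age, bins=bins)

# Print one line per bin: left edge, right edge, number of people
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:>3} to {right:<3}: {count}")
```

Every bin includes its left edge and excludes its right edge, except the last one, which includes both.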
A scatter plot is a two-dimensional data visualization that uses dots to represent the values obtained for two different variables – one plotted along the x-axis and the other plotted along the y-axis.
For example, this scatter plot shows the height and weight of a fictitious set of children.
height = [56,60,65,70,75]
weight = [65,70,75,78,90]
plt.scatter(height, weight)
plt.title('Height Vs Weight')
plt.xlabel("height(in)")
plt.ylabel("weight(lb)")
# display the figure (needed when running as a script)
plt.show()
Area plots are also known as stack plots. They can be used to track changes over time for two or more related groups.
For example, let’s compile the work done during a day into categories, say sleeping, eating, working and playing. Consider the below code:
days = [1,2,3,4,5]
sleeping = [7,8,6,11,7]
eating = [2,3,4,3,2]
working = [7,8,7,2,2]
playing = [8,5,7,8,13]
# empty plots draw nothing; they only create the legend entries
plt.plot([], [], color='m', label='Sleeping', linewidth=5)
plt.plot([], [], color='c', label='Eating', linewidth=5)
plt.plot([], [], color='r', label='Working', linewidth=5)
plt.plot([], [], color='k', label='Playing', linewidth=5)
# stack the four activities on top of each other for each day
plt.stackplot(days, sleeping, eating, working, playing, colors=['m','c','r','k'])
plt.xlabel('x')
plt.ylabel('y')
plt.title('Stack Plot')
plt.legend()
plt.show()
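As an alternative to creating legend entries separately, plt.stackplot accepts a labels argument. The sketch below assumes a reasonably recent Matplotlib version and reuses the data from the example above. The name layers and the legend position are my own choices.

```python
import matplotlib.pyplot as plt

days = [1, 2, 3, 4, 5]
sleeping = [7, 8, 6, 11, 7]
eating = [2, 3, 4, 3, 2]
working = [7, 8, 7, 2, 2]
playing = [8, 5, 7, 8, 13]

# labels= attaches one legend entry to each stacked layer
layers = plt.stackplot(
    days, sleeping, eating, working, playing,
    labels=['Sleeping', 'Eating', 'Working', 'Playing'],
    colors=['m', 'c', 'r', 'k'],
)
plt.xlabel('x')
plt.ylabel('y')
plt.title('Stack Plot')
plt.legend(loc='upper left')
plt.show()
```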
A pie chart is a circular graph divided into segments, i.e. slices of a pie. It is used to show percentage or proportional data, where each slice represents a category. Let’s have a look at the example below:
import matplotlib.pyplot as plt
days = [1,2,3,4,5]
sleeping = [7,8,6,11,7]
eating = [2,3,4,3,2]
working = [7,8,7,2,2]
playing = [8,5,7,8,13]
# the slices are the values for day 5: sleeping, eating, working, playing
slices = [7,2,2,13]
activities = ['sleeping','eating','working','playing']
cols = ['c','m','r','b']
# explode pulls the second slice (eating) out; autopct prints each share as a percentage
plt.pie(slices, labels=activities, colors=cols, startangle=90, shadow=True, explode=(0,0.1,0,0), autopct='%1.1f%%')
plt.title('Pie Plot')
plt.show()
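The percentages printed on the slices are simply each value divided by the total. Here is a quick sketch of that arithmetic in plain Python. The names total and percentages are my own, and the results should match the pie labels up to rounding.

```python
slices = [7, 2, 2, 13]
activities = ['sleeping', 'eating', 'working', 'playing']

# Each slice's share of the whole, as a percentage with one decimal place
total = sum(slices)
percentages = [round(100 * s / total, 1) for s in slices]

for activity, pct in zip(activities, percentages):
    print(f"{activity}: {pct}%")
```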
Pros of Matplotlib
- Generally easy to get started for simple plots
- Support for custom labels and texts
- Great control of every element in a figure
- High-quality output in many formats
- Very customizable in general
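To illustrate the point about output formats, here is a minimal sketch that saves the same figure as PNG, PDF and SVG with plt.savefig. The file name line_chart, the temporary output directory and the dpi value are assumptions made for this example. The Agg backend is selected so the script also runs without a display, for instance on a server.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to files only
import matplotlib.pyplot as plt
import numpy as np

x_axis = np.linspace(0, 10, 21)
y_axis = x_axis ** 2

plt.plot(x_axis, y_axis, lw=2, color='red')
plt.title("Title")
plt.xlabel("X-Axis")
plt.ylabel("Y-Axis")

# Save the current figure once per format; the extension selects the format
out_dir = tempfile.mkdtemp()
saved = []
for ext in ("png", "pdf", "svg"):
    path = os.path.join(out_dir, f"line_chart.{ext}")
    plt.savefig(path, dpi=150)
    saved.append(path)

print(saved)
```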
Official Matplotlib webpage : http://matplotlib.org/