Plots With Python
A Q-Q plot, short for “quantile-quantile” plot, is often used to assess whether or not a set of data potentially came from some theoretical distribution. In most cases, this type of plot is used to determine whether or not a set of data follows a normal distribution.
This tutorial explains how to create a Q-Q plot for a set of data in Python.
Example: Q-Q Plot in Python
Suppose we have the following dataset of 100 values:
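One way to generate such a dataset is with the numpy.random.normal() function. This is a minimal sketch: the mean of 0, the standard deviation of 1, and the seed of 0 are assumptions chosen for reproducibility, not values stated in this tutorial.

```python
import numpy as np

# Assumed seed so the same 100 values are produced on every run
np.random.seed(0)

# Assumed parameters: mean 0, standard deviation 1, 100 values
data = np.random.normal(loc=0, scale=1, size=100)

# Look at the first five values
print(data[:5])
```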
To create a Q-Q plot for this dataset, we can use the qqplot() function from the statsmodels library:
In a Q-Q plot, the x-axis displays the theoretical quantiles. This means it doesn’t show your actual data, but instead it represents where your data would be if it were normally distributed.
The y-axis displays your actual data, sorted from smallest to largest (the sample quantiles). If the points fall along a roughly straight line at a 45-degree angle, then the data is likely normally distributed.
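To make the two axes concrete, the sketch below computes both sets of quantiles by hand. It assumes the plotting positions (i - 0.5) / n, which is one common convention and not necessarily the one statsmodels uses internally; the dataset and seed are also assumptions.

```python
import numpy as np
from scipy import stats

# Assumed dataset: 100 values drawn from a standard normal distribution
np.random.seed(0)
data = np.random.normal(loc=0, scale=1, size=100)

n = len(data)

# y-axis: the actual data, sorted from smallest to largest
sample_quantiles = np.sort(data)

# x-axis: where those values would fall under a standard normal distribution
# Assumed plotting positions: (i - 0.5) / n for i = 1, ..., n
probs = (np.arange(1, n + 1) - 0.5) / n
theoretical_quantiles = stats.norm.ppf(probs)

# Compare the first few pairs; similar values suggest normality
for t, s in zip(theoretical_quantiles[:3], sample_quantiles[:3]):
    print(f"theoretical: {t:.3f}, sample: {s:.3f}")
```

Plotting sample_quantiles against theoretical_quantiles as a scatter plot gives the same kind of picture that qqplot() produces.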
We can see in our Q-Q plot above that the data values tend to closely follow the 45-degree line, which means the data is likely normally distributed. This shouldn’t be surprising since we generated the 100 data values by using the numpy.random.normal() function.
Consider instead a dataset of 100 uniformly distributed values and the Q-Q plot for that dataset:
The data values clearly do not follow the red 45-degree line, which is an indication that they do not follow a normal distribution.
Notes on Q-Q Plots
Keep in mind the following notes about Q-Q plots:
- Although a Q-Q plot isn’t a formal statistical test, it offers an easy way to visually check whether or not a data set is normally distributed.
- Be careful not to confuse Q-Q plots with P-P plots, which are less commonly used and not as useful for analyzing data values that fall on the extreme tails of the distribution.