# Add Matplotlib Percentage Ticks to a Histogram

Matplotlib provides an easy way to convert your y-axis to percentages. It's a one-liner:

``````import matplotlib.ticker as ticker
# xmax is the data value that should be displayed as 100%
ax.yaxis.set_major_formatter(ticker.PercentFormatter(xmax))``````

The catch is that you can't space the y-ticks the way you want. Normally you would do this by setting the ticks with `ax.set_yticks`. However, `PercentFormatter` only changes how the tick values are labeled; the ticks themselves stay in the axis's original units (here, bin counts). So if you want ticks like 1%, 2%, ..., (N-1)%, N%, you have to choose the tick range and increment in those original units so that, after Matplotlib converts the labels to percentages, they look the way we want.
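To see why the tick positions have to be chosen in absolute units, here is a minimal sketch. It assumes a made-up total of 200 samples and calls the formatter's `format_pct` method directly, so no figure is needed. The positions stay as counts; only the labels turn into percentages.

```python
import matplotlib.ticker as ticker

# Hypothetical total: 200 samples correspond to 100%.
formatter = ticker.PercentFormatter(xmax=200, decimals=0)

# Tick positions stay in absolute counts; only their labels become percentages.
tick_positions = [0, 50, 100, 150, 200]
labels = [formatter.format_pct(t, 200) for t in tick_positions]
print(labels)  # ['0%', '25%', '50%', '75%', '100%']
```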

Essentially, we have to trick Matplotlib. Start with a plain histogram.

``````import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker  # used by the snippets that follow

num_of_points = 10000
num_of_bins = 20
data = np.random.randn(num_of_points) # generate random numbers from a gaussian distribution
fig, ax = plt.subplots()
ax.hist(data, bins=num_of_bins, edgecolor='black')
ax.set_title("Histogram")
ax.set_xlabel("X axis")
ax.set_ylabel("Percentage")
plt.show()``````

Next, apply the percentage formatting with the one-liner.

``````num_of_points = 10000
num_of_bins = 20
data = np.random.randn(num_of_points) # generate random numbers from a gaussian distribution
fig, ax = plt.subplots()
ax.hist(data, bins=num_of_bins, edgecolor='black')
ax.set_title("Histogram")
ax.set_xlabel("X axis")
ax.set_ylabel("Percentage")
ax.yaxis.set_major_formatter(ticker.PercentFormatter(xmax=len(data)))
plt.show()``````

Now suppose we need percentage ticks at 1% granularity on the y-axis. For that, we first need to figure out the maximum bar height. Luckily, the `hist` function returns the y values (the bin counts) and the edges of the bins. Using the y values, we can calculate the maximum percentage, expressed as a fraction, that we will see:

``(max(y_vals) / len(data))``

Add one percentage point (0.01) so that the graph would not touch the top line.

``(max(y_vals) / len(data)) + 0.01``

Round to two decimal places.

``round((max(y_vals) / len(data)) + 0.01, 2)``

In context, capturing the values returned by `hist`:

``````y_vals, x_vals, e_ = ax.hist(data, bins=num_of_bins, edgecolor='black')
y_max = round((max(y_vals) / len(data)) + 0.01, 2)``````

Now we can work backwards to find the absolute `y_max` value, since we know the percentage.

``y_abs_max = y_max * len(data)``

We need ticks at 1% granularity, and 100% is equivalent to `len(data)`. So the tick interval in absolute terms should be `0.01 * len(data)`:

``tick_interval = 0.01 * len(data)``
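As a quick check of the arithmetic so far, here is a worked example with made-up numbers: 10000 samples whose tallest bin holds 1534 of them. Both values are assumptions for illustration, not output from the histogram above.

```python
# Hypothetical inputs: 10000 samples, tallest bin holds 1534 of them.
n_samples = 10000
tallest_bin = 1534

y_max = round(tallest_bin / n_samples + 0.01, 2)  # 0.1534 + 0.01, rounded to 0.16
y_abs_max = y_max * n_samples                     # about 1600 samples, i.e. 16%
tick_interval = 0.01 * n_samples                  # about 100 samples per 1% tick

print(y_max, y_abs_max, tick_interval)
```

So the ticks would run from 0 to about 1600 in steps of about 100, which the formatter then labels as 0% to 16%.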

After setting the ticks with `ax.set_yticks` (as in the full code below), set the y-limits to the first and last tick so that we see just the part we need.

``ax.set_ylim(ax.get_yticks()[0], ax.get_yticks()[-1])``

The whole code looks as follows.

``````import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

num_of_points = 10000
num_of_bins = 20
data = np.random.randn(num_of_points)           # generate random numbers from a gaussian distribution
fig, ax = plt.subplots()
y_vals, x_vals, e_ = ax.hist(data, bins=num_of_bins, edgecolor='black')
ax.set_title("Histogram")
ax.set_xlabel("X axis")
ax.set_ylabel("Percentage")
y_max = round((max(y_vals) / len(data)) + 0.01, 2)
tick_interval = 0.01 * len(data)
# np.arange excludes the stop value, so pad it by half an interval to keep the top tick at y_max
ax.set_yticks(ticks=np.arange(0.0, y_max * len(data) + tick_interval / 2, tick_interval))
ax.set_ylim(ax.get_yticks()[0], ax.get_yticks()[-1])
ax.yaxis.set_major_formatter(ticker.PercentFormatter(xmax=len(data)))
plt.show()``````
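If you need this in more than one plot, the steps can be wrapped in a small function. The sketch below is one possible way to do it: `set_percent_ticks` is a hypothetical helper name, not part of Matplotlib, and it counts the ticks explicitly so the top tick always sits at or above the tallest bar. It uses the headless `Agg` backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np


def set_percent_ticks(ax, counts, n_samples, step=0.01):
    """Hypothetical helper: y ticks every `step` (a fraction of n_samples)."""
    # Number of steps needed to clear the tallest bar, plus one for the 0% tick.
    n_ticks = int(round((max(counts) / n_samples + step) / step)) + 1
    ticks = np.arange(n_ticks) * step * n_samples  # absolute tick positions
    ax.set_yticks(ticks)
    ax.set_ylim(ticks[0], ticks[-1])
    ax.yaxis.set_major_formatter(ticker.PercentFormatter(xmax=n_samples))
    return ticks


rng = np.random.default_rng(0)
data = rng.standard_normal(10000)
fig, ax = plt.subplots()
counts, edges, _ = ax.hist(data, bins=20, edgecolor="black")
ticks = set_percent_ticks(ax, counts, len(data))
# Call fig.savefig(...) or switch to an interactive backend to view the result.
```

Returning the tick positions makes it easy to confirm that the last tick is at or above the tallest bar.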
