How do I plot a histogram in Python where the bar height is a function of the bin width?

Asked: 2016-11-18 16:14:51

Tags: python matplotlib plot height histogram

I have this data:

[-152, -132, -132, -128, -122, -121, -120, -113, -112, -108, 
-107, -107, -106, -106, -106, -105, -101, -101, -99, -89, -87, 
-86, -83, -83, -80, -80, -79, -74, -74, -74, -71, -71, -69, 
-67, -67, -65, -62, -61, -60, -60, -59, -55, -54, -54, -52, 
-50, -49, -48, -48, -47, -44, -43, -38, -37, -35, -34, -34, 
-29, -27, -27, -26, -24, -24, -19, -19, -19, -19, -18, -16, 
-16, -16, -15, -14, -14, -12, -12, -12, -4, -1, 0, 0, 1, 2, 7, 
14, 14, 14, 14, 18, 18, 19, 24, 29, 29, 41, 45, 51, 72, 150, 155]

I want to plot it as a histogram using these bins:

[-160,-110,-90,-70,-40,-10,20,50,80,160]

I used this code:

import matplotlib.pyplot as plt
...
plt.hist(data, bins)
plt.show()

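The problem with this plot is that the bar heights do not take the bin widths into account: the frequency should be represented by the area of each bar, not its height (see this page). How can I plot this kind of histogram? Thanks in advance.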

2 Answers:

Answer 0: (score: 1)

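From the docstring:

normed : boolean, optional

If True, the first element of the return tuple will be the counts normalized to form a probability density, i.e., n/(len(x)*dbin), so the integral of the histogram will sum to 1. If stacked is also True, the sum of the histograms is normalized to 1.

Default is False.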

plt.hist(data, bins=bins, normed=True)
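In newer Matplotlib releases the `normed` argument was replaced by `density`, so the call above may fail there. The following is a minimal sketch of the same idea with `density=True`, using the data and bins from the question. The `Agg` backend and the area check at the end are assumptions added for illustration so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

data = [-152, -132, -132, -128, -122, -121, -120, -113, -112, -108,
        -107, -107, -106, -106, -106, -105, -101, -101, -99, -89, -87,
        -86, -83, -83, -80, -80, -79, -74, -74, -74, -71, -71, -69,
        -67, -67, -65, -62, -61, -60, -60, -59, -55, -54, -54, -52,
        -50, -49, -48, -48, -47, -44, -43, -38, -37, -35, -34, -34,
        -29, -27, -27, -26, -24, -24, -19, -19, -19, -19, -18, -16,
        -16, -16, -15, -14, -14, -12, -12, -12, -4, -1, 0, 0, 1, 2, 7,
        14, 14, 14, 14, 18, 18, 19, 24, 29, 29, 41, 45, 51, 72, 150, 155]
bins = [-160, -110, -90, -70, -40, -10, 20, 50, 80, 160]

fig, ax = plt.subplots()
# density=True scales each bar to count / (len(data) * bin_width),
# so the bar areas (not the heights) carry the relative frequencies.
heights, edges, _ = ax.hist(data, bins=bins, density=True, edgecolor="gray")
ax.set_ylabel("Density")

# Sanity check: the bar areas should add up to 1.
total_area = float(np.sum(heights * np.diff(edges)))
print(total_area)
```

With this normalization the y-axis shows a probability density, so a wide bin with many points can still be shorter than a narrow bin with fewer points.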


Answer 1: (score: 0)

Thanks to Nikos Tavoularis for this post.

My solution code:

import requests
from bs4 import BeautifulSoup
import re
import matplotlib.pyplot as plt
import numpy as np

regex = r"((-?\d+(\s?,\s?)?)+)\n"
page = requests.get('http://www.stat.berkeley.edu/~stark/SticiGui/Text/histograms.htm')
soup = BeautifulSoup(page.text, 'lxml')
# The data is stored inside the page's script elements, not inside an HTML TABLE tag
scripts = soup.find_all('script')
target = scripts[23].string
hits = re.findall(regex, target, flags=re.MULTILINE)
data = []
if hits:
    for val, _, _ in hits:
        data.extend([int(x) for x in re.findall(r"-?\d+", val)])
print(sorted(data))
print('Length of data:', len(data), "\n")

# Intervals
bins = np.array([-160, -110, -90, -70, -40, -10, 20, 50, 80, 160])

# calculating histogram
widths = bins[1:] - bins[:-1]
freqs = np.histogram(data, bins)[0]
heights = freqs / widths
mainlabel = 'The deviations of the 100 measurements from a ' \
                'base value of {}, times {}'.format(r'$9.792838\ ^m/s^2$', r'$10^8$')
hlabel = 'Data gravity'

# plot with various axes scales
plt.close('all')
fig = plt.figure()
plt.suptitle(mainlabel, fontsize=16)
# My screen resolution is 1920x1080.
# Note: wm_geometry is specific to the Tk backend (e.g. TkAgg).
plt.get_current_fig_manager().window.wm_geometry("900x1100+1050+0")

# Bar chart
ax1 = plt.subplot(211)  # 2-rows, 1-column, position-1
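# align='edge' puts each bar's left side on its bin edge (the default is 'center' since Matplotlib 2.0)
barlist = plt.bar(bins[:-1], heights, width=widths, align='edge', facecolor='yellow', alpha=0.7, edgecolor='gray')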
plt.title('Bar chart')
plt.xlabel(hlabel, labelpad=30)
plt.ylabel('Heights')
plt.xticks(bins, fontsize=10)
# Change the colors of bars at the edges...
twentyfifth, seventyfifth = np.percentile(data, [25, 75])
for patch, rightside, leftside in zip(barlist, bins[1:], bins[:-1]):
    if rightside < twentyfifth:
        patch.set_facecolor('green')
    elif leftside > seventyfifth:
        patch.set_facecolor('red')
# code from: https://stackoverflow.com/questions/6352740/matplotlib-label-each-bin
# Label the raw counts and the percentages below the x-axis...
bin_centers = 0.5 * np.diff(bins) + bins[:-1]
for count, x in zip(freqs, bin_centers):
    # Label the raw counts
    ax1.annotate(str(count), xy=(x, 0), xycoords=('data', 'axes fraction'),
                    xytext=(0, -18), textcoords='offset points', va='top', ha='center', fontsize=9)

    # Label the percentages
    percent = '%0.0f%%' % (100 * float(count) / freqs.sum())
    ax1.annotate(percent, xy=(x, 0), xycoords=('data', 'axes fraction'),
                    xytext=(0, -28), textcoords='offset points', va='top', ha='center', fontsize=9)
plt.grid(True)

# Histogram Plot
ax2 = plt.subplot(223)  # 2-rows, 2-column, position-3
plt.hist(data, bins, alpha=0.5)
plt.title('Histogram')
plt.xlabel(hlabel)
plt.ylabel('Frequency')
plt.grid(True)

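# Normalized histogram: bar area is proportional to frequency
# (older Matplotlib versions used normed=True instead of density=True)
ax3 = plt.subplot(224)  # 2-rows, 2-column, position-4
plt.hist(data, bins, alpha=0.5, density=True, facecolor='g')
plt.title('Histogram (density)')
plt.xlabel(hlabel)
plt.ylabel('Density')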
plt.grid(True)

plt.tight_layout(pad=1.5, w_pad=0, h_pad=0)
plt.show()
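
The core of the script above is the step `heights = freqs / widths`. The following is a reduced sketch of just that step, without the web scraping, using the data and bins from the question. The `Agg` backend, the axis label, and the final area check are assumptions added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

data = [-152, -132, -132, -128, -122, -121, -120, -113, -112, -108,
        -107, -107, -106, -106, -106, -105, -101, -101, -99, -89, -87,
        -86, -83, -83, -80, -80, -79, -74, -74, -74, -71, -71, -69,
        -67, -67, -65, -62, -61, -60, -60, -59, -55, -54, -54, -52,
        -50, -49, -48, -48, -47, -44, -43, -38, -37, -35, -34, -34,
        -29, -27, -27, -26, -24, -24, -19, -19, -19, -19, -18, -16,
        -16, -16, -15, -14, -14, -12, -12, -12, -4, -1, 0, 0, 1, 2, 7,
        14, 14, 14, 14, 18, 18, 19, 24, 29, 29, 41, 45, 51, 72, 150, 155]
bins = np.array([-160, -110, -90, -70, -40, -10, 20, 50, 80, 160])

widths = np.diff(bins)               # width of each bin
freqs, _ = np.histogram(data, bins)  # raw count per bin
heights = freqs / widths             # count per unit of bin width

fig, ax = plt.subplots()
# align="edge" anchors each bar at its left bin edge.
bars = ax.bar(bins[:-1], heights, width=widths, align="edge",
              facecolor="yellow", edgecolor="gray", alpha=0.7)
ax.set_xticks(bins)
ax.set_ylabel("Count per unit width")

# Sanity check: each bar's area equals the raw count of its bin.
areas = heights * widths
print(areas)
```

Unlike the density version, the bar areas here add up to the number of observations rather than to 1, which matches the "Heights" panel in the full script.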
