How to make a histogram from a list of data

Asked: 2014-07-17 17:19:27

Tags: python matplotlib error-handling

Well, I think matplotlib is installed now, but my new script gives me this error:

/usr/lib64/python2.6/site-packages/matplotlib/backends/backend_gtk.py:621:     DeprecationWarning: Use the new widget gtk.Tooltip
  self.tooltips = gtk.Tooltips()
Traceback (most recent call last):
  File "vector_final", line 42, in <module>
plt.hist(data, num_bins)
  File "/usr/lib64/python2.6/site-packages/matplotlib/pyplot.py", line 2008, in hist
ret = ax.hist(x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, **kwargs)
  File "/usr/lib64/python2.6/site-packages/matplotlib/axes.py", line 7098, in hist
w = [None]*len(x)
TypeError: len() of unsized object

My code is:

#!/usr/bin/python

l=[]
with open("testdata") as f:
    line = f.next()
    f.next()# skip headers
    nat = int(line.split()[0])
    print nat

    for line in f:
        if line.strip():
          if line.strip():
            l.append(map(float,line.split()[1:]))  


    b = 0
    a = 1
for b in range(53):
    for a in range(b+1,54):
        import operator
        import matplotlib.pyplot as plt
        import numpy as np

        vector1 = (l[b][0],l[b][1],l[b][2])
        vector2 = (l[a][0],l[a][1],l[a][2])

        x = vector1
        y = vector2
        vector3 = list(np.array(x) - np.array(y))
        dotProduct = reduce( operator.add, map( operator.mul, vector3, vector3))


        dp = dotProduct**.5
        print dp

        data = dp
        num_bins = 200 # <- number of bins for the histogram
        plt.hist(data, num_bins)
        plt.show()

But the code that gives me the error is the new addition I made, which is the last part, reproduced below:

                data = dp
                num_bins = 200 # <- number of bins for the histogram
                plt.hist(data, num_bins)
                plt.show()

2 Answers:

Answer 0 (score: 31)

  

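"Do you have any idea how to make 200 evenly spaced bins, and have your program store the data in the appropriate bins?"

You can, for example, use NumPy's arange for a fixed bin size (or Python's standard range object), and NumPy's linspace for evenly spaced bins. Here are two simple examples from my matplotlib gallery.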

Fixed bin size

import numpy as np
from matplotlib import pyplot as plt

data = np.random.normal(0, 20, 1000) 

# fixed bin size
bins = np.arange(-100, 100, 5) # fixed bin size

plt.xlim([min(data)-5, max(data)+5])

plt.hist(data, bins=bins, alpha=0.5)
plt.title('Random Gaussian data (fixed bin size)')
plt.xlabel('variable X (bin size = 5)')
plt.ylabel('count')

plt.show()


Fixed number of bins

import numpy as np
import math
from matplotlib import pyplot as plt

data = np.random.normal(0, 20, 1000) 

bins = np.linspace(math.ceil(min(data)), 
                   math.floor(max(data)),
                   21) # fixed number of bins: 21 edges give 20 bins

plt.xlim([min(data)-5, max(data)+5])

plt.hist(data, bins=bins, alpha=0.5)
plt.title('Random Gaussian data (fixed number of bins)')
plt.xlabel('variable X (20 evenly spaced bins)')
plt.ylabel('count')

plt.show()


Answer 1 (score: 1)

"How to make 200 evenly spaced bins, and have your program store the data in the appropriate bins?"

The accepted answer creates the bins manually with numpy.arange and numpy.linspace, but there are also functions that do the binning automatically:

  1. numpy.histogram returns edges that work directly with pyplot.stairs (new in matplotlib 3.4.0)

    values, edges = np.histogram(data, bins=200)
    plt.stairs(values, edges, fill=True)
    
  2. pandas.cut returns bins that work directly with pyplot.hist

    _, bins = pd.cut(data, bins=200, retbins=True)
    plt.hist(data, bins)
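
The "store the data in the appropriate bins" part of the question can be sketched with NumPy alone. This is a minimal illustration, not part of the original answer: the seeded random data, and the names bin_index and binned, are assumptions made for the example. It uses numpy.histogram for the counts and edges, then numpy.digitize on the interior edges to find which bin each sample falls into.

```python
import numpy as np

rng = np.random.default_rng(0)        # seeded only so the sketch is repeatable
data = rng.normal(0, 20, 1000)        # stand-in data, like the accepted answer's

num_bins = 200
values, edges = np.histogram(data, bins=num_bins)   # 200 counts, 201 edges

# Pass only the interior edges so the indices run from 0 to num_bins - 1
# and the maximum value lands in the last bin, as it does in np.histogram.
bin_index = np.digitize(data, edges[1:-1])

# Group the samples by bin: binned[i] holds the values counted in values[i].
binned = [data[bin_index == i] for i in range(num_bins)]
```

Here values[i] is the number of samples between edges[i] and edges[i + 1], and binned[i] holds those samples themselves.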
    



If you don't need to store the bins, skip the binning step and just plot the histogram with bins given as an integer:

  1. pyplot.hist

    plt.hist(data, bins=200)
    
  2. seaborn.histplot

    sns.histplot(data, bins=200)
    
  3. pandas.DataFrame[.plot].hist or pandas.Series[.plot].hist

    pd.Series(data).plot.hist(bins=200)
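
All of these calls expect data to be a sequence of values. In the question's script, data = dp is a single float and plt.hist is called inside the loop, which is consistent with the len(x) failure shown in the traceback. A minimal sketch of one way around this, written for Python 3 and using random coordinates as a hypothetical stand-in for the contents of "testdata": collect every pairwise distance first, then draw one histogram. The helper names pairwise_distances and plot_histogram are made up for this example.

```python
from itertools import combinations

import numpy as np


def pairwise_distances(points):
    """Return the Euclidean distance for every unordered pair of points."""
    pts = np.asarray(points, dtype=float)
    return [float(np.linalg.norm(pts[i] - pts[j]))
            for i, j in combinations(range(len(pts)), 2)]


def plot_histogram(distances, num_bins=200):
    """Draw a single histogram of all distances (requires matplotlib)."""
    import matplotlib.pyplot as plt
    plt.hist(distances, bins=num_bins)
    plt.xlabel('distance')
    plt.ylabel('count')
    plt.show()


# Hypothetical stand-in for the 54 coordinate rows read from "testdata".
rng = np.random.default_rng(0)
coords = rng.normal(0, 1, size=(54, 3))

# One list holding every distance, instead of one float per loop iteration.
distances = pairwise_distances(coords)
```

Calling plot_histogram(distances) afterwards opens one figure for the whole data set, rather than one window per pair of points.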