Python histogram of split() data

Time: 2017-05-13 13:48:36

Tags: python matplotlib split histogram

I am trying to plot a histogram from a text file containing floating-point numbers:

import matplotlib.pyplot as plt

c1_file = open('densEst1.txt','r')
c1_data =  c1_file.read().split()    
c1_sum = float(c1_data.__len__())

plt.hist(c1_data)
plt.show()

The output of c1_data.__len__() looks fine, but hist() raises:

C:\Python27\python.exe "C:/x.py"
Traceback (most recent call last):
  File "C:/x.py", line 7, in <module>
    plt.hist(c1_data)
  File "C:\Python27\lib\site-packages\matplotlib\pyplot.py", line 2958, in hist
    stacked=stacked, data=data, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\__init__.py", line 1812, in inner
    return func(ax, *args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\axes\_axes.py", line 5995, in hist
    if len(xi) > 0:
TypeError: len() of unsized object

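2 answers:

Answer 0: (score: 2)

The main reason the plt.hist call fails is that the argument c1_data is a list of strings. When you open a file and read it, the result is a string containing the file's contents: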

  

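To read a file's contents, call f.read(size), which reads some quantity of data and returns it as a string (in text mode) or a bytes object (in binary mode).

Emphasis mine.

When you then split this long string, you get a list of strings: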

  

Return a list of the words in the string, using sep as the delimiter string.
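To see this concretely, here is a minimal sketch that uses io.StringIO as a stand-in for the opened file. The name fake_file and the sample numbers are made up for illustration and do not come from the question's densEst1.txt:

```python
import io

# A stand-in for the opened file; the numbers are made-up sample values.
fake_file = io.StringIO("0.5 1.25\n2.0\n")

# read() returns one long string; split() cuts it at any whitespace.
tokens = fake_file.read().split()

print(tokens)  # ['0.5', '1.25', '2.0']
print(type(tokens[0]).__name__)  # str
```

Every element is still a str, even though it looks like a number when printed.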

However, a list of strings is not valid input for plt.hist:

>>> import matplotlib.pyplot as plt
>>> plt.hist(['1', '2'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
      1 import matplotlib.pyplot as plt
----> 2 plt.hist(['1', '2'])

C:\...\lib\site-packages\matplotlib\pyplot.py in hist(x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, hold, data, **kwargs)
   3079                       histtype=histtype, align=align, orientation=orientation,
   3080                       rwidth=rwidth, log=log, color=color, label=label,
-> 3081                       stacked=stacked, data=data, **kwargs)
   3082     finally:
   3083         ax._hold = washold

C:\...\lib\site-packages\matplotlib\__init__.py in inner(ax, *args, **kwargs)
   1895                     warnings.warn(msg % (label_namer, func.__name__),
   1896                                   RuntimeWarning, stacklevel=2)
-> 1897             return func(ax, *args, **kwargs)
   1898         pre_doc = inner.__doc__
   1899         if pre_doc is None:

C:\...\lib\site-packages\matplotlib\axes\_axes.py in hist(***failed resolving arguments***)
   6178             xmax = -np.inf
   6179             for xi in x:
-> 6180                 if len(xi) > 0:
   6181                     xmin = min(xmin, xi.min())
   6182                     xmax = max(xmax, xi.max())

TypeError: len() of unsized object

Solution:

You only need to convert it to a float array:

>>> import numpy as np
>>> plt.hist(np.array(c1_data, dtype=float))
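The same conversion can also be sketched without NumPy, assuming the file holds nothing but whitespace-separated numbers. The helper names load_floats and plot_histogram and the sample values are hypothetical, chosen only for illustration. The plotting helper is defined but not called, so the snippet runs without opening a window:

```python
import os
import tempfile


def load_floats(path):
    """Read a whitespace-separated text file and return its values as floats."""
    with open(path, "r") as handle:
        return [float(token) for token in handle.read().split()]


def plot_histogram(values):
    """Draw a histogram of numeric values (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily; load_floats does not need it

    plt.hist(values)
    plt.show()


# Demonstrate on a throwaway file with made-up sample values.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("0.5 1.25\n2.0\n")
sample_path = tmp.name

values = load_floats(sample_path)
os.remove(sample_path)
print(values)  # [0.5, 1.25, 2.0]
```

With the file from the question, plot_histogram(load_floats('densEst1.txt')) would then draw the figure. The with block also closes the file, which the original snippet never does. Note that float() raises ValueError on the first token that is not a number.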

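Answer 1: (score: 1)

Here is an example using numpy. It is simple, and the results are shown below.

pandas would also work: splitting and data-type conversion can be done while reading (even per column), and the data can also be read as a vector (depending on the data size).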

# Written for a Jupyter notebook; drop the %matplotlib line in a plain script.
%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt

# It would be better to read the file with numpy, since the data are floats:
# a = np.fromfile(open('from_file', 'r'), sep='\n')

from_file = np.array([1, 2, 2.5])  # sample data
c1_data = from_file.astype(float)  # convert the data to float

plt.hist(c1_data)  # plt.hist passes its arguments to np.histogram
plt.title("Histogram without 'auto' bins")
plt.show()

[Figure: histogram without 'auto' bins]

plt.hist(c1_data, bins='auto')  # plt.hist passes its arguments to np.histogram
plt.title("Histogram with 'auto' bins")
plt.show()

[Figure: histogram with 'auto' bins]
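The commented-out np.fromfile line in this answer hints at reading the floats directly, with no string step in between. A minimal sketch of that idea follows, assuming a text file of whitespace-separated numbers. The temporary file and its values are made up for the demonstration, and sep=" " is used here instead of the answer's '\n':

```python
import os
import tempfile

import numpy as np

# Write a few made-up sample values to a throwaway file for the demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1 2\n2.5\n")
sample_path = tmp.name

# In text mode, a space in `sep` matches any run of whitespace,
# so values spread over several lines are read as well.
c1_data = np.fromfile(sample_path, dtype=float, sep=" ")
os.remove(sample_path)

print(c1_data.dtype)
print(c1_data.tolist())
```

The resulting array already has a float dtype, so it can be passed straight to plt.hist.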