Plotting a histogram for a column of a CSV file in Python

Asked: 2015-01-07 17:35:04

Tags: python csv matplotlib

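I am trying to plot a histogram and a scatter plot for specific columns of a given CSV file. I am new to programming. I got this code from a friend, for whom it apparently works, but I get an error when I run it. The code is: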

import csv
import numpy as np
import matplotlib.pyplot as plt
f = open('Data for question 13.csv')
data = csv.reader(f)
Area = []; MajorAxisLength = []; MinorAxisLength = []; Perimeter = []
MinIntensity = []; MeanIntensity = []; MaxIntensity = []
header = [Area, MajorAxisLength, MinorAxisLength,Perimeter,MinIntensity,MeanIntensity,MaxIntensity]
for row in data:
    i = 1 
    for name in header:
        name.append(row[i])
        i = i + 1
plt.figure()
plt.hist(Area, bins=50) # error follows after this

Error:

Traceback (most recent call last):
  File "<pyshell#11>", line 1, in <module>
    plt.hist(Area, bins=50, alpha=0.5)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/matplotlib/pyplot.py", line 2827, in hist
    stacked=stacked, **kwargs)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/matplotlib/axes.py", line 8312, in hist
    xmin = min(xmin, xi.min())
  File "/Library/Python/2.7/site-packages/numpy-1.9.0-py2.7-macosx-10.9-intel.egg/numpy/core/_methods.py", line 29, in _amin
    return umr_minimum(a, axis, None, out, keepdims)
TypeError: cannot perform reduce with flexible type

I can't get rid of this error. The answer is probably simple, but as a beginner I don't know how to deal with it.

3 answers:

Answer 0 (score: 1)

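You get the error because the csv parser reads every field as a string, while hist expects numeric data. Convert each value explicitly, for example with float(row[i]), before appending it.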
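A minimal sketch of that conversion, written for Python 3. The in-memory sample and its two columns (ID, Area) are made up for illustration and stand in for the real file, whose layout is not shown in the question:

```python
import csv
import io

# Hypothetical stand-in for the CSV file: a header row, then an ID column
# and an Area column. The real file's layout is an assumption here.
sample = io.StringIO("ID,Area\n1,10.5\n2,7.25\n3,12.0\n")

reader = csv.reader(sample)
next(reader)  # skip the header row, which cannot be converted to float

area = []
for row in reader:
    # csv.reader yields every field as a string, so convert explicitly
    area.append(float(row[1]))

print(area)
```

Once the list holds floats rather than strings, it can be passed to plt.hist(area, bins=50) without triggering the "flexible type" error.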

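Answer 1 (score: 1)

Assuming you just want to plot some numeric data from the CSV file, and the data really is numeric (not text), you can use the same approach mentioned here: How to read csv into record array in numpy?

So your code could look like this: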

import numpy as np
import matplotlib.pyplot as plt

# delimiter=',' is needed for comma-separated files;
# skip_header=1 assumes the file starts with exactly one header line
data = np.genfromtxt('Data for question 13.csv', delimiter=',', skip_header=1)
plt.figure()
plt.hist(data[:, 1], bins=50)  # the OP's loop reads Area from row[1], i.e. column 1
plt.show()

Information about the genfromtxt function can be found here: http://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html
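To see what genfromtxt returns for comma-separated input, here is a small sketch that parses an in-memory sample instead of the real file. The column names and values are invented for illustration. The delimiter and skip_header arguments are the ones most likely to matter for a CSV file with a header line:

```python
import io
import numpy as np

# Hypothetical sample with one header line; the real file's columns may differ.
sample = io.StringIO(
    "ID,Area,Perimeter\n"
    "1,10.5,14.0\n"
    "2,7.25,11.5\n"
    "3,12.0,15.0\n"
)

# Split on commas and skip the header line so every field parses as a float
data = np.genfromtxt(sample, delimiter=",", skip_header=1)

area = data[:, 1]  # second column of the 2-D float array
print(data.shape, area)
```

Because the result is already a float array, any column slice such as data[:, 1] can be handed to plt.hist directly.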

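Answer 2 (score: 0)

I'm not one hundred percent sure, because I don't have your data file, but I think row[i] is a string (not an integer or a float). You also need to skip the first (header) row, which you can do with enumerate. So this should solve the problem: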

for n,row in enumerate(data):
    if n > 0:
        i = 1 
        for name in header:
            name.append(float(row[i]))
            i = i + 1
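As an alternative sketch, csv.DictReader consumes the header row itself and lets you pick columns by name, which avoids the manual index counter. The helper read_columns and the header names "Area" and "Perimeter" are assumptions for illustration. The real file's header may be spelled differently. The code is written for Python 3:

```python
import csv
import io

def read_columns(f, names):
    """Return {name: [float, ...]} for the requested header names.

    Hypothetical helper: assumes the first line of f is a header row
    that contains every name in `names`.
    """
    columns = {name: [] for name in names}
    for record in csv.DictReader(f):
        for name in names:
            # Fields arrive as strings; convert before storing
            columns[name].append(float(record[name]))
    return columns

# In-memory stand-in for the real file
sample = io.StringIO("ID,Area,Perimeter\n1,10.5,14.0\n2,7.25,11.5\n")
cols = read_columns(sample, ["Area", "Perimeter"])
print(cols)
```

With the real file, the same helper would be called on an open file object, after which cols["Area"] could go to plt.hist and a pair of columns to plt.scatter.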