Question

I'm trying to plot data from a csv file using matplotlib (in Python 2.6), but I'm having trouble reading the data from the csv:

import pylab

# works fine - manually output data for debug
with open(datafile,'rb') as f:
    for row in f:
        print row

# fails - "no data" error
a = pylab.loadtxt(datafile, comments='#', delimiter=',', skiprows=1)

Reading the data manually works fine (the with open part). The pylab.loadtxt call raises an error:

raise IOError('End-of-file reached before encountering data.')
    IOError: End-of-file reached before encountering data.

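I originally suspected a problem with the line breaks in the data file (that is, everything sits on one line and gets skipped by skiprows=1), so I tried to rule that out by creating a test file by hand in Notepad. I see the same error. Here is the data in the test file: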

time,temperature
193,23.1
4040,23.2
4357,23.3
4423,23.4

I also tried removing the header row and leaving out the skiprows=1 part of the code. That fails too, but with a different error:

ValueError: invalid literal for float(): 23.1

At least this suggests that it "sees" the numeric data.

What am I doing wrong here?

Answer 1

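On Windows, the line separator is \r\n. On Unix, it is \n. Your data file follows neither convention (it appears to use a bare \r), which is why pylab (er, numpy) fails to parse it correctly.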
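To check which convention a file actually uses, one option is to count the separators in its raw bytes. The following is a minimal sketch, not part of the original answer; the helper name count_line_endings and the sample bytes are made up for illustration.

```python
def count_line_endings(data):
    """Count CRLF, bare CR and bare LF sequences in a bytes object."""
    crlf = data.count(b"\r\n")
    # Subtract the CRLF pairs so that only lone CR / LF characters remain.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "cr": cr, "lf": lf}


# Hypothetical sample: the question's test data saved with bare \r separators.
sample = b"time,temperature\r193,23.1\r4040,23.2\r4357,23.3\r4423,23.4\r"
counts = count_line_endings(sample)
print(counts)
```

For a real file, pass the result of open(datafile, 'rb').read() instead of the sample. A non-zero "cr" count would point to the bare \r problem described above.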

To fix the file:

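import os

outfile = datafile + '-fixed'
# Python 2.6 does not allow two context managers in one with statement, so nest them.
with open(datafile, 'rb') as f:
    with open(outfile, 'wb') as g:
        content = f.read()
        # Collapse existing \r\n first so it is not doubled, then emit \r\n everywhere.
        g.write(content.replace('\r\n', '\n').replace('\r', '\n').replace('\n', '\r\n'))
# On Windows os.rename fails if the target exists, so delete the original first.
os.remove(datafile)
os.rename(outfile, datafile)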

Answer 2

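As @unutbu said, the problem is most likely the line ending: the file uses \r where Windows expects \r\n.

If you don't want to create a new file, you can use StringIO: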

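from StringIO import StringIO
import pylab

# StringIO is imported as a class here, so call it directly.
output = StringIO()
with open(datafile, 'rb') as f:
    # Convert bare \r line endings to \r\n in memory.
    output.write(f.read().replace('\r', '\r\n'))
output.seek(0)  # rewind so loadtxt reads from the start of the buffer

a = pylab.loadtxt(output, comments='#', delimiter=',', skiprows=1)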
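The in-memory approach above relies on the Python 2 StringIO module. As a rough sketch of the same idea using io.StringIO, which exists in both Python 2.6 and Python 3, the snippet below normalizes every line ending to \n. The helper name normalized_buffer and the sample bytes are hypothetical, and the code assumes the data is plain ASCII.

```python
import io


def normalized_buffer(raw):
    """Return a text buffer whose lines all end in LF, whatever the input used."""
    # Convert CRLF first so that its CR is not converted a second time.
    text = raw.decode("ascii").replace("\r\n", "\n").replace("\r", "\n")
    return io.StringIO(text)


# Hypothetical sample standing in for the raw bytes read from datafile.
raw = b"time,temperature\r193,23.1\r4040,23.2\r"
buf = normalized_buffer(raw)
lines = buf.read().splitlines()
buf.seek(0)  # rewind so a later reader starts at the first line
```

Since loadtxt accepts a file-like object, the rewound buffer should be usable in place of the file name, as in the answer above.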

Python EOF error when reading csv data with pylab (matplotlib)