Question

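I have managed to combine all the csv files in a directory, but I can't skip the first row (the header) of each file. The error I currently get is "'list' object is not an iterator". I have tried several approaches, including not using [open(thefile).read()], but I still can't get it to work. Here is my code: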

 import glob
 files = glob.glob( '*.csv' )
 output="combined.csv"

 with open(output, 'w' ) as result:
     for thefile in files:
         f = [open(thefile).read()]
         next(f)   ## this line is causing the error 'list' object is not an iterator

         for line in f:
             result.write( line )
 message = 'file created'
 print (message)

Answer 1

Use readlines() instead of read(), so that you can easily skip the first line.

f = open(thefile)
m = f.readlines()
for line in m[1:]:          # m[0] is the header, so start at index 1
    result.write(line)      # keep the trailing newline so rows stay on separate lines
f.close()

or

with open(thefile) as f:
    m = f.readlines()
    for line in m[1:]:
        result.write(line)  # keep the trailing newline so rows stay on separate lines

If you open the file with a with statement, you don't need to close the file object explicitly.

Answer 2

Here is an alternative that uses the often-forgotten fileinput.input() method:

import fileinput
from glob import glob

FILE_PATTERN = '*.csv'
output = 'combined.csv'

with open(output, 'w') as output:
    for line in fileinput.input(glob(FILE_PATTERN)):
        if not fileinput.isfirstline():
            output.write(line)

It is cleaner than many of the other solutions.

Note that the code in your question wasn't far off. You only need to change

f = [open(thefile).read()]

to

f = open(thefile)

but I'd suggest using with, since it closes the input file automatically:

with open(output, 'w' ) as result:
    for thefile in files:
        with open(thefile) as f:
            next(f)
            for line in f:
                result.write( line )

Answer 3

>>> a = [1, 2, 3]
>>> next(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'list' object is not an iterator

I'm not sure why you chose to wrap the read() call in brackets, but you should recognize what is happening in the example above.

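There is already a good answer. This is just an example of how you might look at the problem. Also, I suggest first getting what you want working with a single file. After that, import glob and use that mini-solution in the larger problem.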
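
To make the distinction concrete: a list is iterable but is not itself an iterator, so next() only works once iter() has been called on it, whereas a file object is its own iterator. A small sketch, where io.StringIO stands in for an open file (an assumption made only so the example runs without any files on disk):

```python
import io

a = [1, 2, 3]
it = iter(a)            # wrap the list in an iterator first
first = next(it)        # consumes the first element
rest = list(it)         # whatever is left after the first element

# A file-like object is already an iterator, so next() works on it directly
fake_file = io.StringIO("header\nrow1\nrow2\n")
next(fake_file)                  # discards the header line
remaining = list(fake_file)      # the data lines, newlines included
```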

Combining csv files in Python while skipping the header row: error