I have an input file containing keywords, and I need to filter a CSV file based on those keywords.
Here is my attempt at automating the task with Python:
    import csv
    with open('Input.txt', 'rb') as InputFile:
        with open('28JUL2017.csv', 'rb') as CM_File:
            read_Input=csv.reader(InputFile)
            for row1 in csv.reader(InputFile):
                #print row1
                read_CM=csv.reader(CM_File)
                next(read_CM, None)
                for row2 in csv.reader(CM_File):
                    #print row2
                    if row1[0] == row2[0] :
                        Output= row2[0]+","+row2[1]+","+row2[5]+","+row2[6]
                        print Output
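I only get the first matching row from the file being filtered. I have tried all sorts of things but cannot figure out where I am going wrong. Please point out my mistake.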
Answer 0 (score: 1)
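read_Input and read_CM are essentially iterators, and so are the file objects underneath them. Once you have iterated over them you are done: you cannot iterate a second time. After the inner loop finishes for the first keyword, CM_File is exhausted, so every later keyword is compared against nothing. If you insist on doing it your way, then each time you start a new inner loop you have to go back to the beginning of the file and "re-read" the CSV file. Here is a fix: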
    import csv
    with open('file1.csv', 'rb') as InputFile:
        with open('file2.csv', 'rb') as CM_File:
            read_Input = csv.reader(InputFile)
            for row1 in read_Input:
                CM_File.seek(0)  # rewind to the beginning of the file
                read_CM = csv.reader(CM_File)
                next(read_CM, None)  # skip the first line on every pass
                for row2 in read_CM:
                    if row1[0] == row2[0]:
                        Output = row2[0]+","+row2[1]+","+row2[5]+","+row2[6]
                        print Output
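To see why the rewind is needed, here is a minimal sketch of the exhaustion behaviour. It assumes Python 3 and uses an in-memory io.StringIO with made-up rows as a stand-in for the real file; none of the names or data come from the question.

```python
import csv
import io

# In-memory stand-in for a CSV file (hypothetical sample data).
data = io.StringIO("id,name\n1,alpha\n2,beta\n")

first_pass = list(csv.reader(data))   # consumes the whole stream
second_pass = list(csv.reader(data))  # a new reader, but nothing is left to read

data.seek(0)                          # rewind to the start of the stream
third_pass = list(csv.reader(data))   # the rows are available again

print(len(first_pass), len(second_pass), len(third_pass))  # prints: 3 0 3
```

Creating a new csv.reader does not reset anything, because the reader only wraps the file object and the file object remembers its position.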
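Rather than rewinding and re-reading the file on every pass, I suggest reading each file only once. Also, instead of using nested loops, build a list of keywords and simply check whether row2[0] is in that list: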
    import csv
    with open('file1.csv', 'rb') as InputFile:
        with open('file2.csv', 'rb') as CM_File:
            read_Input = csv.reader(InputFile)  # read file only once
            keywords = [rec[0] for rec in read_Input]
            read_CM = csv.reader(CM_File)  # read file only once
            next(read_CM, None)  # not sure why you do this? to skip first line?
            for row2 in read_CM:
                if row2[0] in keywords:
                    Output = row2[0]+","+row2[1]+","+row2[5]+","+row2[6]
                    print("Output: {}".format(Output))
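The same idea can be sketched for Python 3, with a set in place of the list so that each membership test does not scan every keyword. The function name filter_rows, its columns parameter, and the sample data are all hypothetical, chosen only for illustration; the column indices 0, 1, 5, 6 are the ones used in the question.

```python
import csv
import io

def filter_rows(keyword_file, data_file, columns=(0, 1, 5, 6)):
    """Yield the selected columns of each data row whose first field is a keyword."""
    # Blank lines come back as empty rows, so skip them.
    keywords = {row[0] for row in csv.reader(keyword_file) if row}
    reader = csv.reader(data_file)
    next(reader, None)  # skip the first line, assumed here to be a header
    for row in reader:
        if row and row[0] in keywords:
            yield [row[i] for i in columns]

# Made-up sample data standing in for the two files.
keyword_file = io.StringIO("ABC\nXYZ\n")
data_file = io.StringIO(
    "SYMBOL,SERIES,OPEN,HIGH,LOW,CLOSE,LAST\n"
    "ABC,EQ,10,12,9,11,11.5\n"
    "DEF,EQ,20,22,19,21,21.5\n"
    "XYZ,EQ,30,32,29,31,31.5\n"
)

result = list(filter_rows(keyword_file, data_file))
for row in result:
    print(",".join(row))
```

With real files in Python 3, open them in text mode with newline='' (for example open('file2.csv', newline='')) instead of 'rb'. The sketch also assumes every matching row has at least seven fields; a shorter row would raise IndexError.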