I have two CSV files in the following formats.
The first is outputTweetsDate.csv:
Here is some text;13.09.13 16:45
Here is more text;13.09.13 16:45
And yet another text;13.09.13 16:46
The second file is apiSheet.csv:
13.09.13 16:46;89.56
13.09.13 16:45;90.40
I want to compare the two files and, wherever the two date-time values match, write the value and the text to a new file (finalOutput.csv):
|89.56|,|Here is some text|
|89.56|,|Here is more text|
|90.49|,|And yet another text|
This is my code so far:
with open("apiSheet.csv", "U") as in_file1, open("outputTweetsDate.csv", "rb") as in_file2, open("finalOutput.csv", "wb") as out_file:
    reader1 = csv.reader(in_file1, delimiter=';')
    reader2 = csv.reader(in_file2, delimiter='|')
    writer = csv.writer(out_file, delimiter='|')
    for row1 in reader1:
        for row2 in reader2:
            if row1[0] == row2[1]:
                data = [row1[1], row2[0]]
                print data
                writer.writerow(data)
I edited my code and it runs now, but it does not iterate over all of my data correctly. My output currently looks like this:
|89.56|,|Here is some text|
|89.56|,|Here is more text|
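So it does not show the third row, even though the timestamps are the same. It looks like the loop is not iterating through the files properly.
Thanks!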
Answer 0 (score: 0)
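Your inner loop reaches the end of file2 (outputTweetsDate.csv) before the second row of file1 is read. A csv.reader can only be consumed once, so for every later row of file1 the inner loop has nothing left to iterate over.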
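The exhaustion can be reproduced in a few lines. This is only a sketch in Python 3 syntax: the `lines` list is a hypothetical in-memory stand-in for outputTweetsDate.csv, and the `;` delimiter is assumed from the sample rows in the question.

```python
import csv

# Hypothetical in-memory stand-in for outputTweetsDate.csv.
lines = [
    "Here is some text;13.09.13 16:45",
    "And yet another text;13.09.13 16:46",
]

reader = csv.reader(lines, delimiter=";")
first_pass = list(reader)   # consumes every row
second_pass = list(reader)  # the reader is already exhausted

print(len(first_pass))
print(len(second_pass))
```

The second pass is empty, which is the same situation the inner `for row2 in reader2` loop is in from the second row of file1 onward.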
Try this snippet:
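import csv

with open("apiSheet.csv", "U") as in_file1, open("outputTweetsDate.csv", "rb") as in_file2, open("finalOutput.csv", "wb") as out_file:
    reader1 = csv.reader(in_file1, delimiter=';')
    reader2 = csv.reader(in_file2, delimiter='|')
    writer = csv.writer(out_file, delimiter='|')
    # next(reader2, None) returns None at the end of file2 instead of raising StopIteration.
    row2 = next(reader2, None)
    for row1 in reader1:
        # file2 is only ever advanced, so this relies on both files
        # listing their timestamps in the same ascending order.
        while row2 and row1[0] <= row2[1]:
            if row1[0] == row2[1]:
                data = [row1[1], row2[0]]
                print data
                writer.writerow(data)
            row2 = next(reader2, None)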
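Edit: Handling the reversed order is quite tricky. Let's stop trying to be clever and use brute force instead. It will work fine, because the files are far smaller than your RAM.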
import csv

with open("apiSheet.csv", "U") as in_file1, open("outputTweetsDate.csv", "rb") as in_file2, open("finalOutput.csv", "wb") as out_file:
    reader1 = csv.reader(in_file1, delimiter=';')
    reader2 = csv.reader(in_file2, delimiter='|')
    writer = csv.writer(out_file, delimiter='|')
    rows2 = list(reader2)  # all the content of file2 goes in RAM
    for row1 in reader1:
        # rows2 is a list, so it can be scanned again for every row of file1
        for row2 in rows2:
            if row1[0] == row2[1]:
                data = [row1[1], row2[0]]
                print data
                writer.writerow(data)
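If the files grow, the nested scan can be replaced by a dictionary lookup keyed on the timestamp. The following is a minimal sketch, not the code from the answer above: it is written in Python 3 syntax, the function name `join_by_timestamp` is hypothetical, and it assumes both files use `;` as the delimiter, as in the sample rows from the question. If a timestamp appears more than once in apiSheet.csv, the last value wins.

```python
import csv


def join_by_timestamp(api_lines, tweet_lines):
    """Pair each tweet text with the value recorded for the same timestamp."""
    # Map timestamp -> value from the apiSheet-style rows.
    value_by_time = {}
    for row in csv.reader(api_lines, delimiter=";"):
        if len(row) >= 2:
            value_by_time[row[0]] = row[1]

    # Walk the tweet rows once and keep those with a known timestamp.
    joined = []
    for row in csv.reader(tweet_lines, delimiter=";"):
        if len(row) >= 2 and row[1] in value_by_time:
            joined.append([value_by_time[row[1]], row[0]])
    return joined


# Sample rows copied from the question.
api = ["13.09.13 16:46;89.56", "13.09.13 16:45;90.40"]
tweets = [
    "Here is some text;13.09.13 16:45",
    "Here is more text;13.09.13 16:45",
    "And yet another text;13.09.13 16:46",
]

for data in join_by_timestamp(api, tweets):
    print(data)
```

Each file is read exactly once, and the order of the rows in either file no longer matters. The result follows the order of the tweet file. The function accepts any iterable of lines, so open file objects can be passed in directly, and the returned rows can be handed to `csv.writer(...).writerows(...)` to produce finalOutput.csv.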