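I am trying to output the differences between two CSV files, compared on two columns, and write them to a third CSV file. How can I make the following code compare on columns 0 and 3?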
SELECT questions.question_id,
questions.username,
questions.question,
userlog.user_mail,
COUNT(answers.answer) as answerCount
FROM questions
LEFT JOIN userlog ON questions.username = userlog.username
LEFT JOIN answers ON answers.question_id = questions.question_id
WHERE questions.topic_id = '0d3fb89c012b5af12e1e0'
GROUP BY questions.question_id, questions.username, questions.question, userlog.user_mail
ORDER BY questions.username, questions.question_id
Answer 0 (score: 2)
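If you want the rows of ted2.csv whose (column 0, column 3) pair does not appear in any row of ted.csv, build a set of those pairs from ted.csv and check each row of ted2.csv against it before writing. To go the other way (rows of ted.csv with no match in ted2.csv), swap the two readers: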
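import csv

with open("ted.csv", newline="") as f1, open("ted2.csv", newline="") as f2, \
        open("foo.csv", "w", newline="") as out:
    r1, r2 = csv.reader(f1), csv.reader(f2)
    # (column 0, column 3) pairs present in ted.csv
    st = set((row[0], row[3]) for row in r1)
    wr = csv.writer(out)
    # write only the ted2.csv rows whose pair is not in ted.csv
    for row in (row for row in r2 if (row[0], row[3]) not in st):
        wr.writerow(row)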
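The same check can be sketched as a small reusable generator, so the key columns are a parameter rather than hard-coded. This is only an illustration: the function name `rows_missing_from`, its parameters, and the in-memory sample data are assumptions made up for the example, not part of the original files.

```python
import csv
import io
from operator import itemgetter


def rows_missing_from(reference_rows, candidate_rows, key_columns=(0, 3)):
    """Yield the candidate rows whose key does not occur in reference_rows.

    Hypothetical helper; key_columns must contain at least one index.
    """
    key = itemgetter(*key_columns)  # a tuple of fields for 2+ indexes
    seen = {key(row) for row in reference_rows}
    for row in candidate_rows:
        if key(row) not in seen:
            yield row


# Made-up sample data standing in for ted.csv and ted2.csv
ted = io.StringIO("1,a,x,10\n2,b,y,20\n")
ted2 = io.StringIO("1,zzz,q,10\n3,c,z,30\n")

missing = list(rows_missing_from(csv.reader(ted), csv.reader(ted2)))
print(missing)  # [['3', 'c', 'z', '30']]
```

The first row of the second sample shares columns 0 and 3 with a row of the first sample, so it is dropped even though its other columns differ.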
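If you really want something like the symmetric difference, that is, the rows from either file whose (column 0, column 3) pair does not occur in the other file, build a set of those pairs for each file and filter each file against the other file's set: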
import csv
from itertools import chain

with open("ted.csv", newline="") as f1, open("ted2.csv", newline="") as f2, \
        open("foo.csv", "w", newline="") as out:
    r1, r2 = csv.reader(f1), csv.reader(f2)
    # (column 0, column 3) pairs present in each file
    st1 = set((row[0], row[3]) for row in r1)
    st2 = set((row[0], row[3]) for row in r2)
    # rewind both files so they can be read a second time
    f1.seek(0)
    f2.seek(0)
    r1, r2 = csv.reader(f1), csv.reader(f2)
    wr = csv.writer(out)
    # rows whose pair appears in only one of the two files
    output1 = (row for row in r1 if (row[0], row[3]) not in st2)
    output2 = (row for row in r2 if (row[0], row[3]) not in st1)
    for row in chain(output1, output2):
        wr.writerow(row)
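If both files fit comfortably in memory, a possible variation is to read each file once into a list and let the set `^` operator compute the keys that occur on only one side, which avoids the `seek(0)` step. This is a sketch under that assumption; the name `symmetric_difference_rows` and the sample data are hypothetical.

```python
import csv
import io


def symmetric_difference_rows(rows_a, rows_b, key_columns=(0, 3)):
    """Return rows from either input whose key occurs in only one input."""
    rows_a, rows_b = list(rows_a), list(rows_b)  # assumes the data fits in memory

    def key(row):
        return tuple(row[i] for i in key_columns)

    # set ^ set keeps the keys found in exactly one of the two sets
    one_sided = {key(r) for r in rows_a} ^ {key(r) for r in rows_b}
    return [row for row in rows_a + rows_b if key(row) in one_sided]


# Made-up sample data standing in for ted.csv and ted2.csv
ted = io.StringIO("1,a,x,10\n2,b,y,20\n")
ted2 = io.StringIO("1,zzz,q,10\n3,c,z,30\n")

result = symmetric_difference_rows(csv.reader(ted), csv.reader(ted2))
print(result)  # [['2', 'b', 'y', '20'], ['3', 'c', 'z', '30']]
```

Rows from the first input come first in the result, followed by rows from the second, matching the order produced by chaining the two filtered readers.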