Remove rows from a CSV using Python or C#?

Posted: 2016-12-13 18:48:13

Tags: python csv

I have a CSV file whose contents repeat, like this:

"col1", "col2","col3"
Integer, Integer, Varchar(50)
7, 8, 21554
24, 25, 36544
"col1", "col2","col3"
Integer, Integer, Varchar(50)
7, 8, 21554
24, 25, 36544

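How can I strip out the repeated part, including the second header row, the data type row, and the data rows that follow?
I only want this: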

"col1", "col2","col3"
Integer, Integer, Varchar(50)
7, 8, 21554
24, 25, 36544

2 Answers:

Answer 0: (score: 1)

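We don't even need the csv module. We remember the first line of the file, then copy lines to the output until we see that same line again. At that point we stop, which truncates the file at the start of the repeated part.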

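# Python 3: newline='' passes line endings through unchanged.
with open('infile.csv', newline='') as infile, open('outfile.csv', 'w', newline='') as outfile:
    first = next(infile)        # remember the header line
    outfile.write(first)
    for line in infile:
        if line == first:       # header seen again: the repeated part starts here
            break
        outfile.write(line)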
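
To try the idea without touching the file system, here is a minimal sketch of the same loop as a function that takes any two file-like objects. The function name `copy_until_header_repeats` and the in-memory sample are assumptions made for illustration. The sample is just the question's data repeated twice.

```python
import io


def copy_until_header_repeats(infile, outfile):
    """Copy lines from infile to outfile, stopping when the first line reappears."""
    first = next(infile, None)
    if first is None:           # empty input: nothing to copy
        return
    outfile.write(first)
    for line in infile:
        if line == first:       # header seen again: stop before the repeat
            break
        outfile.write(line)


section = (
    '"col1", "col2","col3"\n'
    'Integer, Integer, Varchar(50)\n'
    '7, 8, 21554\n'
    '24, 25, 36544\n'
)
source = io.StringIO(section * 2)   # the same section twice, as in the question
result = io.StringIO()
copy_until_header_repeats(source, result)
print(result.getvalue(), end='')
```

Because the comparison is an exact string match, this only works when the repeated header is byte-for-byte identical to the first one, including spacing and the line ending.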

Answer 1: (score: 0)

You can do it with the csv module (assuming Python 2.x) like this:

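import csv

seen = set()  # rows already written
# Python 2: the csv module expects files opened in binary mode.
with open('duplicates.csv', 'rb') as infile, open('cleaned.csv', 'wb') as outfile:
    reader = csv.reader(infile, skipinitialspace=True)
    writer = csv.writer(outfile)
    # Lists are not hashable, so convert each row to a tuple before the set lookup.
    for row in (tuple(row) for row in reader):
        if row not in seen:
            writer.writerow(row)
            seen.add(row)

print('done')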
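
For Python 3, a rough port of the same idea might look like the sketch below. The function name `write_unique_rows` is an assumption for illustration, and in-memory buffers stand in for the real files so the snippet runs on its own. With real files you would open both in text mode with `newline=''` instead of `'rb'`/`'wb'`.

```python
import csv
import io


def write_unique_rows(infile, outfile):
    """Write each distinct row from infile to outfile once, in first-seen order."""
    seen = set()
    reader = csv.reader(infile, skipinitialspace=True)
    writer = csv.writer(outfile)
    for row in map(tuple, reader):   # tuples are hashable, lists are not
        if row not in seen:
            writer.writerow(row)
            seen.add(row)


# With real files in Python 3 (text mode, newline=''):
# with open('duplicates.csv', newline='') as infile, \
#         open('cleaned.csv', 'w', newline='') as outfile:
#     write_unique_rows(infile, outfile)

repeated = (
    '"col1", "col2","col3"\n'
    'Integer, Integer, Varchar(50)\n'
    '7, 8, 21554\n'
    '24, 25, 36544\n'
) * 2
cleaned = io.StringIO(newline='')    # keep the writer's line endings as written
write_unique_rows(io.StringIO(repeated), cleaned)
print(cleaned.getvalue(), end='')
```

Two side effects are worth knowing. This approach drops every duplicate row anywhere in the file, so two legitimately identical data rows would also collapse into one. The output is also re-serialized by `csv.writer`, so the quotes around the header names and the spaces after the commas do not survive.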