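I am trying to take 3 columns from a large CSV file and detect permutations, so that only unique triples are kept and written to another CSV file.
For example, if I have: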
[8,9,15]
[78,35,98]
[90,35,56]
[64,89,98]
[15,8,9]...etc
it has to detect that the first triple is the same as the fifth and keep only one of them. I wrote the following, but it doesn't work.
import csv
reader=csv.reader(open('file1.csv','r'), delimiter = ',')
writer=csv.writer(open('mynew.csv', 'w'), delimiter=',')
myset = set()
for row in reader:
    if row[0] not in myset:
        writer.writerow(row)
    if row[1] not in myset:
        writer.writerow(row)
    if row[2] not in myset:
        writer.writerow(row)
Answer 0 (score: 0)
Try this:
#!/usr/bin/env python
import csv
reader=csv.reader(open('file1.csv','r'), delimiter = ',')
writer=csv.writer(open('mynew.csv', 'w'), delimiter=',')
myset = set()
for row in reader:
    # a frozenset is hashable, so it can be inserted into a set, and it ignores order
    # this assumes no duplicates exist within the row like 1,1,2,3,4 (two 1's)
    # (otherwise you'll have to hash the row yourself)
    key = frozenset(row)
    if key not in myset:
        print("adding %s" % row)
        myset.add(key)
        # write the row only the first time this combination is seen
        writer.writerow(row)
print("set size: %d" % len(myset))
print(myset)
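If rows can contain repeated values, or the file has more than three columns, one possible variation is to sort the three selected values and use the resulting tuple as the key. The following is only a sketch, assuming Python 3 and that the triple lives in the first three columns; the names unique_triples and dedupe_csv are made up for illustration and are not part of the csv module. Values are compared as strings, so "08" and "8" would count as different.

```python
import csv
import io


def unique_triples(rows, columns=(0, 1, 2)):
    """Yield the selected columns of each row, skipping permutations already seen."""
    seen = set()
    for row in rows:
        triple = [row[i].strip() for i in columns]
        # Sorting gives every permutation of the same values one canonical key.
        # Unlike frozenset, the tuple keeps repeated values such as 1,1,2.
        key = tuple(sorted(triple))
        if key not in seen:
            seen.add(key)
            yield triple


def dedupe_csv(src, dst, columns=(0, 1, 2)):
    """Copy unique triples from the open file object src to dst; return how many were kept."""
    writer = csv.writer(dst, lineterminator="\n")
    count = 0
    for triple in unique_triples(csv.reader(src), columns):
        writer.writerow(triple)
        count += 1
    return count


# Small in-memory demonstration using the sample data from the question.
sample = io.StringIO("8,9,15\n78,35,98\n90,35,56\n64,89,98\n15,8,9\n")
result = io.StringIO()
kept = dedupe_csv(sample, result)
print(kept)
print(result.getvalue())
```

With real files, the same function can be called as dedupe_csv(open('file1.csv', newline=''), open('mynew.csv', 'w', newline='')), ideally inside a with block so both files are closed. Only the keys are held in memory, so this should scale to large inputs as long as the number of distinct triples fits in RAM.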