I have a CSV file containing rows such as:
A,apple,102
A,orange,103
B,banana,101
C,peach,102
B,orange,104
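and so on...
I want to delete the rows that have a duplicate value in the first column, keeping only the first occurrence of each value. For the data above, the output should be: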
A,apple,102
B,banana,101
C,peach,102
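Answer 0: (score: 0)
You can create an empty set and add the first-column value of each row to it. If a value is already in the set, skip to the next row, for example: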
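import csv

column_values = set()  # first-column values seen so far
new_rows = []          # rows to keep, in their original order

# newline='' is what the csv module documentation recommends for files it reads or writes
with open('example.csv', 'r', newline='') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        # skip rows whose first-column value has already appeared
        if row[0] in column_values:
            continue
        column_values.add(row[0])
        new_rows.append(row)

with open('updated.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(new_rows)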
Answer 1: (score: 0)
The itertools recipes include a unique_everseen recipe (slightly modified here). It may be a bit of overkill for this case, but it works:
from io import StringIO
from csv import reader
from operator import itemgetter

def unique_everseen(iterable, key):
    "List unique elements, preserving order. Remember all elements ever seen."
    seen = set()
    seen_add = seen.add
    for element in iterable:
        k = key(element)
        if k not in seen:
            seen_add(k)
            yield element

txt = '''A,apple,102
A,orange,103
B,banana,101
C,peach,102
B,orange,104'''

with StringIO(txt) as file:
    rows = reader(file)
    unique_rows = unique_everseen(rows, key=itemgetter(0))
    for row in unique_rows:
        print(row)
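I use operator.itemgetter(0) as the key so that the first column of each row is what gets compared.
You can then write each row to a new file with csv.writer.
Of course, you will have to replace StringIO(txt) with open('file.csv', 'r').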
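The file-based variant described above could look roughly like the following. This is a minimal sketch rather than the answerer's own code: the helper name dedupe_csv and its path parameters are hypothetical, and it assumes every row has at least one column (a blank line in the input would raise IndexError).

```python
import csv
from operator import itemgetter


def unique_everseen(iterable, key):
    """Yield elements whose key has not been seen before, preserving order."""
    seen = set()
    for element in iterable:
        k = key(element)
        if k not in seen:
            seen.add(k)
            yield element


def dedupe_csv(src_path, dst_path):
    """Hypothetical helper: copy src_path to dst_path, keeping only the
    first row for each distinct value in the first column."""
    # newline='' is what the csv module documentation recommends for
    # files handed to csv.reader and csv.writer.
    with open(src_path, 'r', newline='') as src, \
            open(dst_path, 'w', newline='') as dst:
        rows = csv.reader(src)
        writer = csv.writer(dst)
        # Rows are streamed through the generator, so only the set of
        # seen keys is held in memory, not the whole file.
        writer.writerows(unique_everseen(rows, key=itemgetter(0)))
```

Calling dedupe_csv('file.csv', 'updated.csv') would then write the de-duplicated rows to updated.csv; both file names are placeholders.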
Answer 2: (score: 0)
If you are willing to use a third-party library, you can use pandas:
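The answer's original snippet is not available here, so the following is only a sketch of one way this might be done with pandas, under the assumptions that the file has no header row and that the first occurrence of each value should be kept. The sample data is taken from the question; the variable names are illustrative.

```python
import io

import pandas as pd

# Sample data from the question; in practice a file path would be
# passed to read_csv instead of a StringIO object.
txt = """A,apple,102
A,orange,103
B,banana,101
C,peach,102
B,orange,104"""

# header=None because the data has no header row, so pandas labels
# the columns 0, 1, 2.
df = pd.read_csv(io.StringIO(txt), header=None)

# Keep the first row for each distinct value in column 0.
deduped = df.drop_duplicates(subset=[0], keep='first')

# With no path argument, to_csv returns the CSV text as a string;
# index and header are suppressed to match the input layout.
result = deduped.to_csv(index=False, header=False)
print(result)
```

To write straight to disk instead, a path such as 'updated.csv' (a placeholder name) can be passed as the first argument to to_csv.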