I have a CSV file that looks something like this:
['Name1', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '+']
['Name1', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
['Name2', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'a', '']
['Name3', '', '', '', '', '+', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
Now I need a way to merge all rows that share the same name in the first column into a single row, for example:
['Name1', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', '+']
['Name2', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'a', '']
['Name3', '', '', '', '', '+', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
I could do this by sorting the CSV and then going through it row by row, comparing every value, but there should be an easier way.
Any ideas?
Answer 0 (score: 3)
You should use itertools.groupby:
t = [
['Name1', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '+'],
['Name1', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', ''],
['Name2', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'a', ''],
['Name3', '', '', '', '', '+', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
]
from itertools import groupby
# TODO: if you need to speed things up you can use operator.itemgetter
# for both sorting and grouping
for name, rows in groupby(sorted(t), lambda x: x[0]):
    print(join_rows(rows))
Obviously, you can implement the merging in a separate function. For example:
def join_rows(rows):
    def join_tuple(tup):
        # Return the first non-empty value in this column, or '' if there is none
        for x in tup:
            if x:
                return x
        return ''
    return [join_tuple(x) for x in zip(*rows)]
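The two snippets above can be combined into one runnable Python 3 sketch. It also applies the operator.itemgetter suggestion from the TODO comment. The function name merge_by_name and the shortened four-column sample rows are assumptions made for illustration, not part of the original answer.

```python
from itertools import groupby
from operator import itemgetter


def join_rows(rows):
    """Merge rows column by column, keeping the first non-empty value."""
    return [next((cell for cell in column if cell), '') for column in zip(*rows)]


def merge_by_name(table):
    """Group rows by their first column and merge each group into one row."""
    first_column = itemgetter(0)
    # groupby only groups adjacent rows, so sort by the same key first.
    ordered = sorted(table, key=first_column)
    return [join_rows(group) for _, group in groupby(ordered, key=first_column)]


# Shortened rows (4 columns instead of 21) to keep the example readable.
sample = [
    ['Name1', '', '', '+'],
    ['Name3', '+', '', ''],
    ['Name1', 'b', '', ''],
    ['Name2', '', 'a', ''],
]

merged = merge_by_name(sample)
for row in merged:
    print(row)
```

Sorting matters here because groupby starts a new group whenever the key changes, so unsorted input would leave the two Name1 rows unmerged.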
Answer 1 (score: 1)
def merge_rows(row1, row2):
    # merge two rows with the same name
    merged_row = ...
    return merged_row
r1 = ['Name1', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '+']
r2 = ['Name1', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
r3 = ['Name2', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'a', '']
r4 = ['Name3', '', '', '', '', '+', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
rows = [r1, r2, r3, r4]
data = {}
for row in rows:
    name = row[0]
    if name in data:
        data[name] = merge_rows(row, data[name])
    else:
        data[name] = row
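You now have all the rows in data, where each key of this dictionary is a name and the corresponding value is the merged row for that name. You can now write this data to a CSV file.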
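The answer leaves the body of merge_rows open. One possible way to fill it in, assuming that an empty string means "no value" and that the first non-empty cell should win, is sketched below together with the CSV writing step. The merge rule, the shortened sample rows, and the use of io.StringIO in place of a real output file are all assumptions for illustration.

```python
import csv
import io


def merge_rows(row1, row2):
    """Hypothetical merge: per column, take row1's value unless it is empty."""
    return [a or b for a, b in zip(row1, row2)]


# Shortened rows (4 columns instead of 21) to keep the example readable.
rows = [
    ['Name1', '', '', '+'],
    ['Name1', 'b', '', ''],
    ['Name2', '', 'a', ''],
]

data = {}
for row in rows:
    name = row[0]
    if name in data:
        # Same argument order as in the answer: the newer row wins on a conflict.
        data[name] = merge_rows(row, data[name])
    else:
        data[name] = row

# io.StringIO stands in for a real file, e.g. open(path, 'w', newline='').
buffer = io.StringIO()
csv.writer(buffer).writerows(data.values())
csv_text = buffer.getvalue()
print(csv_text)
```

Unlike the groupby approach, this does not require the input to be sorted, and on Python 3.7+ the dictionary keeps the names in the order they first appear.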
Answer 2 (score: 0)
You can also use defaultdict:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> _ = [d[i[0]].append(z) for i in t for z in i[1:]]
>>> d['Name1']
['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '+', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
Then merge the columns.
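The final merging step is not spelled out in the answer. A minimal sketch of one way to do it follows. It is a variation on the snippet above rather than the answerer's exact code: it stores whole rows per name instead of one flat list of cells, so that the columns can be lined up with zip. The shortened sample rows and the "first non-empty value wins" rule are assumptions.

```python
from collections import defaultdict

# Shortened rows (4 columns instead of 21) to keep the example readable.
t = [
    ['Name1', '', '', '+'],
    ['Name1', 'b', '', ''],
    ['Name2', '', 'a', ''],
    ['Name3', '+', '', ''],
]

# Collect the cells of every row under its name, one inner list per row.
grouped = defaultdict(list)
for name, *cells in t:
    grouped[name].append(cells)

# For each name, walk the columns and keep the first non-empty cell.
merged = [
    [name] + [next((cell for cell in column if cell), '') for column in zip(*rows)]
    for name, rows in grouped.items()
]

for row in merged:
    print(row)
```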