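I need to extract all rows that share the same value in the first column and merge them into a single row. The input CSV looks like this:
Source: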
20191111,test7,10,0,0,0
20191111,test6,0,9,0,0
20191111,test5,0,0,8,0
20191111,test3,0,0,0,7
20191111,test2,0,0,0,0
20191111,test1,0,0,0,0
20191110,test7,0,0,0,0
20191110,test6,0,0,0,0
20191110,test5,0,0,0,0
20191110,test3,0,0,0,0
20191110,test2,0,0,0,0
20191110,test1,0,0,0,0
Target:
20191111,test7,10,0,0,0,test6,0,9,0,0,test5,0,0,8,0, .....
20191110,test7,0,0,0,0,test6,0,0,0,0,test5,0,0,0,0, .....
Answer 0 (score: 0)
Something like this should work. No pandas were harmed in the writing of this code.
import collections
import csv
import io
import itertools
import sys
file = io.StringIO(
"""
20191111,test7,10,0,0,0
20191111,test6,0,9,0,0
20191111,test5,0,0,8,0
20191111,test3,0,0,0,7
20191111,test2,0,0,0,0
20191111,test1,0,0,0,0
20191110,test7,0,0,0,0
20191110,test6,0,0,0,0
20191110,test5,0,0,0,0
20191110,test3,0,0,0,0
20191110,test2,0,0,0,0
20191110,test1,0,0,0,0
""".strip()
)
groups = collections.defaultdict(list)
for row in csv.reader(file):
    groups[row[0]].append(row)  # storing the full row here, for greater reusability
out = csv.writer(sys.stdout)
# NB: `groups` come out in first-seen order (dicts keep insertion order in Python 3.7+),
# not in sorted order; could use e.g. `sorted(groups.items())` here to sort by the key
for group_key, rows in groups.items():
    # Build the transposed row from the group key, then the rows sans the first column of each
    transposed_row = [group_key] + list(itertools.chain(*[row[1:] for row in rows]))
    # Write to the CSV writer; you could append to a dataframe or anything else here.
    out.writerow(transposed_row)
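If the rows for each key are already adjacent, as they are in the sample input, a streaming variant with `itertools.groupby` may be enough and avoids holding every group in memory. This is only a sketch under that assumption: `groupby` merges consecutive rows only, so unsorted input would produce several output rows for the same key. The function name `merge_rows_by_first_column` and the shortened sample data are made up for illustration.

```python
import csv
import io
from itertools import chain, groupby
from operator import itemgetter


def merge_rows_by_first_column(lines):
    """Yield one merged row per run of consecutive rows sharing column 0."""
    # Skip blank lines, which csv.reader returns as empty lists.
    reader = (row for row in csv.reader(lines) if row)
    for key, rows in groupby(reader, key=itemgetter(0)):
        # Keep the key once, then append every row minus its first column.
        yield [key, *chain.from_iterable(row[1:] for row in rows)]


# Shortened sample (assumption: rows for a key are contiguous).
sample = io.StringIO(
    "20191111,test7,10,0,0,0\n"
    "20191111,test6,0,9,0,0\n"
    "20191110,test7,0,0,0,0\n"
    "20191110,test6,0,0,0,0\n"
)
merged = list(merge_rows_by_first_column(sample))
for row in merged:
    print(",".join(row))
```

Any iterable of lines can be passed in, for example an open file object, so the same function should work on a real CSV file opened with `newline=""`.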