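I have a .csv file with 6 columns. Rows should be grouped on duplicate values in column 1, and within each group the values of each of the remaining 5 columns should be merged into a single cell for that column. Sample data: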
col1  col2          col3  col4    col5  col6
1234  Some Text     Reg1  Value1  Txt   A
2345  Any Text      Reg1  Value2  Txt   B
3456  Some Text     Reg2  Value3  Txt   C
1234  Another Text  Reg3  Value2  Txt   D
Here is the code I am using:
    import csv
    import sys

    if len(sys.argv) < 2:
        print('Too few arguments, please specify input filename')
        sys.exit()

    filename = sys.argv[1]
    accs = {}
    # newline='' is what the csv module expects; the old 'rU' mode was removed in Python 3.11.
    with open(filename, newline='') as f:
        reader = csv.reader(f, delimiter=',')
        for n, row in enumerate(reader):
            if not n:
                # Skip header row (n = 0).
                continue
            acc, res, reg, col4, col5, col6 = row
            if acc not in accs:
                accs[acc] = list()
            # Each key collects whole rows (as tuples), not per-column values.
            accs[acc].append((res, reg, col4, col5, col6))

    def listToStringWithoutBrackets(list1):
        return str(list1).replace('[', '').replace(']', '')

    with open('output.csv', 'w', newline='') as csvFile:
        writer = csv.writer(csvFile)
        for key, value in accs.items():
            writer.writerow([key, listToStringWithoutBrackets(value)])

    print("Output file output.csv has been created in the current working directory!")
I have referred to "Sample Code for Group by in Python". Below is the result I get (because I used col1 as the key and a list of the other columns as the value):
col1 col2
1234 (Some Text,Reg1,Value1,Txt,A),(Another Text, Reg3, Value2, Txt, D)
2345 (Any Text, Reg1, Value2, Txt, B)
3456 (Some Text,Reg2,Value3, Txt,C)
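The tuples show up because each key maps to a list of whole rows, and the string form of that list is written as one cell. A minimal sketch of turning such a list of row tuples into one merged string per column, using `zip` to transpose rows into columns. The helper name `rows_to_columns` and the `", "` separator are assumptions for illustration, and the values are copied from the sample data above:

```python
# Hypothetical helper: merge a list of row tuples into one string per column.
def rows_to_columns(rows, sep=", "):
    # zip(*rows) transposes the rows, yielding one tuple per column.
    return [sep.join(col) for col in zip(*rows)]


# The two rows that share the key 1234 in the sample data.
group = [
    ("Some Text", "Reg1", "Value1", "Txt", "A"),
    ("Another Text", "Reg3", "Value2", "Txt", "D"),
]

merged = rows_to_columns(group)
print(merged)
# ['Some Text, Another Text', 'Reg1, Reg3', 'Value1, Value2', 'Txt, Txt', 'A, D']
```

A group with a single row comes back unchanged, since joining one value adds no separator.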
Expected result (the result I actually need):
col1  col2                     col3        col4            col5      col6
1234  Some Text, Another Text  Reg1, Reg3  Value1, Value2  Txt, Txt  A,D
2345  Any Text                 Reg1        Value2          Txt       B
3456  Some Text                Reg2        Value3          Txt       C
This needs to be done without third-party libraries.
Any help is appreciated!
Answer 0 (score: 0)
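    import csv

    # Map each col1 value to one list of collected values per column.
    ans = dict()
    with open('1.csv', newline='') as f:
        reader = csv.reader(f, delimiter=',')
        for n, row in enumerate(reader):
            if n == 0:
                # Keep the header row for the output file.
                titles = row
            else:
                num = row[0]
                if num in ans:
                    # Key seen before: append each value to its column list.
                    t = ans[num]
                    for i in range(1, len(row)):
                        t[i].append(row[i])
                else:
                    # First occurrence: start a one-element list per column.
                    tmp = []
                    for other in row:
                        tmp.append([other])
                    ans[num] = tmp

    with open('2.csv', 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(titles)
        for row in ans.values():
            tmp = []
            for key in row:
                # Every entry is a list of strings, so join it into one cell.
                tmp.append(",".join(key))
            writer.writerow(tmp)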
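The same grouping can also be sketched more compactly with `collections.defaultdict`, still using only the standard library. This is an illustrative variant, not the answer's code: the function name `group_rows` is hypothetical, and the input is assumed to be comma-separated with a header row (the sample rows from the question are fed in through `io.StringIO` so the example runs without any files).

```python
import csv
import io
from collections import defaultdict


def group_rows(lines, sep=","):
    """Group CSV rows by the first column, merging each remaining column."""
    reader = csv.reader(lines)
    titles = next(reader)  # header row
    # key -> one list of collected values per remaining column
    groups = defaultdict(lambda: [[] for _ in titles[1:]])
    for row in reader:
        if not row:
            continue  # ignore blank lines
        for values, cell in zip(groups[row[0]], row[1:]):
            values.append(cell)
    # Dicts keep insertion order, so keys appear in first-seen order.
    merged = [[key] + [sep.join(v) for v in cols] for key, cols in groups.items()]
    return titles, merged


# Sample data from the question, assumed to be comma-separated on disk.
sample = io.StringIO(
    "col1,col2,col3,col4,col5,col6\n"
    "1234,Some Text,Reg1,Value1,Txt,A\n"
    "2345,Any Text,Reg1,Value2,Txt,B\n"
    "3456,Some Text,Reg2,Value3,Txt,C\n"
    "1234,Another Text,Reg3,Value2,Txt,D\n"
)
titles, merged = group_rows(sample)

out = io.StringIO()
writer = csv.writer(out)
writer.writerow(titles)
writer.writerows(merged)
print(out.getvalue())
```

To work on real files, pass a file object opened with `newline=''` to `group_rows` and hand another one to `csv.writer`. Cells that contain the separator are quoted automatically by `csv.writer`.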