Grouping corresponding row values by column in Python

Time: 2018-06-14 05:20:52

Tags: python python-3.x list

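I have a .csv file with 6 columns. Rows should be grouped on duplicate values in column 1; for each group, the values in each of the remaining 5 columns should be merged into a single cell of that column. Sample data: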

col1    col2            col3        col4    col5    col6
1234    Some Text       Reg1        Value1  Txt      A
2345    Any Text        Reg1        Value2  Txt      B
3456    Some Text       Reg2        Value3  Txt      C
1234    Another Text    Reg3        Value2  Txt      D


Below is the code I am using:

import csv
import sys

if len(sys.argv) < 2:
    print('Too few arguments, please specify input filename')
    sys.exit()

filename = sys.argv[1]

accs = {}
# newline='' is what the csv module expects; the old 'U' mode flag was removed in Python 3.11.
with open(filename, newline='') as f:
    reader = csv.reader(f, delimiter=',')
    for n, row in enumerate(reader):
        if not n:
            # Skip header row (n = 0).
            continue
        acc, res, reg, col4, col5, col6 = row
        if acc not in accs:
            accs[acc] = list()
        # Each row is stored as one tuple, so values stay grouped per row.
        accs[acc].append((res, reg, col4, col5, col6))

def listToStringWithoutBrackets(list1):
    return str(list1).replace('[', '').replace(']', '')

with open('output.csv', 'w', newline='') as csvFile:
    writer = csv.writer(csvFile)
    for key, value in accs.items():
        writer.writerow([key, listToStringWithoutBrackets(value)])

print("The output file output.csv has been created in the current working directory!")

I have already referred to Sample Code for Group by in Python. Below is the result I get (because I used a list keyed on col1, with the other columns stored as its values):

col1    col2
1234    (Some Text,Reg1,Value1,Txt,A),(Another Text, Reg3, Value2, Txt, D)
2345    (Any Text, Reg1, Value2, Txt, B)
3456    (Some Text,Reg2,Value3, Txt,C)

Expected result (the result I actually need):

col1    col2                    col3        col4            col5        col6
1234    Some Text, Another Text Reg1, Reg3  Value1, Value2  Txt, Txt    A,D
2345    Any Text                Reg1        Value2          Txt         B
3456    Some Text               Reg2        Value3          Txt         C
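
The difference between the two layouts comes down to transposing the grouped tuples, so that values are collected per column instead of per row. A minimal sketch, assuming the two sample rows for key 1234 and a ", " separator (the names `rows`, `columns` and `cells` are only illustrative):

```python
# Rows collected for key 1234, stored row-wise as tuples.
rows = [
    ("Some Text", "Reg1", "Value1", "Txt", "A"),
    ("Another Text", "Reg3", "Value2", "Txt", "D"),
]

# zip(*rows) transposes the tuples: one tuple per column instead of per row.
columns = list(zip(*rows))

# Join each column's values into a single cell.
cells = [", ".join(col) for col in columns]
print(cells)
```

Each element of `cells` then corresponds to one output column (col2 to col6) for that key.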

Is a third-party library required for this?


Any help is appreciated!

1 answer:

Answer 0 (score: 0)

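import csv

# Map each col1 value to a list holding one list of values per column.
ans = dict()
with open('1.csv', newline='') as f:
    reader = csv.reader(f, delimiter=',')
    for n, row in enumerate(reader):
        if n == 0:
            # Keep the header row for the output file.
            titles = row
        else:
            num = row[0]
            if num in ans:
                # Key seen before: append each remaining column's value.
                t = ans[num]
                for i in range(1, len(row)):
                    t[i].append(row[i])
            else:
                # First occurrence: start a one-element list per column.
                ans[num] = [[other] for other in row]

with open('2.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(titles)
    for row in ans.values():
        # Every entry is a list of strings; join each into one cell.
        writer.writerow([", ".join(values) for values in row])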
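
The same grouping can also be written with `collections.defaultdict` and `zip`, using only the standard library. This is a sketch, not the answerer's code: `group_rows` and its `sep` parameter are hypothetical names, and an in-memory `io.StringIO` stands in for the input and output files so the example runs on its own.

```python
import csv
import io
from collections import defaultdict


def group_rows(rows, sep=", "):
    """Group rows by their first field and join the other fields column-wise.

    `group_rows` and `sep` are hypothetical names used for this sketch.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[0]].append(row[1:])
    # dicts preserve insertion order (Python 3.7+), so keys keep first-seen order.
    return [
        [key] + [sep.join(col) for col in zip(*values)]
        for key, values in grouped.items()
    ]


# In-memory stand-in for the input file, using the sample data from the question.
sample = io.StringIO(
    "col1,col2,col3,col4,col5,col6\n"
    "1234,Some Text,Reg1,Value1,Txt,A\n"
    "2345,Any Text,Reg1,Value2,Txt,B\n"
    "3456,Some Text,Reg2,Value3,Txt,C\n"
    "1234,Another Text,Reg3,Value2,Txt,D\n"
)
reader = csv.reader(sample)
header = next(reader)  # first row holds the column titles
result = group_rows(reader)

# Write the grouped rows as CSV text (a real file would work the same way).
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(header)
writer.writerows(result)
print(out.getvalue())
```

To use it with real files, replace the two `io.StringIO` objects with files opened via `open(..., newline='')`. Merged cells contain commas, so `csv.writer` quotes them in the output.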