I want to merge rows that share a specific value in a CSV file

Time: 2018-07-14 18:09:45

Tags: python-3.x csv merge

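I have a CSV file with a structure like the one shown below. What I want to achieve is to merge the colors. For example, product code 1001 comes in several colors (black, cream, graphite); I want a single row for 1001 with all of its colors in one cell, separated by ; (semicolon). I want to do this for every product.

Edit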

Expected output:

1001-BLACK-P-OS,BLACK;CREAM;GRAPHITE
1002-BLACK-P-OS,BLACK;CREAM

Given CSV:

1001-BLACK-P-OS,BLACK
1001-CREAM-P-OS,CREAM
1001-GRAPH-P-OS,GRAPHITE
1002-BLACK-P-OS,BLACK
1002-CREAM-P-OS,CREAM

I am trying to do this in Python, but I cannot get it to work.

with open('ascolor.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        serial=row[0]
        d=''
        for r in readCSV:
            if serial is r[0]:
                d=d+r[1]
                d=d+';'

1 answer:

Answer 0 (score: 1)
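Before the actual solution, a small sketch of why the attempt in the question probably goes nowhere. The inner loop iterates over the same reader object as the outer loop, so it consumes all remaining rows and the outer loop ends after the first row. In addition, `is` tests object identity rather than equality, and the full codes differ anyway; only the numeric prefix is shared. The in-memory sample and the variable names here are made up for illustration.

```python
import csv
import io

# Hypothetical in-memory stand-in for ascolor.csv.
sample = io.StringIO(
    "1001-BLACK-P-OS,BLACK\n"
    "1001-CREAM-P-OS,CREAM\n"
    "1002-BLACK-P-OS,BLACK\n"
)
reader = csv.reader(sample)

outer_rows = []
inner_rows = []
for row in reader:
    outer_rows.append(row)
    # The inner loop pulls from the same iterator,
    # so it consumes every remaining row.
    for r in reader:
        inner_rows.append(r)

# The outer loop only ever saw the first row.
print(len(outer_rows), len(inner_rows))

# Even with ==, the full codes differ; only the numeric prefix matches.
first, second = outer_rows[0][0], inner_rows[0][0]
same_code = first == second
same_prefix = first.split("-")[0] == second.split("-")[0]
print(same_code, same_prefix)
```

So the grouping key has to be the part before the first dash, and the file should be read in a single pass.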

Create the data file:

data = """1001-BLACK-P-OS , BLACK

1001-CREAM-P-OS , CREAM

1001-GRAPH-P-OS , GRAPHITE

1002-BLACK-P-OS ,BLACK

1002-CREAM-P-OS ,CREAM"""

fn = 'ascolor.csv'

with open(fn, "w") as f:
    f.write(data)

Now we can start reformatting:

import csv

fn = 'ascolor.csv'
data = {}
with open(fn) as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        if row:  # weed out any empty rows - they would cause index errors
            num = row[0].split("-")[0]   # use only the number as key into our dict
            # create the default entry with num as key and the full
            # "1001-BLACK-P-OS" text as its first element
            d = data.setdefault(num, [row[0].strip()])
            if len(d) == 1:  # first time we add something for this key
                d.append([row[1].strip()])   # start the inner list with the first color
            else:  # second/third color for this key: append to the inner list
                d[1].append(row[1].strip())

# after that you've got a dictionary of your data:

# print(data)
# {'1001': ['1001-BLACK-P-OS', ['BLACK', 'CREAM', 'GRAPHITE']], 
#  '1002': ['1002-BLACK-P-OS', ['BLACK', 'CREAM']]}


# when writing csv with the csv module, always open the file with newline=""
# else you get silly empty lines inside your file. The csv module writes
# all the newlines needed. See the example at
#    https://docs.python.org/3/library/csv.html#csv.writer
with open("done.csv", "w", newline="") as f:
    writer = csv.writer(f, delimiter=",")
    for k in sorted(data.keys()):
        # this variant writes the full first code (1001-BLACK-P-OS) in front - I don't like that
        # writer.writerow([data[k][0], ';'.join(data[k][1])])

        # I like this better - it's just 1001 and then the colors
        writer.writerow([k, ';'.join(data[k][1])])

print("")
with open("done.csv","r") as f:
    print(f.read())

Output:

1001,BLACK;CREAM;GRAPHITE
1002,BLACK;CREAM

Or, using the commented-out line instead:

1001-BLACK-P-OS,BLACK;CREAM;GRAPHITE
1002-BLACK-P-OS,BLACK;CREAM
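If only the short key (1001, 1002) is needed, the same grouping can be sketched a bit more compactly with `collections.defaultdict`. This is a variation on the code above, not a replacement for it; the function name `merge_colors` and the in-memory sample are made up for illustration, and it assumes each non-empty row has at least two columns.

```python
import csv
import io
from collections import defaultdict


def merge_colors(lines):
    """Group colors by the numeric prefix of the product code.

    `lines` is any iterable of CSV text lines (an open file works too).
    """
    colors_by_product = defaultdict(list)
    for row in csv.reader(lines):
        if len(row) < 2:  # skip blank or malformed rows
            continue
        product = row[0].strip().split("-")[0]  # "1001-BLACK-P-OS" -> "1001"
        colors_by_product[product].append(row[1].strip())
    # Join each product's colors into a single semicolon-separated cell.
    return {p: ";".join(colors) for p, colors in colors_by_product.items()}


# Hypothetical sample mirroring the data file above, blank line included.
sample = io.StringIO(
    "1001-BLACK-P-OS , BLACK\n"
    "\n"
    "1001-CREAM-P-OS , CREAM\n"
    "1001-GRAPH-P-OS , GRAPHITE\n"
    "1002-BLACK-P-OS ,BLACK\n"
    "1002-CREAM-P-OS ,CREAM\n"
)
merged = merge_colors(sample)
print(merged)
```

To write the result, loop over `sorted(merged)` and pass `[key, merged[key]]` to `writer.writerow`, opening the output file with `newline=""` as above.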

HTH