Rearranging data: splitting rows into multiple columns

Time: 2019-03-26 22:28:50

Tags: python python-3.x csv

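My csv file has more than 1 million records: (https://i.imgur.com/rhIhy5u.png). I need to arrange the data differently, so that the repeated "parameters" themselves become columns/rows, e.g. category1, category2, category3 (there are more than 20 categories, with no duplicates), while all the data keep their relationships.

I tried using "pandas" and "csv" in Python, but I am completely new to this and have never worked with this kind of data.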

import csv

import numpy as np

with open('./data.csv', 'r', newline='') as _filehandler:
    csv_file_reader = csv.DictReader(_filehandler)

    # Collect each distinct 'Param' value once, in first-seen order
    param = []
    for row in csv_file_reader:
        if row['Param'] not in param:
            param.append(row['Param'])

# Join the unique parameter names into one ';'-separated string
col = ""
for p in param:
    col += str(p) + '; '

print(col)

# The original attempt passed an undefined name 'parameters' here; 'param' is the list built above
np.savetxt('./SortedWexdord.csv', param, delimiter=';', fmt='%s')

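I tried to think it through, but data is not my specialty either. Any ideas?

1 answer:

Answer 0: (score: 1)

This should work. If you need multiple values per row normalized like this, you can edit line 9 (the one that starts with category) to take a list of values instead of just row[1].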

import csv

data = {}

with open('data.csv', 'r') as file:
    reader = csv.reader(file)
    next(reader) # Skip header row
    for row in reader:
        category, value = row[0], row[1] # Assumes category is in column 0 and target value is in column 1
        if category in data:
            data[category].append(value)
        else:
            data[category] = [value] # New entry only for each unique category

with open('output.csv', 'w', newline='') as file: # Python 3's csv module needs text mode, not 'wb'; newline='' avoids double newlines on windows
    writer = csv.writer(file)
    writer.writerow(['Category', 'Value'])
    for category in data:
        print([category] + data[category])
        writer.writerow([category] + data[category]) # Make a list starting with category and then listing each value
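
The question asks for the categories to become columns, and the answer mentions taking a list of values instead of just row[1]. Below is a minimal sketch of both ideas, not the answerer's own code. The helper names group_by_category and to_columns, the value_cols parameter, and the sample rows are all hypothetical, since the real layout of data.csv is only visible in the linked image. It also assumes that values under different categories are independent of each other; if rows share an identifier that ties them together, the table would need to be keyed on that identifier instead.

```python
import csv
import io
from itertools import zip_longest


def group_by_category(rows, key_col=0, value_cols=(1,)):
    """Group the values found in value_cols under the category in key_col."""
    groups = {}  # dicts preserve insertion order, so categories stay in first-seen order
    for row in rows:
        values = [row[i] for i in value_cols]
        # Store a bare value for a single column, a tuple when several are requested
        item = values[0] if len(values) == 1 else tuple(values)
        groups.setdefault(row[key_col], []).append(item)
    return groups


def to_columns(groups, fill=""):
    """Return a header row plus data rows in which each category is one column."""
    header = list(groups)
    # zip_longest pads shorter categories with `fill` so every row has the same width
    body = [list(r) for r in zip_longest(*groups.values(), fillvalue=fill)]
    return [header] + body


# Hypothetical sample standing in for data.csv
sample = io.StringIO(
    "Param,Value\n"
    "category1,10\n"
    "category2,20\n"
    "category1,11\n"
    "category3,30\n"
    "category2,21\n"
)
reader = csv.reader(sample)
next(reader)  # skip the header row
groups = group_by_category(reader)
table = to_columns(groups)

# Write to an in-memory buffer here; a real file would be opened with
# open('output.csv', 'w', newline='')
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(table)
print(out.getvalue())
```

Passing value_cols=(1, 2) would keep two fields per record as a tuple, which is one way to carry related values along with each category. With about a million rows and roughly 20 categories, the grouped dictionary should normally fit in memory, but that depends on how wide each row is.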