Pivoting a CSV with a multi-row header in pandas

Time: 2017-03-30 18:20:15

Tags: python csv pandas

I have a CSV file that looks like this:

   Group A             Group B             Group C                 
ID Date1     Date2     Date1     Date2     Date1     Date2
0  0.030626  0.494912  0.364742  0.320088  0.364742  0.364742
1  0.178368  0.857469  0.628677  0.705226  0.364742  0.364742

How can I read it into a DataFrame and pivot it so that it produces the following table?

ID Date   Group A   Group B   Group C
0  Date1  0.030626  0.364742  0.364742
0  Date2  0.494912  0.320088  0.364742
1  Date1  0.178368  0.628677  0.364742
1  Date2  0.857469  0.705226  0.364742

1 answer:

Answer 0 (score: 0)

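To reshape the DataFrame, you can call stack() and then reset_index(). stack() moves the innermost column level (Date1/Date2) into the row index, and reset_index() turns the index levels back into ordinary columns, which are then renamed to ID and Date.

Code: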

df = df.stack().reset_index().rename(
    columns={'level_0': 'ID', 'level_1': 'Date'})
print(df)

Test data:

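from io import StringIO

import pandas as pd

# Strip the indentation from each line and drop the empty first and last lines
csv_data = StringIO('\n'.join(x.strip() for x in """
    Group A,Group A,Group B,Group B,Group C,Group C
    Date1,Date2,Date1,Date2,Date1,Date2
    0.030626,0.494912,0.364742,0.320088,0.364742,0.364742
    0.178368,0.857469,0.628677,0.705226,0.364742,0.364742
""".split('\n')[1:-1]))

# header=[0, 1] reads the first two rows as a two-level column header
df = pd.read_csv(csv_data, sep=',', header=[0, 1])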

Results:

   ID   Date   Group A   Group B   Group C
0   0  Date1  0.030626  0.364742  0.364742
1   0  Date2  0.494912  0.320088  0.364742
2   1  Date1  0.178368  0.628677  0.364742
3   1  Date2  0.857469  0.705226  0.364742
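
Because the answer shows the reshaping code before the test data, it may help to see the pieces in execution order. The following is a minimal, self-contained sketch, assuming Python 3 and the same comma-separated test data. The names csv_text, wide, stacked and long_df are illustrative and not part of the original answer, and the index levels are named directly instead of renaming level_0/level_1 afterwards. It should print a table like the one above.

```python
from io import StringIO

import pandas as pd

# Same comma-separated test data as above: two header rows, then two data rows.
csv_text = """\
Group A,Group A,Group B,Group B,Group C,Group C
Date1,Date2,Date1,Date2,Date1,Date2
0.030626,0.494912,0.364742,0.320088,0.364742,0.364742
0.178368,0.857469,0.628677,0.705226,0.364742,0.364742
"""

# header=[0, 1] turns the two header rows into a two-level column MultiIndex.
wide = pd.read_csv(StringIO(csv_text), header=[0, 1])

# stack() moves the innermost column level (Date1/Date2) into the row index,
# leaving one column per group.
stacked = wide.stack()

# Name the two index levels so that reset_index() produces the ID and Date
# columns directly (an alternative to renaming level_0/level_1).
stacked.index.names = ["ID", "Date"]
long_df = stacked.reset_index()
long_df.columns.name = None  # make sure no leftover level name is printed

print(long_df)
```

Naming the index levels up front is only a stylistic choice; the rename() call in the answer gives the same column names.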

Test data 2:

If your data is not actually in CSV format but is laid out in fixed-width columns as shown in your question:

data = StringIO(
u"""           Group A             Group B             Group C                 
    ID Date1     Date2     Date1     Date2     Date1     Date2
    0  0.030626  0.494912  0.364742  0.320088  0.364742  0.364742
    1  0.178368  0.857469  0.628677  0.705226  0.364742  0.364742
""")

df = pd.read_fwf(data, colspecs='infer', header=[0, 1]).rename(
    columns={'Unnamed: 0_level_0': 'ID',
             'Unnamed: 2_level_0': 'Group A',
             'Unnamed: 4_level_0': 'Group B',
             'Unnamed: 6_level_0': 'Group C'})
# Drop the ID column; the default integer index already holds the same values
del df['ID']