I have a CSV file that looks like this:
Group A Group B Group C
ID Date1 Date2 Date1 Date2 Date1 Date2
0 0.030626 0.494912 0.364742 0.320088 0.364742 0.364742
1 0.178368 0.857469 0.628677 0.705226 0.364742 0.364742
How can I read it into a DataFrame and pivot it so that it produces the following table?
ID Date Group A Group B Group C
0 Date1 0.030626 0.364742 0.364742
0 Date2 0.494912 0.320088 0.364742
1 Date1 0.178368 0.628677 0.364742
1 Date2 0.857469 0.705226 0.364742
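Answer 0 (score: 0)
To reshape the DataFrame, you can call stack() and then reset_index(). stack() moves the innermost column level (Date1/Date2) into the row index, and reset_index() turns the resulting index levels back into ordinary columns: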
Code:
df = df.stack().reset_index().rename(
    columns={'level_0': 'ID', 'level_1': 'Date'})
print(df)
Test data:
from io import StringIO
import pandas as pd

csv_data = StringIO('\n'.join(x.strip() for x in u"""
Group A,Group A,Group B,Group B,Group C,Group C
Date1,Date2,Date1,Date2,Date1,Date2
0.030626,0.494912,0.364742,0.320088,0.364742,0.364742
0.178368,0.857469,0.628677,0.705226,0.364742,0.364742
""".split('\n')[1:-1]))
# header=[0, 1] reads the first two rows as a two-level column index
df = pd.read_csv(csv_data, sep=',', header=[0, 1])
Results:
ID Date Group A Group B Group C
0 0 Date1 0.030626 0.364742 0.364742
1 0 Date2 0.494912 0.320088 0.364742
2 1 Date1 0.178368 0.628677 0.364742
3 1 Date2 0.857469 0.705226 0.364742
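For reference, here is a self-contained sketch that puts the read and the reshape together. It assumes the file really is comma-separated with the two header rows shown in the test data. Instead of renaming `level_0`/`level_1` afterwards, it names the index and the column level before stacking. The variable names `csv_text`, `wide`, and `long_df` are only illustrative.

```python
from io import StringIO

import pandas as pd

# Same values as the test data above, written as comma-separated text.
csv_text = """Group A,Group A,Group B,Group B,Group C,Group C
Date1,Date2,Date1,Date2,Date1,Date2
0.030626,0.494912,0.364742,0.320088,0.364742,0.364742
0.178368,0.857469,0.628677,0.705226,0.364742,0.364742
"""

# header=[0, 1] builds a two-level column index: (group, date).
wide = pd.read_csv(StringIO(csv_text), header=[0, 1])

# Name the row index and the inner column level so that
# reset_index() yields "ID" and "Date" directly.
wide.index.name = "ID"
wide.columns.names = [None, "Date"]

# Move the "Date" column level into the rows, then flatten the index.
long_df = wide.stack(level="Date").reset_index()
print(long_df)
```

Depending on the pandas version, `stack()` may emit a FutureWarning about its new implementation; the reshaped result for this data should be the same either way.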
Test data 2:
If your data is not actually in CSV format and is instead laid out as shown in your question:
data = StringIO(
u""" Group A Group B Group C
ID Date1 Date2 Date1 Date2 Date1 Date2
0 0.030626 0.494912 0.364742 0.320088 0.364742 0.364742
1 0.178368 0.857469 0.628677 0.705226 0.364742 0.364742
""")
df = pd.read_fwf(data, colspecs='infer', header=[0, 1]).rename(
columns={'Unnamed: 0_level_0': 'ID',
'Unnamed: 2_level_0': 'Group A',
'Unnamed: 4_level_0': 'Group B',
'Unnamed: 6_level_0': 'Group C'})
del df['ID']
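If the inferred fixed-width columns do not line up for your file, one alternative is to skip the two header rows and attach the column labels by hand. This is only a sketch: it assumes the values are whitespace-separated, that the first field on each data row is the ID, and that the group and date names are known in advance. The exact spacing in `text` and the names `raw` and `long_df` are made up for illustration.

```python
from io import StringIO

import pandas as pd

# Whitespace-aligned text resembling the layout in the question
# (the exact spacing here is an assumption).
text = """            Group A             Group B             Group C
ID      Date1     Date2     Date1     Date2     Date1     Date2
0    0.030626  0.494912  0.364742  0.320088  0.364742  0.364742
1    0.178368  0.857469  0.628677  0.705226  0.364742  0.364742
"""

# Skip both header rows; the group names contain spaces, so they
# cannot be split reliably on whitespace.
raw = pd.read_csv(StringIO(text), sep=r"\s+", header=None,
                  skiprows=2, index_col=0)
raw.index.name = "ID"

# Rebuild the two-level column index from the known labels.
raw.columns = pd.MultiIndex.from_product(
    [["Group A", "Group B", "Group C"], ["Date1", "Date2"]],
    names=[None, "Date"])

# Same reshape as before: move "Date" into the rows.
long_df = raw.stack(level="Date").reset_index()
print(long_df)
```

The trade-off is that the labels are hard-coded, so this only fits files whose groups and dates are fixed.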