Can you help me reshape this Pandas DataFrame:
df = pd.DataFrame({
'Clasif1': [np.nan, np.nan, 'PRE', 'POST'],
'Currency': [np.nan, np.nan, 'LC', 'USD'],
'Unnamed: 1': ['A','01/01/2018',1,7],
'Unnamed: 2': ['A','02/01/2018',2,8],
'Unnamed: 3': ['A','03/01/2018',3,9],
'Unnamed: 4': ['B','01/01/2018',4,10],
'Unnamed: 5': ['B','02/01/2018',5,11],
'Unnamed: 6': ['B','03/01/2018',6,12]
})
into this:
df_result = pd.DataFrame({
'Clasif1': ['PRE', 'POST','PRE', 'POST','PRE', 'POST','PRE', 'POST','PRE', 'POST','PRE', 'POST'],
'Currency': ['LC', 'USD','LC', 'USD','LC', 'USD','LC', 'USD','LC', 'USD','LC', 'USD'],
'A/B': ['A','A','A','A','A','A','B','B','B','B','B','B'],
'Date': ['01/01/2018','01/01/2018','02/01/2018','02/01/2018','03/01/2018','03/01/2018','01/01/2018','01/01/2018','02/01/2018','02/01/2018','03/01/2018','03/01/2018'],
'Value': [1,7,2,8,3,9,4,10,5,11,6,12]
})
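The row order of the resulting DataFrame does not need to match the expected one.
Thanks for your help,
Answer 0: (score: 1)
This is more of a custom solution, but as long as you can guarantee that your data follows this structure (the first two rows hold the A/B label and the date for each value column), you can use stack: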
s=df.iloc[:,2:].T.set_index([0,1]).stack()
s=s.to_frame('V').reset_index(level=[0,1])
s=s.join(df.iloc[:,:2]).sort_values([0,1])
s
Out[226]:
   0           1   V  Clasif1  Currency
2  A  01/01/2018   1      PRE        LC
3  A  01/01/2018   7     POST       USD
2  A  02/01/2018   2      PRE        LC
3  A  02/01/2018   8     POST       USD
2  A  03/01/2018   3      PRE        LC
3  A  03/01/2018   9     POST       USD
2  B  01/01/2018   4      PRE        LC
3  B  01/01/2018  10     POST       USD
2  B  02/01/2018   5      PRE        LC
3  B  02/01/2018  11     POST       USD
2  B  03/01/2018   6      PRE        LC
3  B  03/01/2018  12     POST       USD
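Answer 1: (score: 1)
As with the answer above, this is a very custom solution for this particular DataFrame. It turns the first two rows into a column MultiIndex and then melts: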
colindx=pd.MultiIndex.from_arrays(df.iloc[0:2,2:].values)
df_out = df.iloc[2:,:].set_index(['Clasif1','Currency'])
df_out.columns = colindx
df_out.reset_index().melt(id_vars=['Clasif1','Currency'])
Output:
    Clasif1  Currency  variable_0  variable_1  value
0       PRE        LC           A  01/01/2018      1
1      POST       USD           A  01/01/2018      7
2       PRE        LC           A  02/01/2018      2
3      POST       USD           A  02/01/2018      8
4       PRE        LC           A  03/01/2018      3
5      POST       USD           A  03/01/2018      9
6       PRE        LC           B  01/01/2018      4
7      POST       USD           B  01/01/2018     10
8       PRE        LC           B  02/01/2018      5
9      POST       USD           B  02/01/2018     11
10      PRE        LC           B  03/01/2018      6
11     POST       USD           B  03/01/2018     12
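If you want the exact column names from the question ('A/B', 'Date', 'Value'), here is a minimal sketch of a variant that melts first and then looks the labels up from the two header rows, so no MultiIndex is needed. It assumes the same layout as in the question: rows 0 and 1 carry the labels and the first two columns are the identifiers. The names header, body, long, result and the temporary column 'col' are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Clasif1': [np.nan, np.nan, 'PRE', 'POST'],
    'Currency': [np.nan, np.nan, 'LC', 'USD'],
    'Unnamed: 1': ['A', '01/01/2018', 1, 7],
    'Unnamed: 2': ['A', '02/01/2018', 2, 8],
    'Unnamed: 3': ['A', '03/01/2018', 3, 9],
    'Unnamed: 4': ['B', '01/01/2018', 4, 10],
    'Unnamed: 5': ['B', '02/01/2018', 5, 11],
    'Unnamed: 6': ['B', '03/01/2018', 6, 12],
})

# Assumption: rows 0-1 hold the A/B label and the date of each value column.
header = df.iloc[:2, 2:]
# Assumption: every row from position 2 onward is an actual record.
body = df.iloc[2:]

# Wide to long: one row per (record, original column name) pair.
long = body.melt(id_vars=['Clasif1', 'Currency'],
                 var_name='col', value_name='Value')

# Look up the labels for each original column name in the header rows.
long['A/B'] = long['col'].map(header.iloc[0])
long['Date'] = long['col'].map(header.iloc[1])

# The value columns were mixed-type (object), so cast the numbers back.
long['Value'] = long['Value'].astype(int)

result = long[['Clasif1', 'Currency', 'A/B', 'Date', 'Value']]
print(result)
```

Because melt walks the value columns from left to right, the rows happen to come out in the same order as df_result here, although the question does not require that.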