I have a DataFrame like this:
Variable Date Value
0 Variable1 Date1 Valeur 1
1 Variable1 Date2 Valeur 2
2 Variable1 Date3 Valeur 3
3 Variable2 Date4 Valeur 4
4 Variable2 Date5 Valeur 5
I would like to reshape it like this:
Date Variable1 Variable2
0 Date1 Valeur 1 None
1 Date2 Valeur 2 None
2 Date3 Valeur 3 None
3 Date4 None Valeur 4
4 Date5 None Valeur 5
How can I do this transformation in Python with pandas or numpy? Thanks for your help.
Answer 0 (score: 4)
I think you need pivot, followed by rename_axis (new in pandas 0.18.0) and reset_index:
# The outer parentheses let the method chain span several lines.
print(df.pivot(index='Date', columns='Variable', values='Value')
        .rename_axis(None, axis=1)
        .reset_index())
Date Variable1 Variable2
0 Date1 Valeur 1 None
1 Date2 Valeur 2 None
2 Date3 Valeur 3 None
3 Date4 None Valeur 4
4 Date5 None Valeur 5
Sample:
import pandas as pd
df = pd.DataFrame({'Variable': {0: 'a', 1: 'a', 2: 'a', 3: 'b', 4: 'b'},
                   'Date': {0: pd.Timestamp('2016-02-05 00:00:00'),
                            1: pd.Timestamp('2016-02-06 00:00:00'),
                            2: pd.Timestamp('2016-02-07 00:00:00'),
                            3: pd.Timestamp('2016-02-08 00:00:00'),
                            4: pd.Timestamp('2016-02-09 00:00:00')},
                   'Value': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5}},
                  columns=['Variable', 'Date', 'Value'])
print(df)
Variable Date Value
0 a 2016-02-05 1
1 a 2016-02-06 2
2 a 2016-02-07 3
3 b 2016-02-08 4
4 b 2016-02-09 5
print(df.pivot(index='Date', columns='Variable', values='Value')
        .rename_axis(None, axis=1)
        .reset_index())
Date a b
0 2016-02-05 1.0 NaN
1 2016-02-06 2.0 NaN
2 2016-02-07 3.0 NaN
3 2016-02-08 NaN 4.0
4 2016-02-09 NaN 5.0
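To make the role of each call clearer, the same chain can be split into three separate steps. This is only a sketch on the sample data above; the intermediate names wide and result are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Same sample data as above, written with plain lists.
df = pd.DataFrame({
    'Variable': ['a', 'a', 'a', 'b', 'b'],
    'Date': pd.to_datetime(['2016-02-05', '2016-02-06', '2016-02-07',
                            '2016-02-08', '2016-02-09']),
    'Value': [1, 2, 3, 4, 5],
})

# Step 1: 'Date' becomes the index and each distinct 'Variable'
# becomes its own column; combinations with no row are left missing.
wide = df.pivot(index='Date', columns='Variable', values='Value')

# Step 2: remove the 'Variable' label that pivot leaves on the columns axis.
wide = wide.rename_axis(None, axis=1)

# Step 3: move 'Date' from the index back into an ordinary column.
result = wide.reset_index()
print(result)
```

The integer values come back as floats because the missing cells are filled with NaN.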
Answer 1 (score: 0)
As a complement, here is a way to split a column based on a condition:
import numpy as np
import pandas as pd
df = pd.DataFrame({'Variable': np.arange(5)}, index=pd.date_range('2016/4/26', periods=5))
"""
Variable
2016-04-26 0
2016-04-27 1
2016-04-28 2
2016-04-29 3
2016-04-30 4
"""
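cond = df < 3  # boolean DataFrame: True where the value is below 3
# df[cond] keeps matching values and sets the rest to NaN; the suffixes rename the overlapping column.
df[cond].join(df[~cond], lsuffix='1', rsuffix='2')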
"""
Variable1 Variable2
2016-04-26 0.0 NaN
2016-04-27 1.0 NaN
2016-04-28 2.0 NaN
2016-04-29 NaN 3.0
2016-04-30 NaN 4.0
"""