Given the following data:
Symbol Date Type Value
518 ZW 2008-01-02 cm 1.204330e+09
519 ZW 2008-01-02 cm_next 1.209600e+09
520 ZW 2008-01-02 p&l 0.000000e+00
521 ZW 2008-01-02 position 0.000000e+00
522 ZW 2008-01-02 rolldate 1.203466e+09
523 ZW 2008-01-02 value 3.114788e+04
524 ZW 2008-01-02 vola 6.256606e+02
1046 ZW 2008-01-03 cm 1.204330e+09
1047 ZW 2008-01-03 cm_next 1.209600e+09
1048 ZW 2008-01-03 p&l 0.000000e+00
1049 ZW 2008-01-03 position 0.000000e+00
1050 ZW 2008-01-03 rolldate 1.203466e+09
1051 ZW 2008-01-03 value 3.202738e+04
1052 ZW 2008-01-03 vola 6.338274e+02
1574 ZW 2008-01-04 cm 1.204330e+09
1575 ZW 2008-01-04 cm_next 1.209600e+09
1576 ZW 2008-01-04 p&l 0.000000e+00
1577 ZW 2008-01-04 position 0.000000e+00
1578 ZW 2008-01-04 rolldate 1.203466e+09
1579 ZW 2008-01-04 value 3.162559e+04
1580 ZW 2008-01-04 vola 6.357563e+02
2102 ZW 2008-01-07 cm 1.204330e+09
2103 ZW 2008-01-07 cm_next 1.209600e+09
2104 ZW 2008-01-07 p&l 0.000000e+00
2105 ZW 2008-01-07 position 0.000000e+00
2106 ZW 2008-01-07 rolldate 1.203466e+09
2107 ZW 2008-01-07 value 3.066630e+04
2108 ZW 2008-01-07 vola 6.381839e+02
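I would like to reshape this table into the following format:
Symbol | Date | cm | cm_next | rolldate | p&l | position | [etc..]
That is, every Type should become its own column, holding the corresponding value for each date.
I tried df.pivot() and df.unstack(), but as far as I understand them, what I want is outside their scope and they do not give me what I am looking for.
I could extract the data for each type from the Type column and glue the pieces back together, but that seems like a rather primitive approach.
Is there a better, more pandaic way to achieve this?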
Answer 0 (score: 1)
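I think you need pivot_table, which aggregates the data with np.mean (the default is aggfunc=np.mean, so any duplicate Symbol/Date/Type combinations are averaged), together with rename_axis (new in pandas 0.18.0) and reset_index: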
print(df.pivot_table(index=['Symbol','Date'], columns='Type', values='Value')
        .rename_axis(None, axis=1)
        .reset_index())
Symbol Date cm cm_next p&l position rolldate \
0 ZW 2008-01-02 1.204330e+09 1.209600e+09 0.0 0.0 1.203466e+09
1 ZW 2008-01-03 1.204330e+09 1.209600e+09 0.0 0.0 1.203466e+09
2 ZW 2008-01-04 1.204330e+09 1.209600e+09 0.0 0.0 1.203466e+09
3 ZW 2008-01-07 1.204330e+09 1.209600e+09 0.0 0.0 1.203466e+09
value vola
0 31147.88 625.6606
1 32027.38 633.8274
2 31625.59 635.7563
3 30666.30 638.1839
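For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The small DataFrame is an assumption: it reproduces only two dates and three of the types from the question. The set_index/unstack variant is not part of the answer above; it is shown as a possible alternative that does no aggregation and raises an error if a Symbol/Date/Type combination appears more than once.

```python
import pandas as pd

# Assumed sample: a cut-down copy of the question's long-format data.
df = pd.DataFrame({
    "Symbol": ["ZW"] * 6,
    "Date": ["2008-01-02"] * 3 + ["2008-01-03"] * 3,
    "Type": ["cm", "value", "vola"] * 2,
    "Value": [1.204330e+09, 3.114788e+04, 6.256606e+02,
              1.204330e+09, 3.202738e+04, 6.338274e+02],
})

# pivot_table averages duplicate (Symbol, Date, Type) combinations by default.
wide = (
    df.pivot_table(index=["Symbol", "Date"], columns="Type", values="Value")
      .rename_axis(None, axis=1)   # drop the "Type" label from the columns axis
      .reset_index()               # turn Symbol and Date back into columns
)

# Alternative without aggregation: fails loudly if a combination repeats.
wide_unstack = (
    df.set_index(["Symbol", "Date", "Type"])["Value"]
      .unstack("Type")
      .rename_axis(None, axis=1)
      .reset_index()
)

print(wide)
```

If the data are known to have exactly one row per Symbol/Date/Type, both variants should give the same wide table; pivot_table is the safer choice when duplicates may exist and averaging them is acceptable.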