Converting a Pandas Panel to columns

Time: 2015-07-10 20:00:33

Tags: python pandas

I'm drawing a blank on this, though it should be easy. I have a panel like this:

09-2014    B       35.890847
           C       32.416667
           D       25.541903
10-2014    B       42.654209
           C       36.350000
           D       27.931034
11-2014    B       41.788443
           C       35.784556
           D       26.380000
12-2014    B       37.096036
           C       34.545515
           D       26.844368

I'd like to end up with a DataFrame like this:

            B         C        D
09-2014 35.890847 32.416667 25.541903
10-2014 42.654209 ...

I can't figure out how to transform this data in a vectorized way. Any help?

Thanks, everyone.

2 answers:

Answer 0 (score: 1)

import numpy as np
import pandas as pd

# replicate your data structure
# suppose your DataFrame has these three columns with a default integer index
# if your data is a pd.Panel, use to_frame() first to convert it to a DataFrame.
# ================================================================
dates = pd.date_range('2014-09-01', periods=4, freq='MS')
cat = 'B C D'.split()
multi_index = pd.MultiIndex.from_product([dates, cat], names=['dates', 'cat'])
df = pd.DataFrame(np.random.randn(12), columns=['vals'], index=multi_index).reset_index()

Out[57]: 
        dates cat    vals
0  2014-09-01   B -1.1258
1  2014-09-01   C  0.9008
2  2014-09-01   D -0.1890
3  2014-10-01   B  0.8831
4  2014-10-01   C -0.2379
5  2014-10-01   D -0.1837
6  2014-11-01   B -0.4775
7  2014-11-01   C -0.6184
8  2014-11-01   D  0.6763
9  2014-12-01   B -2.0877
10 2014-12-01   C -0.3631
11 2014-12-01   D  0.3132

# processing
# ==========================================
df.set_index(['dates', 'cat']).unstack()

Out[58]: 
              vals                
cat              B       C       D
dates                             
2014-09-01 -1.1258  0.9008 -0.1890
2014-10-01  0.8831 -0.2379 -0.1837
2014-11-01 -0.4775 -0.6184  0.6763
2014-12-01 -2.0877 -0.3631  0.3132
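
The result above keeps `vals` as an outer column level. If plain `B`, `C`, `D` columns are wanted, as in the question, one option is to select the `vals` column before unstacking. A minimal sketch, assuming the same `dates`, `cat`, and `vals` names used in this answer, with deterministic placeholder values instead of random ones:

```python
import numpy as np
import pandas as pd

# Rebuild a long-format frame shaped like the one in this answer.
# The values 0.0 .. 11.0 are placeholders, not the asker's data.
dates = pd.date_range('2014-09-01', periods=4, freq='MS')
cat = ['B', 'C', 'D']
multi_index = pd.MultiIndex.from_product([dates, cat], names=['dates', 'cat'])
df = pd.DataFrame(np.arange(12.0), columns=['vals'], index=multi_index).reset_index()

# Selecting the single 'vals' column yields a Series, so unstack()
# produces flat columns (B, C, D) rather than a two-level column index.
wide = df.set_index(['dates', 'cat'])['vals'].unstack()
print(wide)
```

Unstacking a Series rather than a one-column DataFrame is what removes the extra `vals` level.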

Answer 1 (score: 1)

This looks like a classic case for DataFrame.unstack.

As the example shows:

>>> index = pd.MultiIndex.from_tuples([('one', 'a'), ('one', 'b'),
...                                    ('two', 'a'), ('two', 'b')])
>>> s = pd.Series(np.arange(1.0, 5.0), index=index)
>>> s
one  a   1
     b   2
two  a   3
     b   4
dtype: float64
>>> s.unstack(level=-1)
     a   b
one  1  2
two  3  4
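
Applied to the data in the question, the same idea might look like the sketch below. It assumes the original object is a Series with a two-level index (month label first, then category). The level names `month` and `cat` are illustrative, and the values are copied from the question.

```python
import pandas as pd

# Two-level index: month label first, then category.
months = ['09-2014', '10-2014', '11-2014', '12-2014']
cats = ['B', 'C', 'D']
index = pd.MultiIndex.from_product([months, cats], names=['month', 'cat'])

# Values copied from the question, in the same row order.
values = [35.890847, 32.416667, 25.541903,
          42.654209, 36.350000, 27.931034,
          41.788443, 35.784556, 26.380000,
          37.096036, 34.545515, 26.844368]
s = pd.Series(values, index=index)

# Move the inner 'cat' level into the columns: one row per month.
result = s.unstack(level='cat')
print(result)
```

If the real index levels are unnamed, `s.unstack(level=-1)` pivots the innermost level in the same way.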