I'm drawing a blank on this, and it should be easy. I have grouped data that looks like this:
09-2014  B    35.890847
         C    32.416667
         D    25.541903
10-2014  B    42.654209
         C    36.350000
         D    27.931034
11-2014  B    41.788443
         C    35.784556
         D    26.380000
12-2014  B    37.096036
         C    34.545515
         D    26.844368
I'd like to end up with a DataFrame like this:
         B          C          D
09-2014  35.890847  32.416667  25.541903
10-2014  42.654209  ...
I can't figure out how to reshape this data in a vectorized way. Any help?
Thanks, everyone.
Answer 0 (score: 1)
import numpy as np
import pandas as pd
# replicate your datastructure
# suppose your dataframe has these three columns with default integer index
# if your data is pd.Panel() type, then use to_frame() first to convert it to dataframe.
# ================================================================
dates = pd.date_range('2014-09-01', periods=4, freq='MS')
cat = 'B C D'.split()
multi_index = pd.MultiIndex.from_product([dates, cat], names=['dates', 'cat'])
df = pd.DataFrame(np.random.randn(12), columns=['vals'], index=multi_index).reset_index()
Out[57]:
dates cat vals
0 2014-09-01 B -1.1258
1 2014-09-01 C 0.9008
2 2014-09-01 D -0.1890
3 2014-10-01 B 0.8831
4 2014-10-01 C -0.2379
5 2014-10-01 D -0.1837
6 2014-11-01 B -0.4775
7 2014-11-01 C -0.6184
8 2014-11-01 D 0.6763
9 2014-12-01 B -2.0877
10 2014-12-01 C -0.3631
11 2014-12-01 D 0.3132
# processing
# ==========================================
df.set_index(['dates', 'cat']).unstack()
Out[58]:
              vals
cat              B       C       D
dates
2014-09-01 -1.1258  0.9008 -0.1890
2014-10-01  0.8831 -0.2379 -0.1837
2014-11-01 -0.4775 -0.6184  0.6763
2014-12-01 -2.0877 -0.3631  0.3132
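Note that the result above carries an extra `vals` level on the columns. If you want plain `B`, `C`, `D` columns as in the question, one option is to select the column before unstacking. Below is a minimal sketch that reuses the same made-up frame; the names `wide` and `wide_pivot` are only illustrative, and `pivot` is shown as an alternative spelling of the same reshape rather than as the required approach.

```python
import numpy as np
import pandas as pd

# Same illustrative frame as above (random values, three columns).
dates = pd.date_range('2014-09-01', periods=4, freq='MS')
cat = 'B C D'.split()
multi_index = pd.MultiIndex.from_product([dates, cat], names=['dates', 'cat'])
df = pd.DataFrame(np.random.randn(12), columns=['vals'], index=multi_index).reset_index()

# Selecting 'vals' first yields a Series, so unstack() returns a frame
# whose columns are just B, C, D (no extra 'vals' level on top).
wide = df.set_index(['dates', 'cat'])['vals'].unstack()

# pivot() expresses the same reshape directly from the flat frame.
wide_pivot = df.pivot(index='dates', columns='cat', values='vals')

print(wide)
```

Either form should give one row per date and one column per category.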
Answer 1 (score: 1)
This looks like a classic case for DataFrame.unstack.
As the example in the documentation shows:
>>> index = pd.MultiIndex.from_tuples([('one', 'a'), ('one', 'b'),
... ('two', 'a'), ('two', 'b')])
>>> s = pd.Series(np.arange(1.0, 5.0), index=index)
>>> s
one a 1
b 2
two a 3
b 4
dtype: float64
>>> s.unstack(level=-1)
a b
one 1 2
two 3 4
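Applied to the numbers in the question, the same call might look like the sketch below. It assumes the data is a Series with a two-level MultiIndex (month string, then category); the question does not state the exact type, so the construction of `s` here is a guess made only for illustration.

```python
import pandas as pd

# Assumption: the question's data is a Series indexed by (month, category).
index = pd.MultiIndex.from_product(
    [['09-2014', '10-2014', '11-2014', '12-2014'], ['B', 'C', 'D']]
)
values = [35.890847, 32.416667, 25.541903,
          42.654209, 36.350000, 27.931034,
          41.788443, 35.784556, 26.380000,
          37.096036, 34.545515, 26.844368]
s = pd.Series(values, index=index)

# Move the innermost index level (the category) into the columns.
result = s.unstack(level=-1)
print(result)
```

If the data is a DataFrame with a single value column instead, selecting that column first gives a Series that can be unstacked the same way.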