这应该很快,但是我正在做的枢轴/组合工作都没有提出我需要的东西。
我有一张这样的表:
Letter Period Amount
YrMnth
2014-12 B 6 0
2014-12 C 8 1
2014-12 C 9 2
2014-12 C 10 3
2014-12 C 6 4
2014-12 C 12 5
2014-12 C 7 6
2014-12 C 11 7
2014-12 D 9 8
2014-12 D 10 9
2014-12 D 1 10
2014-12 D 8 11
2014-12 D 6 12
2014-12 D 12 13
2014-12 D 7 14
2014-12 D 11 15
2014-12 D 4 16
2014-12 D 3 17
2015-01 B 7 18
2015-01 B 8 19
2015-01 B 1 20
2015-01 B 10 21
2015-01 B 11 22
2015-01 B 6 23
2015-01 B 9 24
2015-01 B 3 25
2015-01 B 5 26
2015-01 C 10 27
我想转动它,以便索引基本上是YrMonth和Letter,Period是列,Amount是值。
我一般都理解Pivot,但是当我尝试使用多个索引时,我会遇到错误。我把索引作为一个列,并尝试了这个:
In [76]: df.pivot(index=['YrMnth','Letter'], values='Amount', columns='Period')
但我出现了这个错误:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-76-fc2a4c5f244d> in <module>()
----> 1 df.pivot(index=['YrMnth','Letter'], values='Amount', columns='Period')
/Users/chaseschwalbach/anaconda/lib/python2.7/site-packages/pandas/core/frame.pyc in pivot(self, index, columns, values)
3761 """
3762 from pandas.core.reshape import pivot
-> 3763 return pivot(self, index=index, columns=columns, values=values)
3764
3765 def stack(self, level=-1, dropna=True):
/Users/chaseschwalbach/anaconda/lib/python2.7/site-packages/pandas/core/reshape.pyc in pivot(self, index, columns, values)
331 indexed = Series(self[values].values,
332 index=MultiIndex.from_arrays([index,
--> 333 self[columns]]))
334 return indexed.unstack(columns)
335
/Users/chaseschwalbach/anaconda/lib/python2.7/site-packages/pandas/core/series.pyc in __init__(self, data, index, dtype, name, copy, fastpath)
225 raise_cast_failure=True)
226
--> 227 data = SingleBlockManager(data, index, fastpath=True)
228
229 generic.NDFrame.__init__(self, data, fastpath=True)
/Users/chaseschwalbach/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in __init__(self, block, axis, do_integrity_check, fastpath)
3734 block = make_block(block,
3735 placement=slice(0, len(axis)),
-> 3736 ndim=1, fastpath=True)
3737
3738 self.blocks = [block]
/Users/chaseschwalbach/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in make_block(values, placement, klass, ndim, dtype, fastpath)
2452
2453 return klass(values, ndim=ndim, fastpath=fastpath,
-> 2454 placement=placement)
2455
2456
/Users/chaseschwalbach/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in __init__(self, values, placement, ndim, fastpath)
85 raise ValueError('Wrong number of items passed %d,'
86 ' placement implies %d' % (
---> 87 len(self.values), len(self.mgr_locs)))
88
89 @property
ValueError: Wrong number of items passed 138, placement implies 2
答案 0 :(得分:2)
如果我理解正确,pivot_table
可能更接近您的需要:
df = df.pivot_table(index=["YrMnth", "Letter"], columns="Period", values="Amount")
这给了你:
Period 1 3 4 5 6 7 8 9 10 11 12
YrMnth Letter
2014-12 B NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN
C NaN NaN NaN NaN 4 6 1 2 3 7 5
D 10 17 16 NaN 12 14 11 8 9 15 13
2015-01 B 20 25 NaN 26 23 18 19 24 21 22 NaN
C NaN NaN NaN NaN NaN NaN NaN NaN 27 NaN NaN
正如评论中所建议的那样:
df = pd.pivot_table(df, index=["YrMnth", "Letter"], columns="Period", values="Amount")
Period 1 3 4 5 6 7 8 9 10 11 12
YrMnth Letter
2014-12 B NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN
C NaN NaN NaN NaN 4 6 1 2 3 7 5
D 10 17 16 NaN 12 14 11 8 9 15 13
2015-01 B 20 25 NaN 26 23 18 19 24 21 22 NaN
C NaN NaN NaN NaN NaN NaN NaN NaN 27 NaN NaN
如果有人想澄清前者如何失败,那么也会产生相同的效果。