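Edit (May 19, 2015): I have just confirmed that this issue was fixed in version 0.16.1, so it should not be a problem in the latest release.
These should all give the same result, right?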
df.groupby(level=0).transform('mean')
df.groupby(level=0)['x'].transform(np.nanmean)
df.groupby(level=0)['x'].transform('mean')
The first two work fine, but the third one fails. Could this be a bug?
import numpy as np
import pandas as pd
df = pd.DataFrame({'x': [1, np.nan, 3, 4]}, index=[1, 1, 2, 2])
df
Out[686]:
x
1 1
1 NaN
2 3
2 4
df.groupby(level=0).transform('mean')
Out[687]:
x
1 1.0
1 1.0
2 3.5
2 3.5
df.groupby(level=0)['x'].transform(np.nanmean)
Out[688]:
1 1.0
1 1.0
2 3.5
2 3.5
Name: x, dtype: float64
That all works as expected, but this does not:
df.groupby(level=0)['x'].transform('mean')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-691-24761ee742fd> in <module>()
----> 1 df.groupby(level=0)['x'].transform('mean')
C:\Users\eilerj\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\groupby.pyc in transform(self, func, *args, **kwargs)
2411 # if string function
2412 if isinstance(func, compat.string_types):
-> 2413 return self._transform_fast(lambda : getattr(self, func)(*args, **kwargs))
2414
2415 # do we have a cython function
C:\Users\eilerj\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\groupby.pyc in _transform_fast(self, func)
2457 values = np.repeat(values, com._ensure_platform_int(counts))
2458
-> 2459 return self._set_result_index_ordered(Series(values))
2460
2461 def filter(self, func, dropna=True, *args, **kwargs):
C:\Users\eilerj\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\groupby.pyc in _set_result_index_ordered(self, result)
495 result = result.sort_index()
496
--> 497 result.index = self.obj.index
498 return result
499
C:\Users\eilerj\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\generic.pyc in __setattr__(self, name, value)
1978 try:
1979 object.__getattribute__(self, name)
-> 1980 return object.__setattr__(self, name, value)
1981 except AttributeError:
1982 pass
C:\Users\eilerj\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\lib.pyd in pandas.lib.AxisProperty.__set__ (pandas\lib.c:38795)()
C:\Users\eilerj\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\series.pyc in _set_axis(self, axis, labels, fastpath)
266 object.__setattr__(self, '_index', labels)
267 if not fastpath:
--> 268 self._data.set_axis(axis, labels)
269
270 def _set_subtyp(self, is_all_dates):
C:\Users\eilerj\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\internals.pyc in set_axis(self, axis, new_labels)
2209 if new_len != old_len:
2210 raise ValueError('Length mismatch: Expected axis has %d elements, '
-> 2211 'new values have %d elements' % (old_len, new_len))
2212
2213 self.axes[axis] = new_labels
ValueError: Length mismatch: Expected axis has 3 elements, new values have 4 elements
Answer 0 (score: 0)
I have confirmed that this was indeed fixed in version 0.16.1. See the comments by @DSM and @AndyHayden above.
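For anyone who wants to check their own installation, here is a minimal sketch that rebuilds the frame from the question and compares the three spellings. It assumes a pandas version that includes the fix. The variable names are mine, and np.nanmean is wrapped in a lambda only so that it is applied group by group as an ordinary function. The last two lines are my reading of the traceback rather than something confirmed in the comments: the "3 elements versus 4 elements" mismatch is consistent with per-group counts that skip NaN.

```python
import numpy as np
import pandas as pd

# Same data as in the question: one NaN in the first group.
df = pd.DataFrame({"x": [1, np.nan, 3, 4]}, index=[1, 1, 2, 2])
grouped = df.groupby(level=0)

# The three spellings from the question.
frame_mean = grouped.transform("mean")["x"]
series_nanmean = grouped["x"].transform(lambda s: np.nanmean(s.to_numpy()))
series_mean = grouped["x"].transform("mean")

# count() skips NaN, size() does not: 1 + 2 = 3 versus 2 + 2 = 4.
# (Assumption: this is where the 3-vs-4 mismatch in the traceback came from.)
non_nan_total = int(grouped["x"].count().sum())
row_total = int(grouped["x"].size().sum())

print(series_mean.tolist())      # [1.0, 1.0, 3.5, 3.5]
print(non_nan_total, row_total)  # 3 4
```

On an affected version (0.16.0 in the question), passing the function itself, as in transform(np.nanmean), is the spelling the question shows working.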