NotImplementedError when trying to assign values to a new column of a groupby object

Asked: 2014-06-13 23:13:31

Tags: python python-2.7 pandas

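The subtraction works and returns a Series whose index matches that of the groupby object (the months of the year, i.e. 1-12). Creating a new column and assigning the values to it seems to raise a NotImplementedError.

I am trying to subtract 12 monthly values from the corresponding months in the original DataFrame, i.e. the value for 1 (January) should be subtracted from every data point in January, and so on.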

test = df
grouped = test.groupby(test.index.month)
values_to_subtract = grouped['A'].median() - test['A'].median()
print values_to_subtract
grouped['new col'] = grouped['B'] - values_to_subtract
print grouped['new col']

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-226-9bff427dc855> in <module>()
      3 values_to_subtract = grouped['A'].median() - test['A'].median()
      4 print values_to_subtract
----> 5 grouped['new col'] = grouped['B'] - values_to_subtract
      6 print grouped['new col']

C:\Users\AppData\Local\Enthought\Canopy32\User\lib\site-packages\pandas\core\ops.pyc in wrapper(left, right, name)
    503             if hasattr(lvalues, 'values'):
    504                 lvalues = lvalues.values
--> 505             return left._constructor(wrap_results(na_op(lvalues, rvalues)),
    506                                      index=left.index, name=left.name,
    507                                      dtype=dtype)

C:\Users\AppData\Local\Enthought\Canopy32\User\lib\site-packages\pandas\core\ops.pyc in na_op(x, y)
    443         try:
    444             result = expressions.evaluate(op, str_rep, x, y,
--> 445                                           raise_on_error=True, **eval_kwargs)
    446         except TypeError:
    447             if isinstance(y, (pa.Array, pd.Series)):

C:\Users\AppData\Local\Enthought\Canopy32\User\lib\site-packages\pandas\computation\expressions.pyc in evaluate(op, op_str, a, b, raise_on_error, use_numexpr, **eval_kwargs)
    210     if use_numexpr:
    211         return _evaluate(op, op_str, a, b, raise_on_error=raise_on_error,
--> 212                          **eval_kwargs)
    213     return _evaluate_standard(op, op_str, a, b, raise_on_error=raise_on_error)
    214 

C:\Users\AppData\Local\Enthought\Canopy32\User\lib\site-packages\pandas\computation\expressions.pyc in _evaluate_standard(op, op_str, a, b, raise_on_error, **eval_kwargs)
     63     if _TEST_MODE:
     64         _store_test_result(False)
---> 65     return op(a, b)
     66 
     67 

C:\Users\AppData\Local\Enthought\Canopy32\User\lib\site-packages\pandas\core\ops.pyc in <lambda>(x, y)
     70         rmul=arith_method(operator.mul, names('rmul'), op('*'),
     71                           default_axis=default_axis, reversed=True),
---> 72         rsub=arith_method(lambda x, y: y - x, names('rsub'), op('-'),
     73                           default_axis=default_axis, reversed=True),
     74         rtruediv=arith_method(lambda x, y: operator.truediv(y, x),

C:\Users\AppData\Local\Enthought\Canopy32\User\lib\site-packages\pandas\core\groupby.pyc in __getitem__(self, key)
    487 
    488     def __getitem__(self, key):
--> 489         raise NotImplementedError
    490 
    491     def _make_wrapper(self, name):

NotImplementedError: 

1    -3.40
2    -3.60
3    -5.30
4     0.15
5     1.80
6    -0.80
7     2.15
8     6.70
9     3.90
10    1.45
11   -0.75
12   -2.70
Name: A, dtype: float64

1 answer:

Answer 0: (score: 1)

I think you want to use transform here:

test['A'] - grouped['A'].transform("median")
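To illustrate why transform fits here, the following is a minimal sketch on made-up data. The date range, the values in column A, and the names `monthly_median` and `demeaned` are assumptions for illustration, not part of the question. The point is that transform("median") returns a Series aligned with the original index, whereas median() alone returns only 12 entries.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one value per day over two years, indexed by date.
idx = pd.date_range("2012-01-01", "2013-12-31", freq="D")
test = pd.DataFrame({"A": np.arange(len(idx), dtype=float)}, index=idx)

grouped = test.groupby(test.index.month)

# transform broadcasts each month's median back onto every row of that month,
# so the result has the same index as `test` rather than 12 entries.
monthly_median = grouped["A"].transform("median")

# Row-wise subtraction now aligns on the original DatetimeIndex.
demeaned = test["A"] - monthly_median
```

Because both operands share the same index, no manual lookup by month is needed.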

This is some odd code...

grouped = test.groupby(test.index.month)
values_to_subtract = grouped['A'].median() - test['A'].median()

Now values_to_subtract is a Series (assuming there is only one 'A' column), while grouped['B'] is a SeriesGroupBy object... subtracting one from the other doesn't make sense!

grouped['B'] - values_to_subtract
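A small sketch of the mismatch, and of one possible way around it, on made-up data. The DataFrame contents and the names `kind_left`, `kind_right`, `per_row` and `result` are illustrative assumptions. It looks up each row's month in the 12-entry Series so that the subtraction is Series minus Series:

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the asker's DataFrame.
idx = pd.date_range("2013-01-01", "2013-12-31", freq="D")
test = pd.DataFrame({"A": np.arange(len(idx), dtype=float),
                     "B": np.ones(len(idx))}, index=idx)

grouped = test.groupby(test.index.month)
values_to_subtract = grouped["A"].median() - test["A"].median()

# The two operands are different kinds of object.
kind_left = type(grouped["B"]).__name__         # "SeriesGroupBy"
kind_right = type(values_to_subtract).__name__  # "Series"

# Build a Series of month numbers on the original index, then map each
# month to its entry in values_to_subtract to get one offset per row.
months = pd.Series(np.asarray(test.index.month), index=test.index)
per_row = months.map(values_to_subtract)

# Series minus Series, aligned on the original DatetimeIndex.
result = test["B"] - per_row
```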

Also, you can't assign a column to a DataFrameGroupBy object, so even if the above were a Series:

grouped['new col'] = _
TypeError: 'DataFrameGroupBy' object does not support item assignment
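Putting the pieces together, here is a hedged sketch of what the question seems to be after: the new column is assigned to the DataFrame itself rather than to the groupby object. The random data and the name `offset` are assumptions for illustration, and the reading of the asker's intent (subtract each month's median of A, less the overall median of A, from B) is inferred from the question's code.

```python
import numpy as np
import pandas as pd

# Hypothetical data: daily values for one year in columns A and B.
idx = pd.date_range("2013-01-01", "2013-12-31", freq="D")
rng = np.random.RandomState(0)
test = pd.DataFrame({"A": rng.randn(len(idx)),
                     "B": rng.randn(len(idx))}, index=idx)

grouped = test.groupby(test.index.month)

# Per-row equivalent of values_to_subtract: the row's monthly median of A
# minus the overall median of A.
offset = grouped["A"].transform("median") - test["A"].median()

# Assign to the DataFrame; groupby objects do not support item assignment.
test["new col"] = test["B"] - offset
```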