pandas KeyError for NaN groups with integers but not with floats

Time: 2015-04-17 12:33:25

Tags: python python-2.7 pandas

(pandas version 0.16.0, numpy version 1.9.2)

I am trying to bin the values in a column and find the rows of the original data that hold the maximum value of each bin.

I found a way to do this. It works on float data, but not on int data:

>>> from pandas import *
>>> df1 = DataFrame({"id": range(3),"a": np.random.random(3)})
>>> df2 = DataFrame({"id": range(3),"a": [0,1,5]})
>>> bins = [0,1,2]
>>> grouped1 = df1.a.groupby(cut(df1.a,bins))
>>> grouped2 = df2.a.groupby(cut(df2.a,bins))
>>> idx1 = grouped1.transform(max) == df1.a
>>> df1[idx1]
           a  id
0  0.997843  0
>>> idx2 = grouped2.transform(max) == df2.a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/pandas/core/groupby.py", line 2418, in transform
    return self._transform_fast(cyfunc)
  File "/usr/lib/python2.7/site-packages/pandas/core/groupby.py", line 2459, in _transform_fast
    return self._set_result_index_ordered(Series(values))
  File "/usr/lib/python2.7/site-packages/pandas/core/groupby.py", line 493, in _set_result_index_ordered
    index = Index(np.concatenate([ indices[v] for v in self.grouper.result_index ]))
KeyError: '(1, 2]'

Note that both groupings produce a NaN row for the empty bin:

>>> grouped1.max()
a
(0, 1]    0.859684
(1, 2]         NaN
Name: a, dtype: float64
>>> grouped2.max()
a
(0, 1]     1
(1, 2]   NaN
Name: a, dtype: float64
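
As a side check on why the (1, 2] bin is empty for the int data, here is a minimal sketch of how cut assigns bins to the values 0, 1, 5. It assumes a recent pandas where cut keeps its default of right-closed intervals without include_lowest; the names values, labels, in_bin and counts are only for illustration.

```python
import pandas as pd

# Same int values and bin edges as df2 in the session above.
values = pd.Series([0, 1, 5])
bins = [0, 1, 2]

# By default the intervals are right-closed: (0, 1] and (1, 2].
# 0 sits on the open left edge and 5 lies beyond the last edge,
# so both get a missing label.
labels = pd.cut(values, bins)

# True only for the rows that landed in some bin.
in_bin = labels.notna().tolist()

# Per-bin counts, including bins that received no rows at all.
counts = labels.value_counts(sort=False)
print(in_bin)
print(counts)
```

Only the value 1 lands in a bin, which leaves (1, 2] without any rows.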

I cannot work out what the problem is. A KeyError whose key is a bin label does not make much sense to me.
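
One possible way to get the per-bin maximum rows without going through transform is sketched below. It is a workaround under assumptions, not a diagnosis of the error: it assumes the empty bin is what trips the grouping, so it drops unbinned rows and groups on plain string labels, which means empty bins never appear as groups. The names labels, in_bin, keys, max_rows and result are hypothetical, and the code is written against a recent pandas rather than 0.16.0.

```python
import pandas as pd

df = pd.DataFrame({"id": range(3), "a": [0, 1, 5]})
bins = [0, 1, 2]

# Label each row with its bin; values outside (0, 2] get a missing label.
labels = pd.cut(df["a"], bins)

# Keep only the rows that fall in a bin, and turn the categorical labels
# into plain strings so that unused bins are not carried along as groups.
in_bin = labels.notna()
keys = labels[in_bin].astype(str)

# idxmax gives, for each non-empty bin, the index label of its largest value.
max_rows = df.loc[in_bin, "a"].groupby(keys).idxmax()

# Look those index labels up in the original frame.
result = df.loc[max_rows.values]
print(result)
```

Note that idxmax returns a single row per bin, so ties are not all kept the way the boolean mask from transform(max) == df.a would keep them.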

0 answers:

No answers