Strange error when running built-in methods on a pandas Index object

Time: 2014-03-01 07:12:30

Tags: python numpy indexing pandas

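I am trying to extract the column name corresponding to the maximum value in a particular row of a DataFrame. Here is my overall strategy: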

best_MAP = df.ix['map',].max()  # Identify the value
ix = df.ix['map',] == best_MAP  # build a boolean vector to select the element corresponding to this value
best_param = df.columns[ix] # pull that element out

(I am showing all of this in case there is a better way to do it.)

As a result, best_param is a pandas Index object. For a reproducible example, the following Index reproduces the error I describe below:

best_param = pd.core.index.Index([.19, .20])

If the row has a unique maximum, this Index has just one element, and I can extract the float64 I need via:

best_param[0]

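But I have run into cases where several entries in the 'map' row share the same (maximum) value. I noticed that the Index object (which holds the corresponding column identifiers) has both .min() and .max() methods, so I figured these might be a good way to extract a single value. Each of the following throws a repeating error in IPython that eventually bails out with "recursion limit reached":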

best_param.min()
best_param.max()
np.min(best_param)
np.max(best_param)

Here is an excerpt of the crazy error I get:

C:\Users\user\Anaconda\lib\site-packages\pandas\core\common.pyc in as_escaped_unicode(thing, escape_chars)
   2026
   2027         try:
-> 2028             result = unicode(thing)  # we should try this first
   2029         except UnicodeDecodeError:
   2030             # either utf-8 or we replace errors

C:\Users\user\Anaconda\lib\site-packages\pandas\core\index.pyc in __unicode__(self)
    149         Invoked by unicode(df) in py2 only. Yields a Unicode String in both py2/py3.
    150         """
--> 151         prepr = com.pprint_thing(self, escape_chars=('\t', '\r', '\n'),quote_strings=True)
    152         return '%s(%s, dtype=%s)' % (type(self).__name__, prepr, self.dtype)
    153

C:\Users\user\Anaconda\lib\site-packages\pandas\core\common.pyc in pprint_thing(thing, _nest_lvl, escape_chars, default_escapes, quote_strings)
   2065         result = fmt % as_escaped_unicode(thing)
   2066     else:
-> 2067         result = as_escaped_unicode(thing)
   2068
   2069     return unicode(result)  # always unicode

C:\Users\user\Anaconda\lib\site-packages\pandas\core\common.pyc in as_escaped_unicode(thing, escape_chars)
   2026
   2027         try:
-> 2028             result = unicode(thing)  # we should try this first
   2029         except UnicodeDecodeError:
   2030             # either utf-8 or we replace errors

C:\Users\user\Anaconda\lib\site-packages\pandas\core\index.pyc in __unicode__(self)
    149         Invoked by unicode(df) in py2 only. Yields a Unicode String in both py2/py3.
    150         """
--> 151         prepr = com.pprint_thing(self, escape_chars=('\t', '\r', '\n'),quote_strings=True)
    152         return '%s(%s, dtype=%s)' % (type(self).__name__, prepr, self.dtype)
    153

C:\Users\user\Anaconda\lib\site-packages\pandas\core\common.pyc in pprint_thing(thing, _nest_lvl, escape_chars, default_escapes, quote_strings)
   2065         result = fmt % as_escaped_unicode(thing)
   2066     else:
-> 2067         result = as_escaped_unicode(thing)
   2068
   2069     return unicode(result)  # always unicode

C:\Users\user\Anaconda\lib\site-packages\pandas\core\common.pyc in as_escaped_unicode(thing, escape_chars)
   2026
   2027         try:
-> 2028             result = unicode(thing)  # we should try this first
   2029         except UnicodeDecodeError:
   2030             # either utf-8 or we replace errors

C:\Users\user\Anaconda\lib\site-packages\pandas\core\index.pyc in __unicode__(self)
    149         Invoked by unicode(df) in py2 only. Yields a Unicode String in both py2/py3.
    150         """
--> 151         prepr = com.pprint_thing(self, escape_chars=('\t', '\r', '\n'),quote_strings=True)
    152         return '%s(%s, dtype=%s)' % (type(self).__name__, prepr, self.dtype)
    153

C:\Users\user\Anaconda\lib\site-packages\pandas\core\common.pyc in pprint_thing(thing, _nest_lvl, escape_chars, default_escapes, quote_strings)
   2051             hasattr(thing, 'next'):
   2052         return unicode(thing)
-> 2053     elif (isinstance(thing, dict) and
   2054           _nest_lvl < get_option("display.pprint_nest_depth")):
   2055         result = _pprint_dict(thing, _nest_lvl,quote_strings=True)

RuntimeError: maximum recursion depth exceeded in __instancecheck__

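What on earth is going on here? My guess is that this is a bug in pandas, but historically, whenever I blame my tools, it usually turns out that the problem is me, not the tool. Could it have something to do with using non-integer numbers as column IDs?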

This is with the Anaconda distribution, in an IPython console, with the following software versions:

  • Pandas 0.12.0
  • Numpy 1.7.1
  • Ipython 2.7.5
  • Anaconda 1.8.0

1 Answer:

Answer 0 (score: 3)

A workaround is to take the max/min from the underlying NumPy values:

In [11]: ind = pd.Index([.19, .20])

In [21]: ind.values.max()
Out[21]: 0.2
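
Applied to the situation in the question, the workaround might look like the sketch below. The DataFrame contents are made up for illustration, and `.loc` is used in place of the question's `.ix` on the assumption that a recent pandas is installed (where `.ix` is no longer available).

```python
import pandas as pd

# Hypothetical data: the 'map' row has a tied maximum in columns 0.19 and 0.20.
df = pd.DataFrame(
    [[0.5, 0.7, 0.7], [1.0, 2.0, 3.0]],
    index=["map", "other"],
    columns=[0.18, 0.19, 0.20],
)

row = df.loc["map"]                # assumption: .loc stands in for the question's .ix
mask = (row == row.max()).values   # boolean NumPy array marking the tied maxima
best_params = df.columns[mask]     # Index holding the tied column labels

# Reduce on the underlying NumPy array rather than on the Index itself.
smallest = best_params.values.min()
largest = best_params.values.max()
```

Here `smallest` and `largest` are plain NumPy floats, so nothing in the reduction goes through the Index formatting code that appears in the traceback.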

This is a bug (present in 0.13.1), but fixed in master ...
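
On the question's aside about whether there is a better way: if any one of the tied columns is acceptable, `Series.idxmax` may be enough, since it returns the label of the first occurrence of the maximum. This is a minimal sketch with made-up data, again assuming a recent pandas where `.loc` replaces `.ix`; it is not part of the original answer.

```python
import pandas as pd

# Hypothetical data: the 'map' row has a tied maximum in columns 0.19 and 0.20.
df = pd.DataFrame(
    [[0.5, 0.7, 0.7]],
    index=["map"],
    columns=[0.18, 0.19, 0.20],
)

# idxmax returns the column label of the first maximum in the row,
# so no intermediate Index object has to be reduced.
best_param = df.loc["map"].idxmax()
```

With ties this picks the leftmost column, which matches taking the minimum label only when the columns are sorted in ascending order.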