Strange error in pandas with range indexing when length >= 1,000,000

Time: 2015-11-19 20:56:57

Tags: python python-3.x pandas

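Pandas raises a ValueError when range(x) with x > 1 is used to assign multiple values to a Series (or DataFrame). The error occurs only when the length of the Series is one million or more.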

import pandas as pd

for x in [5, 999999, 1000000]:
    s = pd.Series(index=range(x))
    print('series length =', len(s))
    # assigning a value with range(1) always works
    s.loc[range(1)] = 42
    # reading values with range(x), x > 1, always works
    _ = s.loc[range(2)]
    # assigning values with range(x), x > 1, fails only when len(s) >= 1 million
    s.loc[range(2)] = 42

Output:

series length = 5
series length = 999999
series length = 1000000
Traceback (most recent call last):
  File "<stdin>", line 9, in <module>
  File "/home/nekobon/.env_exp/lib/python3.4/site-packages/pandas/core/indexing.py", line 114, in __setitem__
    indexer = self._get_setitem_indexer(key)
  File "/home/nekobon/.env_exp/lib/python3.4/site-packages/pandas/core/indexing.py", line 109, in _get_setitem_indexer
    return self._convert_to_indexer(key, is_setter=True)
  File "/home/nekobon/.env_exp/lib/python3.4/site-packages/pandas/core/indexing.py", line 1042, in _convert_to_indexer
    return labels.get_loc(obj)
  File "/home/nekobon/.env_exp/lib/python3.4/site-packages/pandas/core/index.py", line 1692, in get_loc
    return self._engine.get_loc(_values_from_object(key))
  File "pandas/index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas/index.c:3979)
  File "pandas/index.pyx", line 145, in pandas.index.IndexEngine.get_loc (pandas/index.c:3680)
  File "pandas/index.pyx", line 464, in pandas.index._bin_search (pandas/index.c:9124)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I am using Python 3.4 and pandas 0.17.0. This behavior does not seem to have been reported yet. What does pandas do differently for a Series of length >= 1,000,000?
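A possible workaround, sketched under an assumption: the traceback shows the whole key being handed to get_loc as if it were a single label, so passing an explicit list of labels instead of a lazy range object may avoid that code path. This has not been verified against pandas 0.17.0. The helper name assign_first_k is hypothetical, and the Series is filled with 0.0 here only to keep the result easy to inspect.

```python
import pandas as pd


def assign_first_k(series, k, value):
    """Assign `value` to the labels 0..k-1 using an explicit list of labels.

    Hypothetical helper for illustration; not part of pandas.
    """
    labels = list(range(k))  # materialize the range into a concrete list
    series.loc[labels] = value
    return series


# Same length as the failing case in the question
s = pd.Series(0.0, index=range(1000000))
assign_first_k(s, 2, 42.0)
print(s.iloc[:3].tolist())
```

If the list-based assignment succeeds where the range-based one fails, that would suggest the problem lies in how a range key is interpreted during assignment rather than in the data itself.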

0 answers:

No answers yet