Pandas - Selecting rows in a DataFrame using string equality

Time: 2017-03-06 05:04:11

Tags: python pandas

I'm trying to get all rows from the DataFrame contributors whose occupation is RETIRED, like this:

mask = (contributors.contbr_occupation.str == 'RETIRED')
print(contributors[mask])

However, I get the following stack trace:

Traceback (most recent call last):
  File "C:\Users\Me\Anaconda3\envs\pandas\lib\site-packages\pandas\indexes\base.py", line 2134, in get_loc
    return self._engine.get_loc(key)
  File "pandas\index.pyx", line 132, in pandas.index.IndexEngine.get_loc (pandas\index.c:4433)
  File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:4279)
  File "pandas\src\hashtable_class_helper.pxi", line 732, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13742)
  File "pandas\src\hashtable_class_helper.pxi", line 740, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13696)
KeyError: False

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "census_attack.py", line 27, in <module>
    print(contributors[mask])
  File "C:\Users\Me\Anaconda3\envs\pandas\lib\site-packages\pandas\core\frame.py", line 2059, in __getitem__
    return self._getitem_column(key)
  File "C:\Users\Me\Anaconda3\envs\pandas\lib\site-packages\pandas\core\frame.py", line 2066, in _getitem_column
    return self._get_item_cache(key)
  File "C:\Users\Me\Anaconda3\envs\pandas\lib\site-packages\pandas\core\generic.py", line 1386, in _get_item_cache
    values = self._data.get(item)
  File "C:\Users\Me\Anaconda3\envs\pandas\lib\site-packages\pandas\core\internals.py", line 3543, in get
    loc = self.items.get_loc(item)
  File "C:\Users\Me\Anaconda3\envs\pandas\lib\site-packages\pandas\indexes\base.py", line 2136, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\index.pyx", line 132, in pandas.index.IndexEngine.get_loc (pandas\index.c:4433)
  File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:4279)
  File "pandas\src\hashtable_class_helper.pxi", line 732, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13742)
  File "pandas\src\hashtable_class_helper.pxi", line 740, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13696)
KeyError: False

How should I do this?

2 answers:

Answer 0 (score: 2)

You can use query:

contributors.query('contbr_occupation == "RETIRED"')
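If the value to match lives in a Python variable, query can reference it with the @ prefix instead of embedding a quoted literal. A minimal sketch, assuming a small stand-in DataFrame (the contbr_nm column and the sample rows are made up for illustration and are not from the question):

```python
import pandas as pd

# Hypothetical stand-in for the asker's data; only the column name
# contbr_occupation comes from the question.
contributors = pd.DataFrame({
    "contbr_nm": ["A", "B", "C"],
    "contbr_occupation": ["RETIRED", "ENGINEER", "RETIRED"],
})

occupation = "RETIRED"

# @occupation refers to the local Python variable of that name.
retired = contributors.query("contbr_occupation == @occupation")
print(retired)
```

This keeps the quoting out of the query string, which helps when the value itself may contain quote characters.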

Answer 1 (score: 1)

If you are only doing a plain equality check (not a contains or anything similar), don't use the str accessor - you don't need it.

mask = (contributors.contbr_occupation == 'RETIRED')

Example:

>>> df

  strings
0     abc
1     def
2     ghi
3     abc

>>> df[df.strings == 'abc']

  strings
0     abc
3     abc
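As for why the original attempt fails: as far as I can tell, .str is an accessor object rather than a Series, so comparing it to a string with == falls back to ordinary Python object comparison and yields a single False instead of an element-wise mask. Indexing the DataFrame with that scalar is then treated as a column lookup, which would be consistent with the KeyError: False in the traceback. A small sketch using the same example frame as above:

```python
import pandas as pd

df = pd.DataFrame({"strings": ["abc", "def", "ghi", "abc"]})

# Comparing the accessor object itself: a single Python bool, not a mask.
wrong = (df.strings.str == "abc")
print(type(wrong), wrong)

# Comparing the Series: an element-wise boolean mask, one entry per row.
right = (df.strings == "abc")
print(right.tolist())
print(df[right])
```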

If you do need some other kind of condition, such as a substring match, call a string method on the str accessor, for example str.contains:

mask = (contributors.contbr_occupation.str.contains('RETIRED'))
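Two things to watch with str.contains: by default the pattern is treated as a regular expression, and missing values in the column produce missing entries in the result, which cannot be used directly as a boolean mask. A hedged sketch, assuming the column may contain missing occupations (the sample rows are invented for illustration):

```python
import pandas as pd

# Hypothetical sample data, including one missing occupation.
contributors = pd.DataFrame({
    "contbr_occupation": ["RETIRED", "RETIRED TEACHER", None, "ENGINEER"],
})

# regex=False matches the literal substring; na=False treats missing
# values as non-matches so the result is a clean boolean mask.
mask = contributors.contbr_occupation.str.contains(
    "RETIRED", regex=False, na=False
)
print(contributors[mask])
```

Note that this also selects "RETIRED TEACHER", so it is not a drop-in replacement for the equality check when only exact matches are wanted.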