Create a column of booleans based on two conditions in a pandas DataFrame

Time: 2019-05-31 14:47:06

Tags: python pandas

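I am trying to create a column of booleans based on whether a column contains the word "Hazard" and does not contain the word "Roof" (so that I get all non-roof hazards).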

I am using the following code, but I get an error:

labels['h_count2'] = labels[(labels['Description'].str.contains('Hazard')) & (labels['Description'].str.contains('Roof'))]

Here is the traceback:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
C:\ProgramData\Anaconda3\envs\tensorflowenvironment\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'h_count2'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
C:\ProgramData\Anaconda3\envs\tensorflowenvironment\lib\site-packages\pandas\core\internals\managers.py in set(self, item, value)
   1052         try:
-> 1053             loc = self.items.get_loc(item)
   1054         except KeyError:

C:\ProgramData\Anaconda3\envs\tensorflowenvironment\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2658             except KeyError:
-> 2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2660         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'h_count2'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-46-51360ea6f27f> in <module>
      1 labels['h_count'] = labels['Description'].str.contains('Roof Hazard')
      2 labels['b_count'] = labels['Description'].str.contains('Brush')
----> 3 labels['h_count2'] = labels[(labels['Description'].str.contains('Hazard')) & (labels['Description'].str.contains('Roof'))]
      4 
      5 def target(row):

C:\ProgramData\Anaconda3\envs\tensorflowenvironment\lib\site-packages\pandas\core\frame.py in __setitem__(self, key, value)
   3368         else:
   3369             # set column
-> 3370             self._set_item(key, value)
   3371 
   3372     def _setitem_slice(self, key, value):

C:\ProgramData\Anaconda3\envs\tensorflowenvironment\lib\site-packages\pandas\core\frame.py in _set_item(self, key, value)
   3444         self._ensure_valid_index(value)
   3445         value = self._sanitize_column(key, value)
-> 3446         NDFrame._set_item(self, key, value)
   3447 
   3448         # check if we are modifying a copy

C:\ProgramData\Anaconda3\envs\tensorflowenvironment\lib\site-packages\pandas\core\generic.py in _set_item(self, key, value)
   3170 
   3171     def _set_item(self, key, value):
-> 3172         self._data.set(key, value)
   3173         self._clear_item_cache()
   3174 

C:\ProgramData\Anaconda3\envs\tensorflowenvironment\lib\site-packages\pandas\core\internals\managers.py in set(self, item, value)
   1054         except KeyError:
   1055             # This item wasn't present, just insert at end
-> 1056             self.insert(len(self.items), item, value)
   1057             return
   1058 

C:\ProgramData\Anaconda3\envs\tensorflowenvironment\lib\site-packages\pandas\core\internals\managers.py in insert(self, loc, item, value, allow_duplicates)
   1156 
   1157         block = make_block(values=value, ndim=self.ndim,
-> 1158                            placement=slice(loc, loc + 1))
   1159 
   1160         for blkno, count in _fast_count_smallints(self._blknos[loc:]):

C:\ProgramData\Anaconda3\envs\tensorflowenvironment\lib\site-packages\pandas\core\internals\blocks.py in make_block(values, placement, klass, ndim, dtype, fastpath)
   3093         values = DatetimeArray._simple_new(values, dtype=dtype)
   3094 
-> 3095     return klass(values, ndim=ndim, placement=placement)
   3096 
   3097 

C:\ProgramData\Anaconda3\envs\tensorflowenvironment\lib\site-packages\pandas\core\internals\blocks.py in __init__(self, values, placement, ndim)
   2629 
   2630         super(ObjectBlock, self).__init__(values, ndim=ndim,
-> 2631                                           placement=placement)
   2632 
   2633     @property

C:\ProgramData\Anaconda3\envs\tensorflowenvironment\lib\site-packages\pandas\core\internals\blocks.py in __init__(self, values, placement, ndim)
     85             raise ValueError(
     86                 'Wrong number of items passed {val}, placement implies '
---> 87                 '{mgr}'.format(val=len(self.values), mgr=len(self.mgr_locs)))
     88 
     89     def _check_ndim(self, values, ndim):

ValueError: Wrong number of items passed 5, placement implies 1

What am I doing wrong?

2 answers:

Answer 0 (score: 1)

labels = pd.DataFrame({'Description': ['Hazard Roof test', 'test', 'Hazard is not', 'test2']})

labels['h_count2'] = (labels['Description'].str.upper().str.contains('HAZARD')) & ~(labels['Description'].str.upper().str.contains('ROOF'))

    Description        h_count2
0   Hazard Roof test    False
1   test                False
2   Hazard is not       True
3   test2               False
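
The `.str.upper()` calls above make the match case-insensitive. As a possible alternative, `str.contains` accepts a `case` parameter that does the same job without rebuilding the strings. The sketch below reuses the sample frame from this answer; the `desc` variable and the `na=False` argument (which treats missing descriptions as non-matches) are additions for illustration, not part of the original answer.

```python
import pandas as pd

# Same sample data as in the answer above.
labels = pd.DataFrame({'Description': ['Hazard Roof test', 'test', 'Hazard is not', 'test2']})

desc = labels['Description']  # hypothetical helper name, used only to shorten the line below

# case=False ignores letter case; na=False makes missing values count as "no match".
has_hazard = desc.str.contains('hazard', case=False, na=False)
has_roof = desc.str.contains('roof', case=False, na=False)

# True only where the text mentions a hazard and does not mention a roof.
labels['h_count2'] = has_hazard & ~has_roof

print(labels)
```

On this sample data it should give the same `h_count2` column as the `.str.upper()` version.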

Answer 1 (score: 1)

labels:

   A  Description
0  1        Roof 
1  2       Hazard
2  3  Roof Hazard

labels['h_count2'] = labels.Description.str.contains('Hazard') & ~labels.Description.str.contains('Roof')

Result

   A  Description  h_count2
0  1        Roof      False
1  2       Hazard      True
2  3  Roof Hazard     False
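
As for why the line in the question fails, a likely reading is that it has two separate problems. First, `labels[mask]` uses the boolean mask to filter rows, so it returns a DataFrame (every column, matching rows only) rather than a boolean Series, and pandas cannot store that multi-column frame in the single column `h_count2`. Second, the condition for "Roof" is never negated with `~`. The sketch below, which reuses the sample data from this answer and introduces the names `mask` and `filtered` purely for illustration, shows the difference between the two objects:

```python
import pandas as pd

# Same sample data as in the answer above.
labels = pd.DataFrame({'A': [1, 2, 3],
                       'Description': ['Roof', 'Hazard', 'Roof Hazard']})

# The mask itself: one boolean per row, with the Roof condition negated by ~.
mask = labels['Description'].str.contains('Hazard') & ~labels['Description'].str.contains('Roof')

# Indexing with the mask filters rows and keeps all columns.
filtered = labels[mask]

print(type(mask).__name__)      # Series
print(type(filtered).__name__)  # DataFrame
print(filtered.shape)           # (1, 2)

# Assign the mask, not the filtered frame, to get the boolean column.
labels['h_count2'] = mask
```

Assigning `mask` directly is what both answers do; `labels[mask]` is the form to use when the goal is the matching rows themselves.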