Best way to manually assign categories

Time: 2019-11-25 05:20:19

Tags: python pandas dataframe categories

Suppose I have a DataFrame with the following values:


Food
----
Turkey
Tomato
Rice
Chicken
Lettuce

I want to add a category column so that it looks like this:

Food        Category
----        ----
Turkey      Meat
Tomato      Vegetable
Rice        Grain
Chicken     Meat
Lettuce     Vegetable

In reality, though, I have ~100 distinct values that I want to sort into ~10 groups, and I want to assign them by hand.

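I have been trying to do this directly in a script rather than linking to a database or spreadsheet. What I have tried so far, along with the resulting error, is shown below, but I would also like to know whether there is a better way to achieve this.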

Current code:

df.loc[df.Food.any(
    [
    'Turkey'
    ,'Chicken'
]
)
         , 'Category'] = 'Meat' 

df.loc[df.Food.any(
    [
    'Tomato'
    ,'Lettuce'
]
)
         , 'Category'] = 'Vegetable' 

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-49-41349bcd38a0> in <module>
     41     ]
     42 )
---> 43          , 'Category'] = 'Meat' 

~\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\generic.py in logical_func(self, axis, bool_only, skipna, level, **kwargs)
  11721             skipna=skipna,
  11722             numeric_only=bool_only,
> 11723             filter_type="bool",
  11724         )
  11725 

~\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\series.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
   4061 
   4062         if axis is not None:
-> 4063             self._get_axis_number(axis)
   4064 
   4065         if isinstance(delegate, Categorical):

~\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\generic.py in _get_axis_number(cls, axis)
    400     @classmethod
    401     def _get_axis_number(cls, axis):
--> 402         axis = cls._AXIS_ALIASES.get(axis, axis)
    403         if is_integer(axis):
    404             if axis in cls._AXIS_NAMES:

TypeError: unhashable type: 'list'
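
A note on reading this traceback: `Series.any` is a reduction whose first positional parameter is `axis`, so the list of foods is treated as an axis, and the failure comes from pandas looking that list up in a dictionary of axis aliases. A membership test is what `Series.isin` provides. The following is a minimal sketch of the same approach with that substitution, assuming the sample data from the question (the `Grain` line is added here only to complete the example):

```python
import pandas as pd

df = pd.DataFrame({"Food": ["Turkey", "Tomato", "Rice", "Chicken", "Lettuce"]})

# isin returns a boolean mask: True where Food is one of the listed values
df.loc[df["Food"].isin(["Turkey", "Chicken"]), "Category"] = "Meat"
df.loc[df["Food"].isin(["Tomato", "Lettuce"]), "Category"] = "Vegetable"
df.loc[df["Food"].isin(["Rice"]), "Category"] = "Grain"

print(df)
```

This works, but it needs one statement per category, which is part of why a single lookup table tends to scale better for ~10 groups.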

1 Answer:

Answer 0 (score: 1)

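I suggest storing the mapping in a dictionary, with each category as a key and the list of items belonging to that category as its value, for example: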

mapping = {'Meat': ['Turkey','Chicken'], 'Vegetable': ['Tomato','Lettuce'], 'Grain': ['Rice']}
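
With around 100 hand-typed values, it is easy to list the same food under two categories by accident. When the dictionary is flattened into a food-to-category lookup, the later entry would silently win. One possible safeguard is sketched below; `invert_mapping` and `lookup` are hypothetical names introduced for illustration, not part of pandas.

```python
mapping = {
    "Meat": ["Turkey", "Chicken"],
    "Vegetable": ["Tomato", "Lettuce"],
    "Grain": ["Rice"],
}


def invert_mapping(mapping):
    """Turn {category: [items]} into {item: category}, rejecting duplicates."""
    lookup = {}
    for category, items in mapping.items():
        for item in items:
            if item in lookup:
                # The same item appears under two categories: fail loudly
                raise ValueError(
                    f"{item!r} is listed under both "
                    f"{lookup[item]!r} and {category!r}"
                )
            lookup[item] = category
    return lookup


lookup = invert_mapping(mapping)
print(lookup["Chicken"])
```

The resulting `lookup` dictionary has one entry per food and can be used anywhere a flat food-to-category dictionary is expected.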

You can then use pd.Series.map, after inverting the dictionary so that each food points to its category:

df['Category'] = df['Food'].map({i: k for k, v in mapping.items() for i in v})

Which yields:

      Food   Category
0   Turkey       Meat
1   Tomato  Vegetable
2     Rice      Grain
3  Chicken       Meat
4  Lettuce  Vegetable
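
One thing to watch for with this approach: `Series.map` with a dictionary returns NaN for any value that has no key, so a food missing from the hand-written mapping ends up with an empty category rather than raising an error. Here is a minimal, self-contained sketch of how that could be checked. The extra `Salmon` row and the `Uncategorized` placeholder are assumptions added for illustration only.

```python
import pandas as pd

# "Salmon" is a hypothetical extra row that is deliberately absent from the mapping
df = pd.DataFrame(
    {"Food": ["Turkey", "Tomato", "Rice", "Chicken", "Lettuce", "Salmon"]}
)

mapping = {
    "Meat": ["Turkey", "Chicken"],
    "Vegetable": ["Tomato", "Lettuce"],
    "Grain": ["Rice"],
}

# Flatten {category: [foods]} into {food: category}
lookup = {food: category for category, foods in mapping.items() for food in foods}

df["Category"] = df["Food"].map(lookup)

# Foods with no entry in the mapping come back as NaN
unmapped = df.loc[df["Category"].isna(), "Food"].unique().tolist()
print(unmapped)

# Optional: replace the gaps with an explicit placeholder
df["Category"] = df["Category"].fillna("Uncategorized")
```

Listing the unmapped foods before filling them in makes it easier to see which entries still need to be added to the dictionary.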