Suppose I have a DataFrame with the following values:
Food
----
Turkey
Tomato
Rice
Chicken
Lettuce
I want to add a Category column so that it looks like this:
Food Category
---- ----
Turkey Meat
Tomato Vegetable
Rice Grain
Chicken Meat
Lettuce Vegetable
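In reality, though, I have ~100 distinct values that I want to sort into ~10 groups, and I want to assign the groups manually.
I have been trying to do this directly in a script rather than linking to a database or spreadsheet. What I have tried so far and the resulting error are shown below, but I would also like to know whether there is a better way to achieve this.
Current code: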
df.loc[df.Food.any(
[
'Turkey'
,'Chicken'
]
)
, 'Category'] = 'Meat'
df.loc[df.Food.any(
[
'Tomato'
,'Lettuce'
]
)
, 'Category'] = 'Vegetable'
Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-49-41349bcd38a0> in <module>
41 ]
42 )
---> 43 , 'Category'] = 'Meat'
~\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\generic.py in logical_func(self, axis, bool_only, skipna, level, **kwargs)
11721 skipna=skipna,
11722 numeric_only=bool_only,
> 11723 filter_type="bool",
11724 )
11725
~\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\series.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
4061
4062 if axis is not None:
-> 4063 self._get_axis_number(axis)
4064
4065 if isinstance(delegate, Categorical):
~\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\generic.py in _get_axis_number(cls, axis)
400 @classmethod
401 def _get_axis_number(cls, axis):
--> 402 axis = cls._AXIS_ALIASES.get(axis, axis)
403 if is_integer(axis):
404 if axis in cls._AXIS_NAMES:
TypeError: unhashable type: 'list'
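Answer 0 (score: 1)
I suggest storing the mapping in a dictionary, with each category as a key and the list of foods belonging to that category as its value, for example: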
mapping = {'Meat': ['Turkey','Chicken'], 'Vegetable': ['Tomato','Lettuce'], 'Grain': ['Rice']}
You can then invert that dictionary into a food-to-category lookup and pass it to pd.Series.map:
df['Category'] = df['Food'].map({i: k for k, v in mapping.items() for i in v})
This yields:
Food Category
0 Turkey Meat
1 Tomato Vegetable
2 Rice Grain
3 Chicken Meat
4 Lettuce Vegetable
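For reference, here is a minimal self-contained sketch of the approach above, assuming the column is named Food as in the question. The extra row 'Salmon' and the names food_to_category, unmapped, and is_meat are hypothetical, added only to show what happens to a value that is missing from the mapping and how Series.isin relates to the original attempt.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical unmapped value ('Salmon')
df = pd.DataFrame(
    {'Food': ['Turkey', 'Tomato', 'Rice', 'Chicken', 'Lettuce', 'Salmon']}
)

# Category -> list of foods, as in the answer
mapping = {
    'Meat': ['Turkey', 'Chicken'],
    'Vegetable': ['Tomato', 'Lettuce'],
    'Grain': ['Rice'],
}

# Invert to food -> category so each food can be looked up directly
food_to_category = {
    food: category
    for category, foods in mapping.items()
    for food in foods
}

# Series.map leaves NaN for any food that is not a key in the lookup
df['Category'] = df['Food'].map(food_to_category)

# List the foods that still need a category
unmapped = df.loc[df['Category'].isna(), 'Food'].unique().tolist()

# Boolean mask for one category; Series.isin accepts a list of values,
# whereas Series.any treats its first positional argument as an axis
is_meat = df['Food'].isin(mapping['Meat'])

print(df)
print('Unmapped foods:', unmapped)
```

With ~100 values, checking the unmapped list after each run is a cheap way to catch typos or foods that were never assigned. The isin mask is shown because the call in the question passes a list to Series.any, which is consistent with the "unhashable type: 'list'" error in the traceback.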