Question

I have a large df with the following structure:

import pandas as pd

data = pd.DataFrame({'a': ['red', 'blue', 'green', 'cat', 'dog'],
                     'b': [1, 1, 2, 3, 3]})

I also have a dictionary that assigns a category to each value, like this:

category_dict = {'red': ['color'],
                 'blue': ['color'],
                 'green': ['color'],
                 'cat': ['animal'],
                 'dog': ['animal']}

I want to use the dictionary to create another column holding the category, like this:

data_update = pd.DataFrame({'a': ['red', 'blue', 'green', 'cat', 'dog'],
                            'b': [1, 1, 2, 3, 3],
                            'c': ['color', 'color', 'color', 'animal', 'animal']})

I thought data['c'] = category_dict[data['a']] would give me this output, but instead I get the error 'Series' objects are mutable, thus they cannot be hashed

Answer 1

Use this:

data['c'] = [category_dict[x][0] for x in list(data['a'])]
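
For context, a dict lookup hashes its key, and a Series cannot be hashed, which is why indexing the dict with the whole column fails. The comprehension avoids this by looking up one value at a time. Below is a minimal self-contained sketch of both behaviours. It assumes every value in column a is a key in category_dict (a missing key would raise KeyError), and the lookup_failed flag is only there for illustration.

```python
import pandas as pd

data = pd.DataFrame({'a': ['red', 'blue', 'green', 'cat', 'dog'],
                     'b': [1, 1, 2, 3, 3]})
category_dict = {'red': ['color'], 'blue': ['color'], 'green': ['color'],
                 'cat': ['animal'], 'dog': ['animal']}

# Indexing the dict with the whole Series tries to hash the Series,
# which raises TypeError (the error from the question).
try:
    category_dict[data['a']]
    lookup_failed = False
except TypeError:
    lookup_failed = True

# Look up each value individually; [0] unwraps the one-element list.
data['c'] = [category_dict[x][0] for x in data['a']]
```

Iterating over data['a'] directly gives the same values as list(data['a']), so the extra list() call is optional.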

Answer 2

Try:

flatten_dict = {k:v[0] for k,v in category_dict.items()}

data['c'] = data['a'].map(flatten_dict)
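
One thing worth knowing about this approach: as far as I know, Series.map with a dict leaves NaN for any value that has no key in the dict, rather than raising an error. Here is a small sketch of that. The 'fish' row and the 'unknown' placeholder are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# 'fish' is a hypothetical value with no entry in the dictionary.
data = pd.DataFrame({'a': ['red', 'cat', 'fish'],
                     'b': [1, 3, 4]})
category_dict = {'red': ['color'], 'cat': ['animal']}

# Unwrap the one-element lists so each key maps to a plain string.
flatten_dict = {k: v[0] for k, v in category_dict.items()}

# Values without a matching key come back as NaN.
data['c'] = data['a'].map(flatten_dict)

# Optionally replace the missing categories with a placeholder label.
data['c_filled'] = data['c'].fillna('unknown')
```

If a missing key should be treated as a bug instead, checking data['c'].isna().any() after the map is one way to catch it.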

Pandas: add a new column with categories based on another column and a dictionary
