Question

In the following code:

import pandas as pd
import numpy as np
import random

sz = 50
df = pd.DataFrame({'Group': pd.Series(random.choice(['A', 'B']) for _ in range(sz)),
              'Key': pd.Series(np.random.randint(2, high=5, size=sz))})

dictforA = {2: 0.1, 3: 0.8, 4: 0.2}
dictforB = {3: 0.9}

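...I want to assign a new column named Value, where each row's value is looked up from the dictionary belonging to its group. Keys missing from the dictionary should give NaN.

The code: df.assign(Value=df.groupby('Group').apply(lambda x: np.where(x.index == 'A', dictforA[x.Key], dictforB[x.Key])))

gives

TypeError: 'Series' objects are mutable, thus they cannot be hashed

Where am I going wrong?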

Answer 1

You can create a mapper from each Group to its dictionary and use pd.Series.map:

mapper = {'A': dictforA, 
          'B': dictforB}

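# group_keys=False keeps the original row index so the result aligns with df
# (recent pandas versions otherwise prepend the group key to the index)
df['Value'] = df.groupby('Group', group_keys=False).Key.apply(lambda s: s.map(mapper[s.name]))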

>>> print(df.head(10))

  Group  Key  Value
0     B    3    0.9
1     A    2    0.1
2     B    3    0.9
3     A    3    0.8
4     A    3    0.8
5     A    2    0.1
6     B    2    NaN
7     B    2    NaN
8     A    4    0.2
9     A    2    0.1
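
As for why the original attempt fails: dictforA[x.Key] asks the dict to look up an entire Series as a single key, and a dict key has to be hashable, which a Series is not. Separately, inside apply the group label is available as x.name, while x.index holds the row labels, so comparing x.index to 'A' would not select the group anyway. The sketch below reproduces the error in isolation and then checks the mapper approach on a small fixed frame. The four-row data and the names error_message and mapped_b are made up for illustration, and the exact wording of the TypeError may differ between pandas versions.

```python
import pandas as pd

dictforA = {2: 0.1, 3: 0.8, 4: 0.2}
dictforB = {3: 0.9}

# Small fixed frame (hypothetical data, chosen so the result is reproducible)
df = pd.DataFrame({'Group': ['A', 'B', 'A', 'B'],
                   'Key': [2, 3, 4, 2]})

# 1) A dict lookup hashes its key; a whole Series cannot be hashed.
try:
    dictforA[df['Key']]
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# 2) Series.map looks up each element separately; missing keys become NaN.
mapped_b = df.loc[df['Group'] == 'B', 'Key'].map(dictforB)
print(mapped_b)

# 3) The mapper approach: pick the dictionary by group name, then map.
mapper = {'A': dictforA, 'B': dictforB}
df['Value'] = (df.groupby('Group', group_keys=False)['Key']
                 .apply(lambda s: s.map(mapper[s.name])))
print(df)
```

Here the row with Group B and Key 2 ends up as NaN because dictforB has no entry for 2, which matches the behaviour asked for in the question.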
