Question

I want to create a new column by looking up values from several columns in a dictionary. Here is a simple example:

df:

Col1     Col2    Col3
Dog      Bird    Cat
Blue     Red     Black
Bad      Sad     Glad

my_dict = {'Bird': 'AAA','Blue':'BBB','Glad':'ZZZ'}

Desired df:

Col1     Col2    Col3      NewCol
Dog      Bird    Cat       AAA
Blue     Red     Black     BBB
Bad      Sad     Glad      ZZZ

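I have played around with the map function (df.NewCol = df.Col.map(my_dict))... but it only lets me use a single column to look up keys in the dictionary. To create NewCol, I need Col1, Col2 and Col3 to all be searched against my dictionary.

Any ideas? Thanks!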
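
A minimal reproducible version of the setup above, assuming the frame is built from plain string columns (the construction is an assumption; only the printed table is given). It shows how mapping a single column, Col2 here purely for illustration, leaves the other rows unmatched:

```python
import pandas as pd

# Assumed construction of the example frame shown in the question.
df = pd.DataFrame({
    "Col1": ["Dog", "Blue", "Bad"],
    "Col2": ["Bird", "Red", "Sad"],
    "Col3": ["Cat", "Black", "Glad"],
})
my_dict = {"Bird": "AAA", "Blue": "BBB", "Glad": "ZZZ"}

# Mapping one column only searches that column for dictionary keys,
# so rows whose key lives in Col1 or Col3 come back as missing.
single = df["Col2"].map(my_dict)
```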

Answer 1

Option 1: use map with ffill. This does not assume that each row has exactly one valid entry.

# this takes the last valid entry in each row
# change to .bfill(axis=1).iloc[:, 0] to get the first
df['NewCol'] = df.apply(lambda x: x.map(my_dict)).ffill(axis=1).iloc[:, -1]
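
To make the first/last distinction concrete, here is a small sketch on a hypothetical variation of the data in which row 0 contains two keys and row 2 contains none. The modified frame and the names `mapped`, `last_match` and `first_match` are illustrative assumptions, not part of the question:

```python
import pandas as pd

# Hypothetical variation of the example: row 0 has two keys, row 2 has none.
df = pd.DataFrame({
    "Col1": ["Blue", "Blue", "Bad"],
    "Col2": ["Bird", "Red", "Sad"],
    "Col3": ["Cat", "Black", "Mad"],
})
my_dict = {"Bird": "AAA", "Blue": "BBB", "Glad": "ZZZ"}

# Map every column through the dictionary; values that are not keys become NaN.
mapped = df.apply(lambda col: col.map(my_dict))

# Forward fill across columns, then read the last column: last match per row.
last_match = mapped.ffill(axis=1).iloc[:, -1]

# Backward fill across columns, then read the first column: first match per row.
first_match = mapped.bfill(axis=1).iloc[:, 0]
```

Rows without any key stay missing in both results, which is why this option does not need the one-match-per-row assumption.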

Option 2: stack, then map, and assign. This approach assumes that each row has only one valid entry.

df['NewCol'] = (df.stack().map(my_dict)
                  .reset_index(level=1, drop=True)
                  .dropna()
               )

Output:

   Col1  Col2   Col3 NewCol
0   Dog  Bird    Cat    AAA
1  Blue   Red  Black    BBB
2   Bad   Sad   Glad    ZZZ
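
If the one-entry-per-row assumption of Option 2 might not hold, one possible variation (a sketch, not part of the answer above) is to collapse the stacked matches per row before assigning. The `groupby(level=0).first()` step and the modified data are assumptions added for illustration:

```python
import pandas as pd

# Hypothetical variation of the example: row 0 has two keys, row 2 has none.
df = pd.DataFrame({
    "Col1": ["Blue", "Blue", "Bad"],
    "Col2": ["Bird", "Red", "Sad"],
    "Col3": ["Cat", "Black", "Mad"],
})
my_dict = {"Bird": "AAA", "Blue": "BBB", "Glad": "ZZZ"}

matches = (
    df.stack()              # long Series indexed by (row label, column name)
      .map(my_dict)         # values that are not keys become NaN
      .dropna()             # keep only the actual matches
      .groupby(level=0)     # group surviving matches by row label
      .first()              # first match per row, in column order
)

# Assignment aligns on the row index, so rows without a match get NaN.
df["NewCol"] = matches
```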

Answer 2

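A comprehension with plain Python

This is more obtuse... but I think it's fun. It may be faster in some cases, but it is probably not worth the added confusion.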

df.assign(NewCol=[min(map(my_dict.get, t), key=pd.isna) for t in zip(*map(df.get, df))])

   Col1  Col2   Col3 NewCol
0   Dog  Bird    Cat    AAA
1  Blue   Red  Black    BBB
2   Bad   Sad   Glad    ZZZ
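
For readers who find the one-liner hard to parse, a more explicit sketch of the same idea follows. The helper name `first_match` is hypothetical, and `itertuples` is swapped in for the `zip(*map(df.get, df))` idiom; both walk the frame row by row and keep the first value that is a dictionary key:

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Dog", "Blue", "Bad"],
    "Col2": ["Bird", "Red", "Sad"],
    "Col3": ["Cat", "Black", "Glad"],
})
my_dict = {"Bird": "AAA", "Blue": "BBB", "Glad": "ZZZ"}


def first_match(row_values, mapping):
    """Return the mapped value of the first entry found in mapping, else None."""
    return next((mapping[v] for v in row_values if v in mapping), None)


# itertuples(index=False) yields one tuple of cell values per row.
new_col = [first_match(row, my_dict) for row in df.itertuples(index=False)]

# assign returns a new frame and leaves df itself unchanged.
result = df.assign(NewCol=new_col)
```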

Answer 3

Another approach is to use replace on the dataframe, keep only the cells that differ from df, and ffill:

df['NewCol'] = df.replace(my_dict).where(lambda x: x != df).ffill(axis=1).iloc[:, -1]

Output:
   Col1  Col2   Col3 NewCol
0   Dog  Bird    Cat    AAA
1  Blue   Red  Black    BBB
2   Bad   Sad   Glad    ZZZ
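
Breaking that chain into named steps may make it easier to follow. This is only a sketch; the intermediate names `replaced` and `changed` are introduced here for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Dog", "Blue", "Bad"],
    "Col2": ["Bird", "Red", "Sad"],
    "Col3": ["Cat", "Black", "Glad"],
})
my_dict = {"Bird": "AAA", "Blue": "BBB", "Glad": "ZZZ"}

# Step 1: swap every cell that equals a dictionary key for its mapped value.
replaced = df.replace(my_dict)

# Step 2: keep only the cells that replace() actually altered; the rest become NaN.
changed = replaced.where(replaced != df)

# Step 3: forward fill across columns and read the last column.
new_col = changed.ffill(axis=1).iloc[:, -1]
```

One caveat of comparing against the original frame: a key that maps to itself would look unchanged and therefore be treated as no match.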

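Or use stack and droplevel (the explicit dropna is there in case stack does not drop missing values by default in your pandas version):

df['NewCol'] = df.replace(my_dict).where(lambda x: x != df).stack().dropna().droplevel(1)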

Answer 4

If each row has one and only one key, another way is to chain map, ravel, and dropna as follows:

df['NewCol'] = pd.Series(df.apply(lambda x: x.map(my_dict)).values.ravel()).dropna().values

Output:

   Col1  Col2   Col3 NewCol
0   Dog  Bird    Cat    AAA
1  Blue   Red  Black    BBB
2   Bad   Sad   Glad    ZZZ
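
Because the flattened values are assigned by position, the result only lines up when every row has exactly one match. A sketch with an explicit guard for that assumption follows; the check and the name `matches_per_row` are additions for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Dog", "Blue", "Bad"],
    "Col2": ["Bird", "Red", "Sad"],
    "Col3": ["Cat", "Black", "Glad"],
})
my_dict = {"Bird": "AAA", "Blue": "BBB", "Glad": "ZZZ"}

# Map every column through the dictionary; values that are not keys become NaN.
mapped = df.apply(lambda col: col.map(my_dict))

# Guard: the flattening trick is only valid with exactly one match per row.
matches_per_row = mapped.notna().sum(axis=1)
if not matches_per_row.eq(1).all():
    raise ValueError("each row must contain exactly one dictionary key")

# Row-major flattening keeps matches in row order; dropping NaN leaves one per row.
df["NewCol"] = pd.Series(mapped.to_numpy().ravel()).dropna().to_numpy()
```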

Map a dictionary to a dataframe using multiple columns at once
