Fill a column with the conditional mode of another column

Time: 2019-06-28 20:50:10

Tags: python pandas

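Given the table below, I want to fill the "Color Guess" column with the mode of the "Color" column, conditional on "Type" and "Size", ignoring NULL, #N/A, and similar missing values.

For example: what is the most common color for small cats, what is the most common color for medium dogs, and so on.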

Type  Size    Color   Color Guess
Cat   small   brown   
Dog   small   black   
Dog   large   black   
Cat   medium  white   
Cat   medium  #N/A    
Dog   large   brown   
Cat   large   white   
Cat   large   #N/A    
Dog   large   brown   
Dog   medium  #N/A    
Cat   small   #N/A    
Dog   small   white   
Dog   small   black   
Dog   small   brown   
Dog   medium  white   
Dog   medium  #N/A    
Cat   large   brown   
Dog   small   white   
Dog   large   #N/A

1 answer:

Answer 0 (score: 5)

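As BarMar noted in the comments, we can use pd.Series.mode from the linked answer. The only trick is that we have to use groupby.transform, because we want the result to come back in the same shape as the dataframe, with one value per original row: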

df['Color Guess'] = df.groupby(['Type', 'Size'])['Color'].transform(lambda x: pd.Series.mode(x)[0])

   Type    Size  Color Color Guess
0   Cat   small  brown       brown
1   Dog   small  black       black
2   Dog   large  black       brown
3   Cat  medium  white       white
4   Cat  medium    NaN       white
5   Dog   large  brown       brown
6   Cat   large  white       brown
7   Cat   large    NaN       brown
8   Dog   large  brown       brown
9   Dog  medium    NaN       white
10  Cat   small    NaN       brown
11  Dog   small  white       black
12  Dog   small  black       black
13  Dog   small  brown       black
14  Dog  medium  white       white
15  Dog  medium    NaN       white
16  Cat   large  brown       brown
17  Dog   small  white       black
18  Dog   large    NaN       brown
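
For anyone who wants to reproduce the result, here is a minimal self-contained sketch of the same approach. It assumes the question's table is available as CSV text; the `raw` string and the `first_mode` helper are illustrative names, not part of the original answer. The helper also guards against a group whose colors are all missing, a case that does not occur in this data but would make the `[0]` lookup fail.

```python
import io

import numpy as np
import pandas as pd

# Sample data from the question. read_csv treats "#N/A" as NaN by default.
raw = """Type,Size,Color
Cat,small,brown
Dog,small,black
Dog,large,black
Cat,medium,white
Cat,medium,#N/A
Dog,large,brown
Cat,large,white
Cat,large,#N/A
Dog,large,brown
Dog,medium,#N/A
Cat,small,#N/A
Dog,small,white
Dog,small,black
Dog,small,brown
Dog,medium,white
Dog,medium,#N/A
Cat,large,brown
Dog,small,white
Dog,large,#N/A
"""
df = pd.read_csv(io.StringIO(raw))


def first_mode(s):
    """Return the most frequent non-null value, or NaN if there is none."""
    modes = s.mode()  # drops NaN by default; tied values come back sorted
    return modes.iloc[0] if not modes.empty else np.nan


# transform broadcasts each group's scalar back to every row of that group.
df["Color Guess"] = df.groupby(["Type", "Size"])["Color"].transform(first_mode)
print(df)
```

Note that ties are broken alphabetically, because Series.mode returns tied values in sorted order. That is why small dogs (two black, two white) get "black" and large cats (one brown, one white) get "brown" in the output above.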