Create a new column in pandas based on conditional text values in two other columns

Time: 2020-10-02 19:24:41

Tags: python pandas

How can I create a new column in pandas based on conditional text values in two other columns?

Initial table:

Specialty   Category  
Spec A      Cat A     
Spec A      Cat B     
Spec A      Cat C
Spec A      Cat D
Spec B      Cat A     
Spec B      Cat B     
Spec B      Cat C 
Spec B      Cat D    

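Conditional logic: Cat A and Cat D keep their names (they are not renamed to "Other"), while Cat B and Cat C are renamed to "Other". Specialty is unchanged. The new column concatenates Specialty and Category according to this logic.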

The expected output is:

Specialty   Category  Specialty_group
Spec A      Cat A     Spec A Cat A       
Spec A      Cat B     Spec A Other
Spec A      Cat C     Spec A Other
Spec A      Cat D     Spec A Cat D 
Spec B      Cat A     Spec B Cat A
Spec B      Cat B     Spec B Other
Spec B      Cat C     Spec B Other
Spec B      Cat D     Spec B Cat D

1 answer:

Answer 0: (score: 1)

# create a mask for the rows whose Category keeps its name (Cat A and Cat D)
mask = df['Category'].isin(['Cat A', 'Cat D'])
# for those rows, join Specialty and Category with a space
df.loc[mask, 'Specialty_group'] = df.loc[mask, 'Specialty'] + ' ' + df.loc[mask, 'Category']
# for the remaining rows (the opposite of the mask), use "Other" as the category
df.loc[~mask, 'Specialty_group'] = df.loc[~mask, 'Specialty'] + ' Other'


  Specialty Category Specialty_group
0    Spec A    Cat A    Spec A Cat A
1    Spec A    Cat B    Spec A Other
2    Spec A    Cat C    Spec A Other
3    Spec A    Cat D    Spec A Cat D
4    Spec B    Cat A    Spec B Cat A
5    Spec B    Cat B    Spec B Other
6    Spec B    Cat C    Spec B Other
7    Spec B    Cat D    Spec B Cat D
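
For comparison, here is a minimal self-contained sketch of the same logic written with `Series.where` instead of two `loc` assignments. This is an alternative, not the answerer's method; the DataFrame is rebuilt from the table in the question, and the names `keep` and `category_label` are introduced here purely for illustration.

```python
import pandas as pd

# Rebuild the sample table from the question
df = pd.DataFrame({
    "Specialty": ["Spec A"] * 4 + ["Spec B"] * 4,
    "Category": ["Cat A", "Cat B", "Cat C", "Cat D"] * 2,
})

# Categories that keep their own name (assumed from the question's logic);
# every other category is replaced by "Other"
keep = ["Cat A", "Cat D"]

# Series.where keeps a value where the condition is True
# and substitutes "Other" where it is False
category_label = df["Category"].where(df["Category"].isin(keep), "Other")

# Concatenate Specialty and the (possibly replaced) category label
df["Specialty_group"] = df["Specialty"] + " " + category_label

print(df)
```

Building the label in one step avoids assigning to the new column twice, and it only touches the two named columns, so it should behave the same if the DataFrame has additional columns.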