How to populate a new column in a pandas DataFrame based on values in other columns

Time: 2019-11-21 01:07:55

Tags: python pandas dataframe if-statement

My DataFrame df looks like this.

  df
DATA ELEMENT    PORT    LOCATION    P_RISK  S_RISK  W_RISK  Total_Risk
Event 1000     SGSIN                      0       0       0       0
Event 2000     MYKUL                      0       1       0       1
Event 3000                PARIS           0       0       1       1
Event 4000                LYON            1       0       0       1
Event 1000     USNYK                      0       0       0       0
Event 2000     INCOK                      0       0       1       1
Event 3000                MUMBAI          0       0       0       0
Event 4000                LAHORE          1       0       0       1
Event 2000     INCOK                      0       0       0       0
Event 3000                PARIS           0       0       0       0

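Conditions

* If Data Element is 1000 or 2000 and Total Risk is 1, fill the new column Comment with Risk Name + PORT

* If Data Element is 3000 or 4000 and Total Risk is 1, fill the new column Comment with Risk Name + LOCATION

* If Total Risk is 0, fill it with "No Risk"

Expected output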

df
    DATA ELEMENT    PORT    LOCATION    P_RISK  S_RISK  W_RISK  Total_Risk  Comment
    Event 1000     SGSIN                      0       0       0       0     No Risk Indentified
    Event 2000     MYKUL                      0       1       0       1     S_RISK  predicted at  MYKUL
    Event 3000                PARIS           0       0       1       1     W_RISK predicted at  PARIS
    Event 4000                LYON            1       0       0       1     P_RISK  predicted at  LYON
    Event 1000     USNYK                      0       0       0       0     No Risk Indentified
    Event 2000     INCOK                      0       0       1       1     W_RISK predicted at  INCOK
    Event 3000                MUMBAI          0       0       0       0     No Risk Indentified
    Event 4000                LAHORE          1       0       0       1     P_RISK  predicted at  LAHORE
    Event 2000     INCOK                      0       0       0       0     No Risk Indentified
    Event 3000                PARIS           0       0       0       0     No Risk Indentified

How can I do this?

1 answer:

Answer 0 (score: 3)

I would use dot together with np.where

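import numpy as np

s2=df.filter(like='_RISK')   # risk flag columns only; Total_Risk does not match (case-sensitive)
s2=s2.dot(s2.columns)        # 0/1 flags dotted with the column labels give the flagged risk name, or '' if none
# take PORT, falling back to LOCATION where PORT is empty
df['new']=np.where(s2=='' ,'No Risk Indentified',s2 + ' predict at ' +df.PORT.mask(df.PORT=='',df.LOCATION))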
df
Out[35]: 
    DATA  ELEMENT   PORT  ... W_RISK  Total_Risk                       new
0  Event     1000  SGSIN  ...      0           0       No Risk Indentified
1  Event     2000  MYKUL  ...      0           1   S_RISK predict at MYKUL
2  Event     3000         ...      1           1   W_RISK predict at PARIS
3  Event     4000         ...      0           1    P_RISK predict at LYON
4  Event     1000  USNYK  ...      0           0       No Risk Indentified
5  Event     2000  INCOK  ...      1           1   W_RISK predict at INCOK
6  Event     3000         ...      0           0       No Risk Indentified
7  Event     4000         ...      0           1  P_RISK predict at LAHORE
8  Event     2000  INCOK  ...      0           0       No Risk Indentified
9  Event     3000         ...      0           0       No Risk Indentified
[10 rows x 9 columns]
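
For anyone who wants to run the dot approach end to end, here is a minimal self-contained sketch. The frame construction is an assumption reconstructed from the question's table: blank PORT/LOCATION cells are taken to be empty strings and DATA ELEMENT is kept as a single column, so adjust it to your real data. The idea is that a string multiplied by 0 or 1 is either '' or the string itself, so the dot product of the flags with the column labels yields the name of the flagged column.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data; blank cells are empty strings.
df = pd.DataFrame({
    "DATA ELEMENT": ["Event 1000", "Event 2000", "Event 3000", "Event 4000", "Event 1000",
                     "Event 2000", "Event 3000", "Event 4000", "Event 2000", "Event 3000"],
    "PORT": ["SGSIN", "MYKUL", "", "", "USNYK", "INCOK", "", "", "INCOK", ""],
    "LOCATION": ["", "", "PARIS", "LYON", "", "", "MUMBAI", "LAHORE", "", "PARIS"],
    "P_RISK": [0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
    "S_RISK": [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    "W_RISK": [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
})
df["Total_Risk"] = df[["P_RISK", "S_RISK", "W_RISK"]].sum(axis=1)

# Only P_RISK, S_RISK, W_RISK match; the filter is case-sensitive.
flags = df.filter(like="_RISK")

# Per row: sum of flag * label, i.e. the flagged column name or ''.
risk_name = flags.dot(flags.columns)

# PORT where present, otherwise LOCATION.
place = df["PORT"].mask(df["PORT"] == "", df["LOCATION"])

df["new"] = np.where(risk_name == "", "No Risk Indentified",
                     risk_name + " predict at " + place)
print(df[["DATA ELEMENT", "Total_Risk", "new"]])
```

One caveat: if a row ever has more than one flag set, dot concatenates all the flagged names without a separator, so this relies on at most one risk per row, as in the sample data.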

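Method 2

# Boolean flags stacked into a (row, risk column) MultiIndex
s2=df.filter(like='_RISK').ne(0).stack()
# Keep the True entries and move the risk column name into a 'level_1' column
s2=s2[s2].reset_index(level=1)
# Rows without a flagged risk are missing from s2, so the sum is NaN there and fillna supplies the default
df['new']=(s2['level_1'] + ' predict at ' +df.PORT.mask(df.PORT=='',df.LOCATION)).fillna('No Risk Indentified')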
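
If you would rather spell out the three conditions from the question literally, np.select is another option. This is a different technique from the two methods above, shown only as a sketch: the four-row sample, the assumption that DATA ELEMENT holds strings such as "Event 1000", and the `code` and `risk_name` helper names are all mine, not from the question.

```python
import numpy as np
import pandas as pd

# Assumed sample: first four rows of the question's table, blanks as ''.
df = pd.DataFrame({
    "DATA ELEMENT": ["Event 1000", "Event 2000", "Event 3000", "Event 4000"],
    "PORT": ["SGSIN", "MYKUL", "", ""],
    "LOCATION": ["", "", "PARIS", "LYON"],
    "P_RISK": [0, 0, 0, 1],
    "S_RISK": [0, 1, 0, 0],
    "W_RISK": [0, 0, 1, 0],
    "Total_Risk": [0, 1, 1, 1],
})

# Numeric part of DATA ELEMENT (hypothetical helper; assumes "Event <number>").
code = df["DATA ELEMENT"].str.extract(r"(\d+)", expand=False).astype(int)

# Name of the first flagged risk column. Rows with no flag also get a name
# here, but the Total_Risk check below excludes them.
risk_name = df[["P_RISK", "S_RISK", "W_RISK"]].idxmax(axis=1)

has_risk = df["Total_Risk"].eq(1)
conditions = [
    has_risk & code.isin([1000, 2000]),   # report the PORT
    has_risk & code.isin([3000, 4000]),   # report the LOCATION
]
choices = [
    risk_name + " predicted at " + df["PORT"],
    risk_name + " predicted at " + df["LOCATION"],
]
# Anything that matches neither condition falls through to the default.
df["Comment"] = np.select(conditions, choices, default="No Risk Indentified")
print(df[["DATA ELEMENT", "Comment"]])
```

This is more verbose than the dot version, but each rule from the question maps to one entry in `conditions`, which makes it easier to add or change rules later.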