Add a column based on multiple conditions

Time: 2019-02-27 13:09:02

Tags: python pandas if-statement conditional

I have a silly question. My df looks like this:

       FID_2     STA_SID           s2            s1  Qh_STA  Qh_FID2  \
14 222143.00 26040713.00           0.00        0.00    8.00    17.00   
15 222143.00 26040713.00           0.00        8.00    6.00    17.00   
13 222143.00 26040713.00           6.00        8.00    3.00    17.00   
17       NaN 26033594.00 29445425.00        1707.00    5.00      nan   

I defined the following function and applied it like this:

A = 0.8

def seekDO(row):
    if row['Qh_STA'] / row['Qh_FID2'] < A:
        return 1
    if (row['Qh_STA'] + row['s1']) / row['Qh_FID2'] < A:
        return 1
    if (row['Qh_STA'] + row['s1'] + row['s2']) / row['Qh_FID2'] < A:
        return 1
    return 0

df['DO'] = df.apply(seekDO, axis=1)

The problem is that for DO I get

    DO   
14  1  
15  1  
13  1  
17  0 

instead of

    DO   
14  1  
15  0  
13  0  
17  0 

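Can you see where I went wrong?

3 answers:

Answer 0: (score: 2)

I believe you can test each condition on whole columns instead of looping over rows, which is slow: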

A = 0.8

m1 = df['Qh_STA']/df['Qh_FID2'] < A 
m2 = (df['Qh_STA'] + df['s1'])/df['Qh_FID2'] < A
m3 = (df['Qh_STA'] + df['s1'] + df['s2']) / df['Qh_FID2'] < A

If all the conditions must hold together (AND), chain the masks with & and convert the boolean result to integers:

df['DO'] = (m1 & m2 & m3).astype(int)
print (df)
       FID_2     STA_SID          s2      s1  Qh_STA  Qh_FID2  DO
14  222143.0  26040713.0         0.0     0.0     8.0     17.0   1
15  222143.0  26040713.0         0.0     8.0     6.0     17.0   0
13  222143.0  26040713.0         6.0     8.0     3.0     17.0   0
17       NaN  26033594.0  29445425.0  1707.0     5.0      NaN   0
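To check the vectorized approach end to end, here is a minimal sketch. The DataFrame is rebuilt by hand from the four rows shown in the question, so the exact dtypes are an assumption, and the name `do_and` is only illustrative. It also shows how the row with a missing `Qh_FID2` behaves: every comparison against NaN evaluates to False, so that row ends up as 0.

```python
import numpy as np
import pandas as pd

# Rows copied from the question; index labels kept as shown there.
df = pd.DataFrame(
    {
        "FID_2": [222143.0, 222143.0, 222143.0, np.nan],
        "STA_SID": [26040713.0, 26040713.0, 26040713.0, 26033594.0],
        "s2": [0.0, 0.0, 6.0, 29445425.0],
        "s1": [0.0, 8.0, 8.0, 1707.0],
        "Qh_STA": [8.0, 6.0, 3.0, 5.0],
        "Qh_FID2": [17.0, 17.0, 17.0, np.nan],
    },
    index=[14, 15, 13, 17],
)

A = 0.8

# One boolean mask per condition, computed on whole columns at once.
m1 = df["Qh_STA"] / df["Qh_FID2"] < A
m2 = (df["Qh_STA"] + df["s1"]) / df["Qh_FID2"] < A
m3 = (df["Qh_STA"] + df["s1"] + df["s2"]) / df["Qh_FID2"] < A

# AND: a row gets 1 only if all three ratios are below A.
do_and = (m1 & m2 & m3).astype(int)
df["DO"] = do_and
print(df[["Qh_STA", "Qh_FID2", "DO"]])
```

Because the masks are ordinary boolean Series, swapping `&` for `|` is all it takes to switch from "all conditions" to "any condition".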

Answer 1: (score: 1)

Maybe with np.where:

import numpy as np

condition = ((df['Qh_STA'] / df['Qh_FID2'] < A)
             | ((df['Qh_STA'] + df['s1']) / df['Qh_FID2'] < A)
             | ((df['Qh_STA'] + df['s1'] + df['s2']) / df['Qh_FID2'] < A))

df['DO'] = np.where(condition, 1, 0)

Answer 2: (score: 1)

But you should indeed get

    DO   
    14  1  
    15  1  
    13  1  
    17  0

Look at your values again.

    8 / 17 IS < 0.8
    6 / 17 IS < 0.8
    3 / 17 IS < 0.8

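The output is correct; it is your expected output that is wrong. The function returns 1 as soon as the first test passes, and the first test passes for all three of these rows.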
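The same reasoning can be reproduced in plain Python. This is only a sketch: `rows` and `ratios` are hypothetical names introduced here, with values copied from the question. A function that returns 1 at the first passing test behaves like `any(...)`. The expected output in the question matches `all(...)`, which is presumably what the asker intended, though that is an assumption.

```python
A = 0.8

# (Qh_STA, s1, s2, Qh_FID2) per index label, copied from the question.
rows = {
    14: (8.0, 0.0, 0.0, 17.0),
    15: (6.0, 8.0, 0.0, 17.0),
    13: (3.0, 8.0, 6.0, 17.0),
    17: (5.0, 1707.0, 29445425.0, float("nan")),
}

def ratios(qh_sta, s1, s2, qh_fid2):
    """Hypothetical helper: the three cumulative ratios tested in seekDO."""
    return [
        qh_sta / qh_fid2,
        (qh_sta + s1) / qh_fid2,
        (qh_sta + s1 + s2) / qh_fid2,
    ]

# Returning 1 at the first passing test is equivalent to any(...).
any_result = {k: int(any(r < A for r in ratios(*v))) for k, v in rows.items()}

# Requiring every test to pass is all(...).
all_result = {k: int(all(r < A for r in ratios(*v))) for k, v in rows.items()}

print(any_result)  # what the question's function produces
print(all_result)  # what the question lists as the expected output
```

For row 15, for example, 6 / 17 is below 0.8 but (6 + 8) / 17 is not, so `any` gives 1 while `all` gives 0.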