How to do a conditional comparison in pandas

Time: 2019-05-17 10:37:48

Tags: python pandas

I have the following dataframe in pandas

code     prod_a      prod_b     flag
123      MS          MS         to be checked
123      HS          MS         more than 1 prod
123      MS          HS         to be checked
123      HS          MS         more than 1 prod
123      MS          MS         to be checked

I want to compare prod_a and prod_b only where flag = to be checked; rows with the other flag, more than 1 prod, should be kept as they are. My desired dataframe is as follows

code     prod_a      prod_b     flag               final_flag
123      MS          MS         to be checked      matched
123      HS          MS         more than 1 prod   more than 1 prod
123      MS          HS         to be checked      not matched
123      HS          MS         more than 1 prod   more than 1 prod
123      MS          MS         to be checked      matched

How can I do this in pandas?
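
To try the answers below, the dataframe from the question can be recreated along these lines. This is only a sketch: the values are copied from the question, while the integer type of code and the helper name expected are assumptions made for illustration.

```python
import pandas as pd

# Input data copied from the question; final_flag is the column to be computed.
df = pd.DataFrame({
    "code": [123, 123, 123, 123, 123],  # assumed to be integers
    "prod_a": ["MS", "HS", "MS", "HS", "MS"],
    "prod_b": ["MS", "MS", "HS", "MS", "MS"],
    "flag": ["to be checked", "more than 1 prod", "to be checked",
             "more than 1 prod", "to be checked"],
})

# Hypothetical helper: the final_flag values from the desired dataframe,
# handy for checking the result of any answer.
expected = ["matched", "more than 1 prod", "not matched",
            "more than 1 prod", "matched"]
```

After running an answer, df["final_flag"].tolist() == expected can be used as a quick check.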

3 answers:

Answer 0: (score: 3)

Use numpy.select with chained conditions, combining them with & for bitwise AND and inverting with ~:

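import numpy as np

# m1: rows that still need checking; m2: rows where both products agree
m1 = df['flag'].eq('to be checked')
m2 = df.prod_a.eq(df.prod_b)

# The first condition that is True wins; all other rows keep their original flag
df['final_flag'] = np.select([m1 & m2, m1 & ~m2], ['matched', 'not matched'], default=df['flag'])
print(df)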
   code prod_a prod_b              flag        final_flag
0   123     MS     MS     to be checked           matched
1   123     HS     MS  more than 1 prod  more than 1 prod
2   123     MS     HS     to be checked       not matched
3   123     HS     MS  more than 1 prod  more than 1 prod
4   123     MS     MS     to be checked           matched

Solution from @Anton vBR:

m1 = df['flag'].eq('to be checked')
m2 = df.prod_a.eq(df.prod_b)

# Start from the original flag, then overwrite only the rows to be checked
df['final_flag'] = df['flag']
df.loc[m1 & m2, 'final_flag'] = 'matched'
df.loc[m1 & ~m2, 'final_flag'] = 'not matched'
print(df)
   code prod_a prod_b              flag        final_flag
0   123     MS     MS     to be checked           matched
1   123     HS     MS  more than 1 prod  more than 1 prod
2   123     MS     HS     to be checked       not matched
3   123     HS     MS  more than 1 prod  more than 1 prod
4   123     MS     MS     to be checked           matched
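
Both variants in this answer can be checked against each other with a self-contained sketch like the one below. The data is copied from the question; the names via_select and out are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "code": [123, 123, 123, 123, 123],
    "prod_a": ["MS", "HS", "MS", "HS", "MS"],
    "prod_b": ["MS", "MS", "HS", "MS", "MS"],
    "flag": ["to be checked", "more than 1 prod", "to be checked",
             "more than 1 prod", "to be checked"],
})

m1 = df["flag"].eq("to be checked")    # rows that need the comparison
m2 = df["prod_a"].eq(df["prod_b"])     # rows where the products agree

# Variant 1: np.select picks the first True condition per row and
# falls back to the original flag when no condition holds.
via_select = pd.Series(
    np.select([m1 & m2, m1 & ~m2], ["matched", "not matched"], default=df["flag"]),
    index=df.index,
)

# Variant 2: copy the flag column, then overwrite the checked rows with .loc.
out = df.copy()
out["final_flag"] = out["flag"]
out.loc[m1 & m2, "final_flag"] = "matched"
out.loc[m1 & ~m2, "final_flag"] = "not matched"

# Compare the two results row by row.
print(via_select.tolist() == out["final_flag"].tolist())
```

The np.select form builds the column in one expression, while the .loc form may be easier to extend when further special cases are added later.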

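Answer 1: (score: 0)

Try:

# axis=1 applies the lambda row by row; rows with any other flag keep that flag
df['final_flag'] = df.apply(lambda x: ('matched' if x['prod_a'] == x['prod_b'] else 'not matched') if x['flag'] == 'to be checked' else x['flag'], axis=1)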

Answer 2: (score: 0)

def udf(row):
    if row.flag == 'to be checked':
        if row.prod_a == row.prod_b:
            return "matched"
        else:
            return "not matched"
    else:
        return row.flag

df['final_flag'] = df.apply(udf, axis=1)

This should work.
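
The apply call above passes axis=1 so that udf receives one row at a time. A minimal sketch of why that matters, using a two-row excerpt of the question's data (the names row_wise and default_axis_failed are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "prod_a": ["MS", "HS"],
    "prod_b": ["MS", "MS"],
})

# axis=1 hands each row to the function as a Series indexed by column name,
# so row["prod_a"] and row["prod_b"] are single values.
row_wise = df.apply(lambda row: row["prod_a"] == row["prod_b"], axis=1)

# With the default axis=0 the function receives whole columns instead,
# indexed by row label, so looking up a column name raises KeyError.
try:
    df.apply(lambda col: col["prod_a"] == col["prod_b"])
    default_axis_failed = False
except KeyError:
    default_axis_failed = True

print(row_wise.tolist(), default_axis_failed)
```

Row-wise apply runs a Python function once per row, so on large dataframes the mask-based approaches from answer 0 are usually faster.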