How to sum pandas DataFrame columns row-wise with multiple conditions

Time: 2019-02-11 12:07:15

Tags: python pandas

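I am translating an Excel formula into pandas. The formula counts cells that meet specified conditions and adds the counts up row by row. For each row, I need to count whether the cells in the selected columns satisfy a given set of conditions, and then add together the counts for every set of conditions that is met.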

I have a DataFrame:

df:

a    b     c
14   x1    2
17   x2    2
0    x,1   3
1    x1    1

Excel formula:

= COUNTIFS($U2,14,$X2,"x2",$W2,2)+COUNTIFS($U2,17,$X2,"x2",$W2,2)+COUNTIFS(U2,14,$X2,"x1",$W2,2)

pandas formula:

df['counted'] = (df[(df['a']==14) & (df['b']=='x2') & (df['c']==2)].count(axis=1)) + (df[(df['a']==17) & (df['b']=='x2') & (df['c']==2)].count(axis=1)) + (df[(df['a']==14) & (df['b']=='x1') & (df['c']==2)].count(axis=1))

I get the following result from the pandas formula (df):

a    b     c   counted
14   x1    2      NaN
17   x2    2      NaN
0    x,1   3      NaN
1    x1    1      NaN

The expected result is shown below. Any help getting the correct formula would be greatly appreciated.

Expected result (df):

a    b     c   counted
14   x1    2      0
17   x2    2      1
0    x,1   3      0
1    x1    1      0

1 answer:

Answer 0: (score: 2)

I believe you need to build boolean masks, convert them to integers, and sum them:

a = (df['a']==14) & (df['b']=='x2') & (df['c']==2)
b = (df['a']==17) & (df['b']=='x2') & (df['c']==2)
c = (df['a']==14) & (df['b']=='x1') & (df['c']==2)

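It is also possible to define each condition once and chain them, which avoids repeating comparisons and should perform better: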

m1 = df['a']==14
m2 = df['b']=='x2'
m3 = df['c']==2
m4 = df['a']==17
m5 = df['b']=='x1'

a = m1 & m2 & m3
b = m4 & m2 & m3
c = m1 & m5 & m3

df['counted'] = a.astype(int) + b.astype(int) + c.astype(int)
print(df)
    a    b  c  counted
0  14   x1  2        1
1  17   x2  2        1
2   0  x,1  3        0
3   1   x1  1        0
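
For anyone wondering why the attempt in the question returned NaN, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the question's table, and the names `part_a`, `part_b`, `part_c`, `attempt` and `counted` are only illustrative. As far as I can tell, two things go wrong: `df[mask]` drops the non-matching rows, so each partial result covers a different set of index labels and the addition aligns to NaN, and `count(axis=1)` counts non-null cells per row (3 here) rather than returning 1 per match.

```python
import pandas as pd

# DataFrame rebuilt by hand from the table in the question
df = pd.DataFrame({
    "a": [14, 17, 0, 1],
    "b": ["x1", "x2", "x,1", "x1"],
    "c": [2, 2, 3, 1],
})

# One boolean mask per COUNTIFS term, each with the full index of df
a = (df["a"] == 14) & (df["b"] == "x2") & (df["c"] == 2)
b = (df["a"] == 17) & (df["b"] == "x2") & (df["c"] == 2)
c = (df["a"] == 14) & (df["b"] == "x1") & (df["c"] == 2)

# The question's approach: filtering first keeps only the matching rows,
# so each partial Series has a different (possibly empty) index.
part_a = df[a].count(axis=1)  # no matching rows
part_b = df[b].count(axis=1)  # only index label 1, value 3 (non-null cells)
part_c = df[c].count(axis=1)  # only index label 0, value 3

# Addition aligns on index labels; a label missing from any operand gives NaN.
attempt = part_a + part_b + part_c

# The masks keep the full index, so converting to int and adding works.
counted = a.astype(int) + b.astype(int) + c.astype(int)

print(attempt)
print(counted.tolist())
```

Assigning `attempt` to a new column then leaves every row as NaN, because rows 2 and 3 are missing from it altogether.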

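Or, since at most one of the three conditions can hold for any given row, chain the masks with bitwise OR and then convert to integers: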

df['counted'] = (a | b | c).astype(int)
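
One caveat, sketched below under some assumptions: the OR version only equals the sum when no two masks can be true on the same row. That holds for the three conditions in the question, but not in general. The masks `p` and `q` are hypothetical overlapping conditions made up purely to show the difference, and the DataFrame is again rebuilt by hand from the question's table.

```python
import pandas as pd

# DataFrame rebuilt by hand from the table in the question
df = pd.DataFrame({
    "a": [14, 17, 0, 1],
    "b": ["x1", "x2", "x,1", "x1"],
    "c": [2, 2, 3, 1],
})

# The three conditions from the question (mutually exclusive per row)
a = (df["a"] == 14) & (df["b"] == "x2") & (df["c"] == 2)
b = (df["a"] == 17) & (df["b"] == "x2") & (df["c"] == 2)
c = (df["a"] == 14) & (df["b"] == "x1") & (df["c"] == 2)

summed = a.astype(int) + b.astype(int) + c.astype(int)
ored = (a | b | c).astype(int)
print(summed.equals(ored))  # the two approaches agree for these masks

# Hypothetical overlapping conditions (not from the question):
# row 0 satisfies both, so the sum counts it twice while OR counts it once.
p = df["a"] == 14
q = df["c"] == 2
overlap_sum = p.astype(int) + q.astype(int)
overlap_or = (p | q).astype(int)
print(overlap_sum.tolist())
print(overlap_or.tolist())
```

So if the Excel formula being translated can match a row in more than one COUNTIFS term, the summed version is the closer equivalent.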