Compare columns in a DataFrame to get the desired output

Time: 2018-08-14 10:14:02

Tags: python python-3.x pandas

Suppose this is my input DataFrame:

Name  Death1  Return1   Death2  Return2
 A     Yes     Yes       NaN      NaN
 B     No      No        Yes      Yes
 C     Yes     Yes       Yes      Yes
 D     NaN     NaN       NaN      NaN

I am counting the number of times each character dies and storing the result in a new column.

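# My approach.
import pandas as pd

def clean_deaths(row):
    # Count how many of the death columns hold 'Yes' for this row.
    num_deaths = 0
    cols = ['Death1', 'Death2']
    for c in cols:
        death = row[c]
        # Skip missing values and explicit 'No' entries.
        # The data uses 'Yes'/'No', so the comparison must match that casing.
        if pd.isnull(death) or death == 'No':
            continue
        elif death == 'Yes':
            num_deaths += 1
    return num_deaths

df['Deaths'] = df.apply(clean_deaths, axis=1)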

I am not happy with my approach. I would like to see other ways to achieve this.

Expected output:
Name  Death1  Return1   Death2  Return2  Deaths
 A     Yes     Yes       NaN      NaN      1
 B     No      No        Yes      Yes      1
 C     Yes     Yes       Yes      Yes      2
 D     NaN     NaN       NaN      NaN      0

1 answer:

Answer 0 (score: 1)

I think you need to select the columns by name or with filter, compare them to 'Yes' with eq (==), and then sum the True values in each row:

df['Deaths'] = df[['Death1', 'Death2']].eq('Yes').sum(axis=1)
print (df)
  Name Death1 Return1 Death2 Return2  Deaths
0    A    Yes     Yes    NaN     NaN       1
1    B     No      No    Yes     Yes       1
2    C    Yes     Yes    Yes     Yes       2
3    D    NaN     NaN    NaN     NaN       0

df['Deaths'] = df.filter(like='Death').eq('Yes').sum(axis=1)
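One thing to keep in mind with filter(like='Death') is that it matches any column whose name contains "Death", including a Deaths column that already exists. That does not change the count here, because integers never equal 'Yes', but a stricter pattern may be easier to reason about. The sketch below is an assumption-laden variant, not part of the original answer: the small frame and the pre-existing Deaths column are hypothetical, and the regex assumes the columns are always named Death followed by digits.

```python
import numpy as np
import pandas as pd

# Hypothetical frame that already carries a 'Deaths' column from an earlier run.
df = pd.DataFrame({
    "Name": ["A", "B"],
    "Death1": ["Yes", "No"],
    "Death2": [np.nan, "Yes"],
    "Deaths": [0, 0],
})

# Keep only columns named 'Death' followed by digits, so 'Deaths' is excluded.
death_cols = df.filter(regex=r"^Death\d+$")
print(list(death_cols.columns))

# Same counting step as in the answer: compare to 'Yes', then sum per row.
df["Deaths"] = death_cols.eq("Yes").sum(axis=1)
print(df)
```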

Details:

print (df[['Death1', 'Death2']].eq('Yes'))
   Death1  Death2
0    True   False
1   False    True
2    True    True
3   False   False
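
To try the approach end to end, the following is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, so the construction code is an assumption about how the data was loaded; the counting step is the one from the answer. NaN compares as False in eq, which is why missing values need no special handling.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question's table.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Death1": ["Yes", "No", "Yes", np.nan],
    "Return1": ["Yes", "No", "Yes", np.nan],
    "Death2": [np.nan, "Yes", "Yes", np.nan],
    "Return2": [np.nan, "Yes", "Yes", np.nan],
})

# eq('Yes') returns a boolean frame; NaN and 'No' both become False.
is_death = df[["Death1", "Death2"]].eq("Yes")

# Summing booleans across columns counts the True values in each row.
df["Deaths"] = is_death.sum(axis=1)

print(df["Deaths"].tolist())  # [1, 1, 2, 0]
```

Compared with df.apply(..., axis=1), this stays vectorized, which tends to matter once the frame has many rows.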