I have a two-column DataFrame of the form:
      Death  HEALTH
0     other     0.0
1     other     1.0
2  vascular     0.0
3     other     0.0
4     other     0.0
5  vascular     0.0
6       NaN     0.0
7       NaN     0.0
8       NaN     0.0
9  vascular     1.0
I want to create a new column according to the following rules:
The output should be:
      Death  HEALTH  New
0     other     0.0   No
1     other     1.0   No
2  vascular     0.0   No
3     other     0.0   No
4     other     0.0   No
5  vascular     0.0   No
6       NaN     0.0  NaN
7       NaN     0.0  NaN
8       NaN     0.0  NaN
9  vascular     1.0  Yes
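Is there a Pythonic way to do this? I am completely lost between loops and conditionals.
Answer 0 (score: 0)
You can build boolean conditions for No and Yes and pass them to numpy.select, keeping the original value for every row that matches neither: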
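import numpy as np
# "No": Death is other, or Death is vascular with HEALTH equal to 0
m1 = df['Death'].eq('other') | (df['Death'].eq('vascular') & df['HEALTH'].eq(0))
# "Yes": Death is vascular with HEALTH equal to 1
m2 = (df['Death'].eq('vascular') & df['HEALTH'].eq(1))
df['new'] = np.select([m1, m2], ['No', 'Yes'], default=df['Death'])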
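For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the two masks. The DataFrame construction is my own reconstruction of the posted data, not something the asker supplied as code.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the table in the question (an assumption).
df = pd.DataFrame({
    "Death": ["other", "other", "vascular", "other", "other",
              "vascular", np.nan, np.nan, np.nan, "vascular"],
    "HEALTH": [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
})

# Rows that should become "No".
m1 = df["Death"].eq("other") | (df["Death"].eq("vascular") & df["HEALTH"].eq(0))
# Rows that should become "Yes".
m2 = df["Death"].eq("vascular") & df["HEALTH"].eq(1)

# Rows matching neither mask keep their original Death value,
# so the missing rows stay missing.
df["new"] = np.select([m1, m2], ["No", "Yes"], default=df["Death"])
print(df)
```

The masks are checked in order, so a row that satisfied both would take the first matching label; with these conditions the two masks never overlap.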
Another idea is to test for missing values explicitly, and to fall back to the original value when no condition matches:
m1 = df['Death'].eq('other') | (df['Death'].eq('vascular') & df['HEALTH'].eq(0))
m2 = (df['Death'].eq('vascular') & df['HEALTH'].eq(1))
m3 = df['Death'].isna()
df['new'] = np.select([m1, m2, m3], ['No','Yes', np.nan], default=df['Death'])
print(df)
         Death  HEALTH          new
0  another val     0.0  another val
1        other     1.0           No
2     vascular     0.0           No
3        other     1.0           No
4        other     0.0           No
5     vascular     0.0           No
6          NaN     0.0          NaN
7          NaN     0.0          NaN
8          NaN     0.0          NaN
9     vascular     1.0          Yes
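The printed frame has another val in row 0, which does not appear in the question's data, so the sample was presumably altered to show the fallback. A sketch under that assumption (the modified frame below is my guess at what was used):

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the question's data: row 0 holds a label
# that none of the masks cover.
df = pd.DataFrame({
    "Death": ["another val", "other", "vascular", "other", "other",
              "vascular", np.nan, np.nan, np.nan, "vascular"],
    "HEALTH": [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
})

m1 = df["Death"].eq("other") | (df["Death"].eq("vascular") & df["HEALTH"].eq(0))
m2 = df["Death"].eq("vascular") & df["HEALTH"].eq(1)
m3 = df["Death"].isna()  # explicit test for missing values

# Unmatched rows (here only row 0) fall through to the original value.
df["new"] = np.select([m1, m2, m3], ["No", "Yes", np.nan], default=df["Death"])
print(df)
```

Because the default is already the Death column, the missing rows would come out as NaN even without m3; the extra mask mainly makes that intent explicit.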
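Answer 1 (score: 0)
A simple approach is to write your conditional logic as if/else inside a function, and then use apply to run that function row by row over the DataFrame.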
def function(row):
    if row['Death'] == 'other':
        return 'No'
    if row['Death'] == 'vascular':
        if row['Health'] == 1:
            return 'Yes'
        elif row['Health'] == 0:
            return 'No'
    return np.nan

# axis = 1 to apply it row-wise
df['New'] = df.apply(function, axis=1)
It produces the following output, as required:
      Death  Health  New
0     other       0   No
1     other       1   No
2  vascular       0   No
3     other       0   No
4     other       0   No
5  vascular       0   No
6       NaN       0  NaN
7       NaN       0  NaN
8       NaN       0  NaN
9  vascular       1  Yes
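Note that this answer uses a column called Health, while the question's frame calls it HEALTH. Here is a self-contained sketch of the same row-wise idea run against the question's column name; the function name label_row and its health_col parameter are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the table in the question (an assumption).
df = pd.DataFrame({
    "Death": ["other", "other", "vascular", "other", "other",
              "vascular", np.nan, np.nan, np.nan, "vascular"],
    "HEALTH": [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
})

def label_row(row, health_col="HEALTH"):
    """Return 'Yes', 'No', or NaN for one row (hypothetical helper)."""
    if row["Death"] == "other":
        return "No"
    if row["Death"] == "vascular":
        if row[health_col] == 1:
            return "Yes"
        if row[health_col] == 0:
            return "No"
    # Missing Death, or any combination not covered above.
    return np.nan

# axis=1 passes each row to label_row as a Series.
df["New"] = df.apply(label_row, axis=1)
print(df)
```

Row-wise apply calls the Python function once per row, so it is easy to read but usually slower on large frames than mask-based approaches.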