import numpy as np
import pandas as pd
df = pd.DataFrame({'salary': [2000, 5000, 7000, 3500, 8000], 'rate': [2, 4, 6.5, 7, 5], 'other': [4000, 2500, 4200, 5000, 3000],
                   'name': ['bob', 'sam', 'ram', 'jam', 'flu'], 'last_name': ['bob', 'gan', 'ram', np.nan, 'flu']})
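My DataFrame is df, and I need to fill a new column with values based on the following conditions:
If 'name' equals 'last_name', then 'salary' + 'other'
If 'last_name' is null, then 'salary' + 'other'
If 'name' does not equal 'last_name', then ('rate' * 'other') + 'salary'
I tried the following code, but it does not give the correct result: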
if np.where(df["name"] == df["last_name"]) is True:
    df['new_col'] = df['salary'] + df['other']
else:
    df['new_col'] = (df['rate'] * df['other']) + df['salary']
Answer 0: (score: 1)
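You can do all of this in one pass with pandas DataFrame filtering. When you write something like df["name"] == df["last_name"], you create a boolean Series (called a "mask") that you can then use to index into the DataFrame.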
# condition 1 - name == last name
name_equals_lastname = df["name"] == df["last_name"]  # first, create the boolean mask
# then use the mask to index into the DataFrame and set only the matching rows
df.loc[name_equals_lastname, "new_col"] = df["salary"] + df["other"]
# condition 2 - last name is null
last_name_is_null = df["last_name"].isnull()
df.loc[last_name_is_null, "new_col"] = df["salary"] + df["other"]
# condition 3 - name != last name
# (a comparison with NaN is also "not equal", so exclude null rows to keep condition 2's value)
name_not_equal_to_last_name = (df["name"] != df["last_name"]) & ~last_name_is_null
df.loc[name_not_equal_to_last_name, "new_col"] = (df["rate"] * df["other"]) + df["salary"]
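To see why the `if` test in the question never takes the first branch, it may help to look at what the comparison and the one-argument form of np.where actually return. This is a minimal sketch on the question's name columns only; the variable names mask and positions are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["bob", "sam", "ram", "jam", "flu"],
    "last_name": ["bob", "gan", "ram", np.nan, "flu"],
})

# Element-wise comparison: one True/False per row, not a single bool.
mask = df["name"] == df["last_name"]

# With a single argument, np.where returns a tuple of index arrays
# holding the positions where the mask is True.
positions = np.where(mask)

print(mask.tolist())      # [True, False, True, False, True]
print(positions[0])       # [0 2 4]
print(positions is True)  # False: a tuple is never the object True
```

Because the test is always False, the original code runs the `else` branch and applies the third formula to every row.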
You can also use df.apply() with a custom function, like this:
def my_logic(row):
    if row["name"] == row["last_name"]:
        return row["salary"] + row["other"]
    elif ...  # you can fill in the rest of the logic here

df["new_col"] = df.apply(my_logic, axis=1)  # you need axis=1 to pass rows rather than columns
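One possible way to fill in the remaining branches is sketched below. It is an assumption about how the logic could be completed from the three conditions in the question, not the answerer's own code. The null check is folded into the first branch, since both cases use the same formula.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "salary": [2000, 5000, 7000, 3500, 8000],
    "rate": [2, 4, 6.5, 7, 5],
    "other": [4000, 2500, 4200, 5000, 3000],
    "name": ["bob", "sam", "ram", "jam", "flu"],
    "last_name": ["bob", "gan", "ram", np.nan, "flu"],
})

def my_logic(row):
    # Conditions 1 and 2: last_name is missing, or the two names match.
    if pd.isnull(row["last_name"]) or row["name"] == row["last_name"]:
        return row["salary"] + row["other"]
    # Condition 3: the names differ.
    return row["rate"] * row["other"] + row["salary"]

# axis=1 passes each row (rather than each column) to my_logic.
df["new_col"] = df.apply(my_logic, axis=1)
print(df[["name", "last_name", "new_col"]])
```

Row-wise apply calls the Python function once per row, so on large frames the mask-based approach is usually faster.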
Answer 1: (score: 0)
Based on your conditions, you don't need if-else. Just use np.where combined with a boolean mask.
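The answer's own code did not survive, so the following is only a sketch of what an np.where solution could look like for the question's data. The mask name same_or_missing is hypothetical. The three-argument form of np.where picks from the second argument where the mask is True and from the third where it is False.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "salary": [2000, 5000, 7000, 3500, 8000],
    "rate": [2, 4, 6.5, 7, 5],
    "other": [4000, 2500, 4200, 5000, 3000],
    "name": ["bob", "sam", "ram", "jam", "flu"],
    "last_name": ["bob", "gan", "ram", np.nan, "flu"],
})

# True where the names match or last_name is missing (conditions 1 and 2).
same_or_missing = (df["name"] == df["last_name"]) | df["last_name"].isnull()

df["new_col"] = np.where(
    same_or_missing,
    df["salary"] + df["other"],               # conditions 1 and 2
    df["rate"] * df["other"] + df["salary"],  # condition 3
)
print(df["new_col"].tolist())  # [6000.0, 15000.0, 11200.0, 8500.0, 11000.0]
```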