Assigning values to a new Pandas column based on a condition

Time: 2018-11-21 16:23:20

Tags: python pandas dataframe

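I am new to Pandas.

I want to create a conditional column in Pandas. In R I would use mutate for this, but I can't work out how to do the same with pandas.assign().

In pseudocode, what I want to do is: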

DataFrame.MyKeyColumn = If (DataFrame.Condtional is NaN) then:
    concatenate[ DataFrame.keyfield1,"_",DataFrame.keyfield2,"_",DataFrame.keyfield3,"_",keyfield4]
else:
    concatenate[ DataFrame.keyfield1,"_",DataFrame.keyfield2,"_",DataFrame.condtionalfield,"_",DataFrame.keyfield3,"_",keyfield4]

In R, you can do something like this:

dplyr::mutate(Conditional = ifelse(is.na(mycondtion), paste(keyfield1, keyfield2), paste(keyfield1, condtionalfield, keyfield2)))

Example of my Current Data

Any help would be greatly appreciated. I hope I am just misunderstanding how pandas.assign() works, or perhaps I need to nest some function such as pandas.where().

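1 Answer:

Answer 0: (score: 0)

You can use numpy's where to choose, row by row, between two sets of values based on a boolean condition and use the result to fill a new column. Here is an example based on your pseudocode: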

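# Use bracket assignment: df.MyKeyColumn = ... would not create a new column.
# Assumes the key fields are string columns and that keyfield4 is also a column of df.
df["MyKeyColumn"] = np.where(df.Condtional.isna(),
    df.keyfield1 + "_" + df.keyfield2 + "_" + df.keyfield3 + "_" + df.keyfield4,
    df.keyfield1 + "_" + df.keyfield2 + "_" + df.condtionalfield + "_" + df.keyfield3 + "_" + df.keyfield4)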
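To try the key-building idea end to end, here is a minimal self-contained sketch. The sample data is made up, and it assumes the NaN check and the inserted value both come from a single column named condtionalfield, which the question does not state explicitly.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names follow the pseudocode in the question.
df = pd.DataFrame({
    "keyfield1": ["a", "b"],
    "keyfield2": ["c", "d"],
    "condtionalfield": [np.nan, "x"],
    "keyfield3": ["e", "f"],
    "keyfield4": ["g", "h"],
})

# Build the parts shared by both variants of the key.
prefix = df["keyfield1"] + "_" + df["keyfield2"]
suffix = df["keyfield3"] + "_" + df["keyfield4"]

# np.where evaluates both branches for every row, so fill NaN with "" to keep
# the unused branch a plain string; the condition still picks the right one.
with_cond = prefix + "_" + df["condtionalfield"].fillna("").astype(str) + "_" + suffix
without_cond = prefix + "_" + suffix

df["MyKeyColumn"] = np.where(df["condtionalfield"].isna(), without_cond, with_cond)
print(df["MyKeyColumn"].tolist())
```

If the key fields are numeric rather than strings, they would need .astype(str) before concatenation.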

Here is a simplified example of the usage:

import pandas as pd
import numpy as np

# Create a dummy dataframe
df = pd.DataFrame(data={"col1":[np.nan, 1, np.nan], "col2":[4, 5, 6]})

# Create a new column which fills in missing col1 values with data from col2
df["new_col"] = np.where(df["col1"].isna(), df["col2"], df["col1"])

# Create a new column which fills in missing col1 values with scalar value
df["new_col2"] = np.where(df["col1"].isna(), 7, df["col1"])
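Since the question asks about assign() and where(), the same fill can also be sketched with those. This is one possible way to write it, not the only one; the column names new_col and new_col_where are chosen here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data={"col1": [np.nan, 1, np.nan], "col2": [4, 5, 6]})

# assign() returns a new DataFrame and leaves df unchanged; each lambda
# receives the DataFrame being built, much like mutate in dplyr.
result = df.assign(
    # Same logic as the np.where call above.
    new_col=lambda d: np.where(d["col1"].isna(), d["col2"], d["col1"]),
    # Series.where keeps values where the condition is True and
    # takes the replacement (col2) everywhere else.
    new_col_where=lambda d: d["col1"].where(d["col1"].notna(), d["col2"]),
)
print(result)
```

Note that Series.where reads the opposite way from np.where: the condition marks the values to keep, not the values to replace.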