When creating new columns, can you reference a newly created column within assign?

Time: 2019-04-17 15:33:42

Tags: python pandas

Using Python 2.7, I have a function that needs to create a new column and then create a second column from that new column:

def read_assign(fp, col_name):
    df = pd.read_csv(fp).assign(model_id=col_name)
    df = df.assign(analytic_sol = k95(df.average_fuel_T, df.average_rod_burnup),
                   error = np.log10((df.analytic_sol - df.avg_th_cond)/df.analytic_sol))
    return df

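Currently I get an error saying that df.analytic_sol is not recognized as an attribute of df. Do I have to create a whole new variable and assign a second time? Is there a better way?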

The following code currently works, but it seems inefficient to me:

def read_assign(fp, col_name):
    df = pd.read_csv(fp).assign(model_id=col_name)
    df = df.assign(analytic_sol = k95(df.average_fuel_T, df.average_rod_burnup))
    df = df.assign(error = np.log10((df.analytic_sol - df.avg_th_cond)/df.analytic_sol))
    return df

1 answer:

Answer 0 (score: 2)

For Python 3.6+:

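Try writing it with a lambda function. In your first version, df.analytic_sol is evaluated before assign runs, against the original df. The new columns are not "assigned" until assign has finished and returned, so df['analytic_sol'] does not exist yet at that point.

With a lambda, the expression is not evaluated up front. assign calls it with the DataFrame it is building (in effect the "self" of the call), and that DataFrame already has the analytic_sol column. This relies on keyword arguments keeping their order, which is why Python 3.6+ is needed: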

import numpy as np
import pandas as pd

def read_assign(fp, col_name):
    df = pd.read_csv(fp).assign(model_id=col_name)
    # The lambda is called with the frame being built, which already has analytic_sol
    df = df.assign(analytic_sol = k95(df.average_fuel_T, df.average_rod_burnup),
                   error = lambda x: np.log10((x['analytic_sol'] - x['avg_th_cond']) / x['analytic_sol']))
    return df
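
To see the behaviour without a CSV file, here is a minimal sketch on an in-memory DataFrame. The k95 below is a hypothetical stand-in (the real formula is not shown in the question), and the sample values are made up for illustration. The first function uses a single assign call and assumes Python 3.6+ with pandas 0.23 or later. The second chains two assign calls, which should not depend on keyword order and so may suit older setups such as the asker's Python 2.7.

```python
import numpy as np
import pandas as pd


def k95(temperature, burnup):
    # Hypothetical stand-in for the asker's k95; not the real correlation.
    return 100.0 / (1.0 + 0.001 * temperature + 0.01 * burnup)


def add_columns(df):
    # Single assign call: each lambda receives the frame being built,
    # so the second one can already see analytic_sol.
    return df.assign(
        analytic_sol=lambda x: k95(x["average_fuel_T"], x["average_rod_burnup"]),
        error=lambda x: np.log10((x["analytic_sol"] - x["avg_th_cond"]) / x["analytic_sol"]),
    )


def add_columns_chained(df):
    # Chained assign calls: the second call operates on the result of the
    # first, so it does not rely on keyword argument order.
    with_sol = df.assign(
        analytic_sol=lambda x: k95(x["average_fuel_T"], x["average_rod_burnup"])
    )
    return with_sol.assign(
        error=lambda x: np.log10((x["analytic_sol"] - x["avg_th_cond"]) / x["analytic_sol"])
    )


# Made-up sample data using the column names from the question.
sample = pd.DataFrame({
    "average_fuel_T": [600.0, 800.0, 1000.0],
    "average_rod_burnup": [5.0, 10.0, 20.0],
    "avg_th_cond": [50.0, 40.0, 30.0],
})

result = add_columns(sample)
chained = add_columns_chained(sample)
print(result)
```

Both versions leave the original frame untouched, since assign returns a new DataFrame each time.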