Using Python 2.7, I have a function that needs to create a new column and then create a second column from that new column:
def read_assign(fp, col_name):
    df = pd.read_csv(fp).assign(model_id=col_name)
    df = df.assign(analytic_sol=k95(df.average_fuel_T, df.average_rod_burnup),
                   error=np.log10((df.analytic_sol - df.avg_th_cond) / df.analytic_sol))
    return df
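Currently, I get an error saying that df.analytic_sol is not recognized as an attribute of df. Do I have to create a whole new variable and assign a second time? Is there a better way?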
For now, the following code works, but it seems inefficient to me:
def read_assign(fp, col_name):
    df = pd.read_csv(fp).assign(model_id=col_name)
    df = df.assign(analytic_sol=k95(df.average_fuel_T, df.average_rod_burnup))
    df = df.assign(error=np.log10((df.analytic_sol - df.avg_th_cond) / df.analytic_sol))
    return df
Answer 0 (score: 2)
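Python 3.6+
Try writing it with lambda functions. This works because columns are not actually assigned until assign finishes and returns. In the original version, the argument df.analytic_sol is evaluated before assign even runs, so df['analytic_sol'] does not exist yet. A lambda, however, is called by assign with the DataFrame being built (effectively "self"), and that frame already has the analytic_sol column.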
def read_assign(fp, col_name):
    df = pd.read_csv(fp).assign(model_id=col_name)
    # The lambda is evaluated inside assign, after analytic_sol has been added.
    df = df.assign(analytic_sol=k95(df.average_fuel_T, df.average_rod_burnup),
                   error=lambda x: np.log10((x['analytic_sol'] - x['avg_th_cond']) / x['analytic_sol']))
    return df
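
For anyone who wants to try this without the original CSV, here is a minimal, self-contained sketch. The k95 body and the sample values are made-up placeholders (the real k95 is not shown in the question); only the column names come from the question. It shows the single-call lambda form, which relies on keyword order being preserved (Python 3.6+ with pandas 0.23 or later, as far as I know), and a chained form that should not depend on keyword ordering at all.

```python
import numpy as np
import pandas as pd


def k95(temperature, burnup):
    # Hypothetical placeholder for the asker's k95 function; not the real formula.
    return 100.0 / (1.0 + 0.001 * temperature + 0.01 * burnup)


# Made-up sample data using the column names from the question.
df = pd.DataFrame({
    "average_fuel_T": [600.0, 800.0, 1000.0],
    "average_rod_burnup": [0.0, 10.0, 20.0],
    "avg_th_cond": [50.0, 40.0, 30.0],
})

# Single call: each lambda receives the frame as built so far, so the second
# one can see analytic_sol. Relies on keyword order being preserved.
single = df.assign(
    analytic_sol=lambda x: k95(x["average_fuel_T"], x["average_rod_burnup"]),
    error=lambda x: np.log10((x["analytic_sol"] - x["avg_th_cond"]) / x["analytic_sol"]),
)

# Chained calls: the first assign returns a new frame that already contains
# analytic_sol, so the second call does not depend on keyword ordering.
chained = (
    df.assign(analytic_sol=lambda x: k95(x["average_fuel_T"], x["average_rod_burnup"]))
      .assign(error=lambda x: np.log10((x["analytic_sol"] - x["avg_th_cond"]) / x["analytic_sol"]))
)
```

Both forms leave the original df untouched and avoid rebinding an intermediate variable. On an older interpreter such as the 2.7 mentioned in the question, the chained form is probably the safer choice.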