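I have several columns in a pandas DataFrame. For each of these columns, I need to create a new column based on its values. This function works: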
def f(row):
    if row['col_1'] == 0:
        val = 'Neutral'
    elif row['col_1'] > 0:
        val = 'Growth'
    else:
        val = 'Contraction'
    return val
df['New_Col_1'] = df.apply(f, axis=1)
But since I have several columns to compare (col_2, col_3, etc.), I would like to pass the column name to the function as an argument.
def f(row, col_name):
    if row[col_name] == 0:
        val = 'Neutral'
    elif row[col_name] > 0:
        val = 'Growth'
    else:
        val = 'Contraction'
    return val
df['New_Col_1'] = df.apply(f(row,'col_1') , axis=1)
However, this raises an error saying that the name 'row' is not defined. How can I get around this?
Answer 0 (score: 3)
Check out df.loc[]. You can think of its two arguments as a row specification and a column specification, so you can use it like this:
df['New_Col_1'] = 'Contraction' # Default, to be overwritten below
df.loc[df['col_1'] == 0, 'New_Col_1'] = 'Neutral'
df.loc[df['col_1'] > 0, 'New_Col_1'] = 'Growth'
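Since the question involves several columns (col_2, col_3, and so on), the same df.loc pattern can be wrapped in a small helper that takes the column names as arguments. This is only a minimal sketch: the helper name add_label_column and the sample data are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({"col_1": [0, 2, -1], "col_2": [-3, 0, 5]})

def add_label_column(frame, src, dst):
    """Label the values of column `src` and store the labels in column `dst`."""
    frame[dst] = "Contraction"                    # default, overwritten below
    frame.loc[frame[src] == 0, dst] = "Neutral"
    frame.loc[frame[src] > 0, dst] = "Growth"
    return frame

# Build New_Col_1, New_Col_2, ... from col_1, col_2, ...
for i, col in enumerate(["col_1", "col_2"], start=1):
    add_label_column(df, col, f"New_Col_{i}")

print(df)
```

Note that with 'Contraction' as the default, rows where the source value is NaN also end up labelled 'Contraction', because both comparisons evaluate to False for NaN.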
Answer 1 (score: 1)
You can use df.loc[condition, column_name] = value to filter df and write the new values:
df['New_Col_1'] = None # initial value
df.loc[df.col_1 == 0, 'New_Col_1'] = 'Neutral'
df.loc[df.col_1 > 0, 'New_Col_1'] = 'Growth'
df.loc[df.col_1 < 0, 'New_Col_1'] = 'Contraction'
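One practical difference from a default-then-overwrite approach: with three explicit conditions, any row that matches none of them (NaN, for example) keeps the initial None instead of silently becoming 'Contraction'. Here is a small sketch with made-up sample data, assuming the column is named col_1 as in the question:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data that includes a missing value.
df = pd.DataFrame({"col_1": [0.0, 2.5, -1.0, np.nan]})

df["New_Col_1"] = None  # initial value, kept where no condition matches
df.loc[df["col_1"] == 0, "New_Col_1"] = "Neutral"
df.loc[df["col_1"] > 0, "New_Col_1"] = "Growth"
df.loc[df["col_1"] < 0, "New_Col_1"] = "Contraction"

# The NaN row matches none of the three conditions, so it stays unlabelled.
print(df)
```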
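Answer 2 (score: 1)
As mentioned in the comments, the df.apply() call is missing a lambda function. Writing f(row, 'col_1') calls f immediately, before apply runs, and at that point no variable named row exists. Wrapping the call in a lambda gives apply a function that it can call once per row: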
def f(row, col_name):
    if row[col_name] == 0:
        val = 'Neutral'
    elif row[col_name] > 0:
        val = 'Growth'
    else:
        val = 'Contraction'
    return val
df['New_Col_1'] = df.apply(lambda row: f(row, 'col_1'), axis=1)
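As an alternative to the lambda, DataFrame.apply can forward extra arguments to the function itself, either positionally through args or as keyword arguments. The sketch below uses made-up sample data and a slightly condensed version of f:

```python
import pandas as pd

def f(row, col_name):
    """Classify the value found in column `col_name` of one row."""
    if row[col_name] == 0:
        return "Neutral"
    elif row[col_name] > 0:
        return "Growth"
    return "Contraction"

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({"col_1": [0, 2, -1], "col_2": [-3, 0, 5]})

# Pass the function object itself; apply supplies each row as the first
# argument and forwards everything in `args` after it.
df["New_Col_1"] = df.apply(f, axis=1, args=("col_1",))

# Extra keyword arguments are forwarded to f as well.
df["New_Col_2"] = df.apply(f, axis=1, col_name="col_2")

print(df)
```

Either way, the key point is that apply receives a callable rather than the result of calling f.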