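I recently asked the following question:
Pandas: how can I pass a column name to a function that can then be used in 'apply'?
I received a great answer. However, there is an extension to that question that I overlooked and am still curious about.
I have a function: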
def generate_confusion_matrix(row):
    val=0
    if (row['biopsy_bin']==0) & (row['pioped_logit_category'] == 0):
        val = 0
    if (row['biopsy_bin']==1) & (row['pioped_logit_category'] == 1):
        val = 1
    if (row['biopsy_bin']==0) & (row['pioped_logit_category'] == 1):
        val = 2
    if (row['biopsy_bin']==1) & (row['pioped_logit_category'] == 0):
        val = 3
    if row['pioped_logit_category'] == 2:
        val = 4
    return val
I would like to make it generic, like this:
def general_confusion_matrix(biopsy, column_name):
    val=0
    if biopsy==0:
        if column_name == 0:
            val = 0
        elif column_name == 1:
            val = 1
    elif biopsy==1:
        if column_name == 1:
            val = 2
        elif column_name == 0:
            val = 3
    elif column_name == 2:
        val = 4
    return val
so that I can apply it inside this function (which does not work):
def create_logit_value(df, name_of_column):
    df[name_of_column + '_concordance'] = df.apply(lambda : general_confusion_matrix('biopsy', name_of_column + '_category'), axis=1)
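The problem seems to be that when you pass the column as df['biopsy'], you are passing a whole Series to general_confusion_matrix rather than the value from each row, so the conditional statements raise the usual error:
('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 0')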
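The error can be reproduced in isolation. The following is a minimal sketch on made-up data; the helper name is_negative is hypothetical, and only the column name 'biopsy' comes from the question. It shows that an if test works on a single value but fails on a whole column:

```python
import pandas as pd

# Hypothetical toy data; only the column name 'biopsy' follows the question.
df = pd.DataFrame({'biopsy': [0, 1, 0]})

def is_negative(biopsy):
    # For a scalar, `biopsy == 0` is a single boolean.
    # For a Series, it is a Series of booleans, which `if` cannot
    # reduce to one True/False, so pandas raises ValueError.
    if biopsy == 0:
        return True
    return False

# A single value taken from the column: works.
scalar_result = is_negative(df['biopsy'].iloc[0])

# The whole column: raises ValueError.
try:
    is_negative(df['biopsy'])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(scalar_result)
print(error_message)
```

This is why the function has to receive the per-row values (for example through the row object that apply passes in with axis=1) rather than the columns themselves.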
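I have tried both map and apply, but I am not sure how to pass two arguments that refer to columns in my DataFrame to the function inside the lambda. I suppose I could use map, but again, how would I pass the arguments through it? I apologize for writing two closely related questions, but they are different.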
Answer 0 (score: 1)
I think you are close:
import pandas as pd

df = pd.DataFrame({'biopsy_bin':[0,1,0,1,0,1],
                   'pioped_logit_category':[0,0,0,1,1,1],
                   'a_category':[0,0,0,1,1,1]})
print (df)

def create_logit_value(df, name_of_column):
    # Pass the two scalar values from each row (not the column names)
    # to the two-argument general_confusion_matrix from the question.
    df[name_of_column + '_concordance'] = df.apply(lambda x: general_confusion_matrix(x['biopsy_bin'], x[name_of_column + '_category']), axis=1)
    return (df)

create_logit_value(df, 'a')
create_logit_value(df, 'pioped_logit')
   a_category  biopsy_bin  pioped_logit_category  a_concordance  \
0           0           0                      0              0
1           0           1                      0              3
2           0           0                      0              0
3           1           1                      1              2
4           1           0                      1              1
5           1           1                      1              2

   pioped_logit_concordance
0                         0
1                         3
2                         0
3                         2
4                         1
5                         2
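As an alternative that is not part of the answer above, the same labels could probably be computed without a row-wise apply by using numpy.select on whole columns. This is only a sketch under assumptions: the helper name add_concordance and its biopsy_col parameter are hypothetical, the label numbering follows general_confusion_matrix, and a category of 2 is mapped to 4 regardless of the biopsy value, as in the first function in the question.

```python
import numpy as np
import pandas as pd

def add_concordance(df, name_of_column, biopsy_col='biopsy_bin'):
    # Hypothetical helper: vectorized version of the labelling logic.
    biopsy = df[biopsy_col]
    category = df[name_of_column + '_category']

    # Conditions are checked in order; the first match wins.
    conditions = [
        category == 2,                     # assumption: 2 always maps to 4
        (biopsy == 0) & (category == 0),
        (biopsy == 0) & (category == 1),
        (biopsy == 1) & (category == 1),
        (biopsy == 1) & (category == 0),
    ]
    choices = [4, 0, 1, 2, 3]

    # Rows matching no condition fall back to 0, like `val=0` in the question.
    df[name_of_column + '_concordance'] = np.select(conditions, choices, default=0)
    return df

df = pd.DataFrame({'biopsy_bin': [0, 1, 0, 1, 0, 1],
                   'pioped_logit_category': [0, 0, 0, 1, 1, 1],
                   'a_category': [0, 0, 0, 1, 1, 1]})

add_concordance(df, 'a')
add_concordance(df, 'pioped_logit')
print(df)
```

Here the & operator is appropriate because both operands are boolean Series compared element by element, which is different from using it inside an if statement on a single row.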