AttributeError: 'DataFrame' object has no attribute 'patient_col'

Time: 2020-04-22 10:13:35

Tags: python pandas dataframe

Hi, my code is as follows:

def check_for_leakage(df1, df2, patient_col):

    df1_patients_unique = set(df1.patient_col.unique())
    df2_patients_unique = set(df2.patient_col.unique())
    patients_in_both_groups = list(df1_patients_unique.intersection(df2_patients_unique))
    leakage = len(patients_in_both_groups) > 0 # boolean (true if there is at least 1 patient in both groups)

    return leakage

When I run this test:

# test
print("test case 1")
df1 = pd.DataFrame({'patient_id': [0, 1, 2]})
df2 = pd.DataFrame({'patient_id': [2, 3, 4]})
print("df1")
print(df1)
print("df2")
print(df2)
print(f"leakage output: {check_for_leakage(df1, df2, 'patient_id')}")

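I get the following error:

AttributeError: 'DataFrame' object has no attribute 'patient_col'

I have tried several things, but I don't understand how to fix this. I also couldn't find a suitable answer to my problem.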

2 answers:

Answer 0: (score: 0)

You have to reference the column by name using square brackets:


    df1_patients_unique = set(df1[patient_col].unique())
    df2_patients_unique = set(df2[patient_col].unique())

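The df1.column notation only works with the literal column name. You cannot use a variable there: df1.patient_col looks for a column that is literally named patient_col, whereas df1[patient_col] uses the value stored in the variable (here 'patient_id').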
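To make the difference concrete, here is a minimal sketch. The DataFrame and the names ids_bracket, ids_attr and error_message are only illustrative and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame({'patient_id': [0, 1, 2]})
patient_col = 'patient_id'  # variable that holds the column name

# Bracket access uses the value of the variable, i.e. the column 'patient_id'.
ids_bracket = set(df[patient_col].unique())

# Attribute access works only when you write the literal column name.
ids_attr = set(df.patient_id.unique())

# df.patient_col looks for a column literally named 'patient_col',
# which does not exist, so pandas raises AttributeError.
try:
    df.patient_col
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(ids_bracket == ids_attr)
print(error_message)
```

Bracket access is also the safer habit in general, since it works for column names that contain spaces or that clash with existing DataFrame attributes.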

Answer 1: (score: 0)

Make the following changes to the function:

import pandas as pd

def check_for_leakage(df1, df2, patient_col):

    df1_patients_unique = set(df1[patient_col].unique())
    df2_patients_unique = set(df2[patient_col].unique())
    patients_in_both_groups = list(df1_patients_unique.intersection(df2_patients_unique))
    leakage = len(patients_in_both_groups) > 0 # boolean (true if there is at least 1 patient in both groups)

    return leakage

print("test case 1")
df1 = pd.DataFrame({'patient_id': [0, 1, 2]})
df2 = pd.DataFrame({'patient_id': [2, 3, 4]})
print("df1")
print(df1)
print("df2")
print(df2)
print(f"leakage output: {check_for_leakage(df1, df2, 'patient_id')}")

Output:
test case 1
df1
   patient_id
0           0
1           1
2           2
df2
   patient_id
0           2
1           3
2           4
leakage output: True
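
If you only need the boolean and not the list of shared patients, a more compact variant is possible. This is just a sketch of the same check using Python's built-in set.isdisjoint; the second test DataFrame (df3) is made up for illustration.

```python
import pandas as pd

def check_for_leakage(df1, df2, patient_col):
    """Return True if any value of patient_col appears in both DataFrames."""
    ids1 = set(df1[patient_col])
    ids2 = set(df2[patient_col])
    # isdisjoint is True when the sets share no element, so negate it.
    return not ids1.isdisjoint(ids2)

df1 = pd.DataFrame({'patient_id': [0, 1, 2]})
df2 = pd.DataFrame({'patient_id': [2, 3, 4]})
df3 = pd.DataFrame({'patient_id': [5, 6]})  # hypothetical non-overlapping group

print(check_for_leakage(df1, df2, 'patient_id'))
print(check_for_leakage(df1, df3, 'patient_id'))
```

The first call reports leakage because patient 2 is in both groups; the second does not, because df1 and df3 have no patient in common.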