I'm stuck here. I need to translate this Excel formula: IF(COUNTIFS(advisor!$C:$C,$A2)=0,"0 disclosed", "Independent")
if df.groupby('id').apply(lambda x: x['id'] == df_advisor['company_id']).count() == 0:
    df['auditor_compensation'] = '0 disclosed'
else:
    df['auditor_compensation'] = 'Independent'
So far, this is my python-pandas attempt, and it keeps raising KeyError: ('company_id', 'occurred at index 1')
Any help would be greatly appreciated.
Edit
df sample data (company data):
id ticker iq_id company auditor_compensation
48299 ENXTAM:AALB IQ881736 Aalberts Industries ?
48752 ENXTAM:ABN IQ1090191 ABN AMRO Group ?
48865 ENXTAM:ACCEL IQ4492981 Accell Group ?
49226 ENXTAM:AGN IQ247906 AEGON ?
49503 ENXTAM:AD IQ373545 Koninklijke ?
Below is the df_advisor sample data:
id type company_id advisor_company_id
1 auditor 48299 60911
17 auditor 48752 165120
6359 auditor 48865 73607
37 auditor 49226 81877
4415 compensation 49226 90258
53 auditor 49503 81877
So the goal is to check the entire company_id column in df_advisor, count how many times each df['id'] occurs there, and use that to fill the auditor_compensation column.
Answer 0: (score: 1)
Assuming you want to know whether the value in column A appears in the list in column C in Excel:
df['Boolean'] = df['id'].isin(list(df_advisor['company_id']))
df['auditor_compensation'] = ''
df.loc[df['Boolean'] == False, 'auditor_compensation'] = '0 disclosed'
df.loc[df['Boolean'] == True, 'auditor_compensation'] = 'Independent'
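The same idea can be written without the helper Boolean column. The following is only a minimal sketch under assumptions: the two frames are cut down to the relevant columns, and the id 99999 is a hypothetical company with no advisor rows, added so that both labels show up.

```python
import pandas as pd

# Reduced sample frames; 99999 is a hypothetical id that has no advisor entry
df = pd.DataFrame({"id": [48299, 48752, 99999]})
df_advisor = pd.DataFrame({"company_id": [48299, 48752, 49226, 49226]})

# True where the id occurs at least once in df_advisor['company_id']
has_advisor = df["id"].isin(df_advisor["company_id"])

# Map the boolean mask straight to the two labels from the Excel formula
df["auditor_compensation"] = has_advisor.map(
    {True: "Independent", False: "0 disclosed"}
)
print(df)
```

Mapping the mask directly avoids leaving a temporary column in df, while keeping the same True/False to label assignment as the two df.loc lines.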
Answer 1: (score: 1)
Use numpy.where:
import numpy as np

# isin() is True when the id occurs in df_advisor, i.e. COUNTIFS(...) > 0
df['auditor_compensation'] = np.where(df['id'].isin(df_advisor['company_id']),
                                      'Independent',
                                      '0 disclosed')
print (df)
      id        ticker      iq_id              company auditor_compensation
0  48299  ENXTAM:AALB   IQ881736  Aalberts Industries          Independent
1  48752   ENXTAM:ABN  IQ1090191       ABN AMRO Group          Independent
2  48865 ENXTAM:ACCEL  IQ4492981         Accell Group          Independent
3  49226   ENXTAM:AGN   IQ247906                AEGON          Independent
4  49503    ENXTAM:AD   IQ373545          Koninklijke          Independent
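If the actual number of matches is wanted as well, which is what COUNTIFS computes before the comparison with 0, one possible sketch is to count with value_counts and look the counts up with map. The advisor_count column name and the id 99999 are assumptions made for illustration; they are not part of the original data.

```python
import numpy as np
import pandas as pd

# Reduced sample frames; 99999 is a hypothetical id with no advisor rows
df = pd.DataFrame({"id": [48299, 49226, 99999]})
df_advisor = pd.DataFrame({
    "type": ["auditor", "auditor", "compensation"],
    "company_id": [48299, 49226, 49226],
})

# Number of rows per company_id, like COUNTIFS over the whole column
counts = df_advisor["company_id"].value_counts()

# Look up each id; ids that never occur come back as NaN, so fill with 0
df["advisor_count"] = df["id"].map(counts).fillna(0).astype(int)

# Same branch order as the Excel IF: a count of 0 gives "0 disclosed"
df["auditor_compensation"] = np.where(
    df["advisor_count"] == 0, "0 disclosed", "Independent"
)
print(df)
```

Keeping the count in its own column makes it easy to check the result against the spreadsheet, at the cost of one extra column compared with the isin approach.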