I have a DataFrame with 3 columns:
Hospital                        2009-10  2010-11
Aberystwyth Mental Health Unit       19       19
Bro Ddyfi Community Hospital         16       10
Bronglais General Hospital          160      148
Caebryn Mental Health Unit           37       39
Carmarthen Mental Health Unit        38       31
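I am trying to write a function that checks whether a given word appears in the Hospital column and, if it does, puts that word in a new column, like this: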
Hospital                        2009-10  2010-11  Hospital Type
Aberystwyth Mental Health Unit       19       19  Mental
Bro Ddyfi Community Hospital         16       10  Community
Bronglais General Hospital          160      148  General
Caebryn Mental Health Unit           37       39  Mental
Carmarthen Mental Health Unit        38       31  Mental
Here is the code I tried:
def find_type(x):
    if df['Hospital'].str.contains("Mental").any():
        return "Mental"
    if df['Hospital'].str.contains("Community").any():
        return "Community"
    else:
        return "Other"

df['Hospital Type'] = df.apply(find_type)
The output I get is:
Hospital                        2009-10  2010-11  Hospital Type
Aberystwyth Mental Health Unit       19       19  NaN
Bro Ddyfi Community Hospital         16       10  NaN
Bronglais General Hospital          160      148  NaN
Caebryn Mental Health Unit           37       39  NaN
Carmarthen Mental Health Unit        38       31  NaN
How can I get the expected output shown above?
Thanks!
Answer 0 (score: 5)
pat = r"(Mental|Community)"
df['Hospital Type'] = df['Hospital'].str.extract(pat, expand=False).fillna('Other')
print(df)
   Hospital                        2009-10  2010-11  Hospital Type
0  Aberystwyth Mental Health Unit       19       19  Mental
1  Bro Ddyfi Community Hospital         16       10  Community
2  Bronglais General Hospital          160      148  Other
3  Caebryn Mental Health Unit           37       39  Mental
4  Carmarthen Mental Health Unit        38       31  Mental
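
A note on why the attempt in the question gives NaN, as far as I can tell: df.apply(find_type) calls the function once per column, not once per row, and the function ignores its argument and tests the whole Hospital column. The result is therefore indexed by column names, which do not match the row index 0..4 when assigned, so every row ends up NaN. The sketch below rebuilds the sample frame from the question and shows two ways to label each row. Adding "General" to the keywords is my assumption, made so the result matches the expected output in the question (with only Mental and Community in the pattern, that row becomes "Other").

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Hospital": [
        "Aberystwyth Mental Health Unit",
        "Bro Ddyfi Community Hospital",
        "Bronglais General Hospital",
        "Caebryn Mental Health Unit",
        "Carmarthen Mental Health Unit",
    ],
    "2009-10": [19, 16, 160, 37, 38],
    "2010-11": [19, 10, 148, 39, 31],
})

# Assumed keyword list; "General" is added to match the expected output.
KEYWORDS = ("Mental", "Community", "General")


def find_type(name):
    """Return the first keyword found in a single hospital name."""
    for keyword in KEYWORDS:
        if keyword in name:
            return keyword
    return "Other"


# Option 1: apply the function to each value of the Hospital column.
by_apply = df["Hospital"].apply(find_type)

# Option 2: vectorised regex extraction, as in the answer above.
pat = "(" + "|".join(KEYWORDS) + ")"
by_extract = df["Hospital"].str.extract(pat, expand=False).fillna("Other")

df["Hospital Type"] = by_extract
print(df)
```

Both options give the same labels on this sample. The str.extract version avoids a Python-level loop, while the apply version is easier to extend if the rules become more than a simple substring match.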