I am trying to find all emails in the email column that end with "@gmail.com" or "@googlemail.com" and then set the corresponding domain column to Google.
I tried something like this:
df.loc[df.email.[-9:] != "gmail.com", "domain"] = "Google"
df.loc[df.email.[-14:] != "googlemail.com", "domain"] = "Google"
which does not work.
Example DF before
index | email | ... | domain
0 | "example0@gmail.com" | ... | ""
1 | "example1@site.com" | ... | "Site"
2 | "example2@googlemail.com" | ... | ""
3 | "example3@other.org" | ... | ""
Example DF after
index | email | ... | domain
0 | "example0@gmail.com" | ... | "Google"
1 | "example1@site.com" | ... | "Site"
2 | "example2@googlemail.com" | ... | "Google"
3 | "example3@other.org" | ... | ""
Answer 0 (score: 2)
Use str.endswith to build a boolean mask, then set the values by condition with loc or numpy.where:
L = ['gmail.com', 'googlemail.com']
df.loc[df['email'].str.endswith(tuple(L)), 'domain'] = 'Google'
Or, with numpy.where (this needs import numpy as np):
df['domain'] = np.where(df['email'].str.endswith(tuple(L)), 'Google', df['domain'])
print (df)
email ... domain
0 example0@gmail.com ... Google
1 example1@site.com ... Site
2 example2@googlemail.com ... Google
3 example3@other.org ... NaN
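
For reference, the attempt in the question fails for two reasons: `df.email.[-9:]` is not valid Python (string slicing on a column goes through the `.str` accessor, as in `df.email.str[-9:]`), and `!=` selects the rows that do not match. Below is a minimal, self-contained sketch of the `loc` approach, rebuilt from the sample data in the question. The names `suffixes` and `mask` are illustrative, the leading "@" in the suffixes is an assumption added so that an address such as "user@notgmail.com" would not match, and `na=False` is an optional extra that treats missing emails as non-matching.

```python
import pandas as pd

# Sample frame rebuilt from the question (the "..." columns are left out).
df = pd.DataFrame({
    "email": [
        "example0@gmail.com",
        "example1@site.com",
        "example2@googlemail.com",
        "example3@other.org",
    ],
    "domain": ["", "Site", "", ""],
})

# Assumption: include "@" so only the full domain part is matched.
suffixes = ("@gmail.com", "@googlemail.com")

# Boolean mask: True where the email ends with any of the suffixes.
# na=False makes missing emails count as non-matching.
mask = df["email"].str.endswith(suffixes, na=False)

# Overwrite domain only on the matching rows; other rows keep their value.
df.loc[mask, "domain"] = "Google"

print(df)
```

Rows 0 and 2 should end up with "Google" in domain, while rows 1 and 3 keep their original values.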