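I wrote a script in Python with Selenium to parse the email addresses of some companies from a web page. The problem is that each email sits inside either span[data-mail] or span[data-mail-e-contact-mail]. If I try the two selectors separately, I get all the emails. However, when I wrap them in a try:except:else block, they stop working. Where am I going wrong?
This is the script: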
from selenium import webdriver
from bs4 import BeautifulSoup

url = "replace with the link above"
driver = webdriver.Chrome()
driver.get(url)
soup = BeautifulSoup(driver.page_source, 'html.parser')
for links in soup.select("article.vcard"):
    try:  # the following works when tried individually
        email = links.select_one(".hit-footer-wrapper span[data-mail]").get("data-mail")
    except:  # the following works as well when tried individually
        email = links.select_one(".hit-footer-wrapper span[data-mail-e-contact-mail]").get("data-mail-e-contact-mail")
    else:
        email = ""
    print(email)
driver.quit()
When I run the script above, it prints nothing. Each selector works when I use it on its own.
Answer 0 (score: 2)
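Note that your code does not raise an exception, because get("data-mail") and get("data-mail-e-contact-mail") both return a value (empty or not) instead of raising. Since the try block finishes without an exception, the else branch runs and overwrites email with an empty string, which is why nothing is printed.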
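To see the else behaviour in isolation, here is a minimal sketch that uses plain dictionaries as stand-ins for the parsed tags. The lookup_email function and the sample records are hypothetical and only mirror the shape of the try:except:else block in the question.

```python
def lookup_email(record):
    """Mimic the question's try/except/else structure on a plain dict."""
    try:
        # Raises KeyError only when the key is missing.
        email = record["data-mail"]
    except KeyError:
        # Fallback, reached only if the try block raised.
        email = record.get("data-mail-e-contact-mail", "")
    else:
        # Runs whenever the try block did NOT raise,
        # so a successful lookup is thrown away here.
        email = ""
    return email


first = lookup_email({"data-mail": "a@example.com"})
second = lookup_email({"data-mail-e-contact-mail": "b@example.com"})
print(repr(first))   # the value found in try was overwritten by else
print(repr(second))  # the fallback in except survives
```

Dropping the else branch, or moving the default into it only for the case where both lookups come back empty, avoids the overwrite.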
Try the following code to get the desired output:
for links in soup.select("article.vcard"):
    email = (
        links.select_one(".hit-footer-wrapper span[data-mail]").get("data-mail")
        or links.select_one(".hit-footer-wrapper span[data-mail-e-contact-mail]").get("data-mail-e-contact-mail")
    )
    print(email)
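One caveat: select_one returns None when a selector matches nothing, so the chained .get() calls above would raise AttributeError on any card that lacks one of the two spans. Whether that happens on the actual page is not known from the question. A more defensive variant might look like the sketch below; the HTML sample, the ATTRIBUTES tuple and the extract_email helper are assumptions made for illustration, not part of the original page or answer.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on the selectors in the question.
HTML = """
<article class="vcard">
  <div class="hit-footer-wrapper"><span data-mail="a@example.com"></span></div>
</article>
<article class="vcard">
  <div class="hit-footer-wrapper"><span data-mail-e-contact-mail="b@example.com"></span></div>
</article>
<article class="vcard">
  <div class="hit-footer-wrapper"><span>no address here</span></div>
</article>
"""

# Attribute names taken from the question, tried in this order.
ATTRIBUTES = ("data-mail", "data-mail-e-contact-mail")


def extract_email(card):
    """Return the first non-empty email attribute in the card, or ""."""
    for attr in ATTRIBUTES:
        span = card.select_one(f".hit-footer-wrapper span[{attr}]")
        # Guard against a missing span before reading its attribute.
        if span is not None and span.get(attr):
            return span.get(attr)
    return ""


soup = BeautifulSoup(HTML, "html.parser")
emails = [extract_email(card) for card in soup.select("article.vcard")]
print(emails)
```

With Selenium, the same helper could be applied to a soup built from driver.page_source, exactly as in the question's script.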