Question

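I have written a script in Python in combination with Selenium to parse the email addresses of some companies from a webpage. The problem is that the emails are inside either span[data-mail] or span[data-mail-e-contact-mail]. If I try the two conditions individually, I get all the emails. However, when I wrap them in a try:except:else block, they no longer work. Where am I going wrong?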

website link

This is the script:

from selenium import webdriver
from bs4 import BeautifulSoup

url = "replace with the link above"

driver = webdriver.Chrome()
driver.get(url)
soup = BeautifulSoup(driver.page_source,'html.parser')
for links in soup.select("article.vcard"):
    try: #the following works when tried individually
        email = links.select_one(".hit-footer-wrapper span[data-mail]").get("data-mail")
    except: #the following works as well when tried individually
        email = links.select_one(".hit-footer-wrapper span[data-mail-e-contact-mail]").get("data-mail-e-contact-mail")
    else:
        email = ""
    print(email)
driver.quit()

When I execute the above script, it prints nothing. Each selector works when I try it on its own.

Answer 1

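Note that your code does not raise an exception, because both get("data-mail") and get("data-mail-e-contact-mail") return a value (empty or not) instead of raising. Since the try block completes without an error, the except branch never runs, and the else branch, which runs only when no exception occurred, resets email to "" on every iteration.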
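The control flow described above can be reproduced without Selenium or BeautifulSoup. The following is a minimal sketch, assuming a plain dict stands in for the parsed element; the function name `pick` and the "fallback" string are hypothetical and only serve the illustration.

```python
def pick(attrs):
    """Mimic the question's try/except/else flow with a dict lookup."""
    try:
        email = attrs["data-mail"]   # succeeds whenever the key exists, even if the value is ""
    except KeyError:
        email = "fallback"           # runs only when the try block raised
    else:
        email = ""                   # runs whenever the try block did NOT raise
    return email
```

With this shape, a successful lookup is always overwritten by the else branch, so the value found in the try block is never what gets returned.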

Try the following code to get the required output:

for links in soup.select("article.vcard"):
    email = links.select_one(".hit-footer-wrapper span[data-mail]").get("data-mail") or links.select_one(".hit-footer-wrapper span[data-mail-e-contact-mail]").get("data-mail-e-contact-mail")
    print(email)
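The one-liner above assumes both spans exist in every article.vcard; if select_one finds nothing it returns None, and calling .get on None raises AttributeError. A more defensive variant might look like the sketch below. The names `CANDIDATES` and `first_email` are hypothetical and not part of the original answer; `card` is assumed to be a BeautifulSoup Tag such as one of the elements returned by soup.select("article.vcard").

```python
# Selector/attribute pairs taken from the question, in order of preference.
CANDIDATES = [
    (".hit-footer-wrapper span[data-mail]", "data-mail"),
    (".hit-footer-wrapper span[data-mail-e-contact-mail]", "data-mail-e-contact-mail"),
]


def first_email(card):
    """Return the first non-empty email attribute found in `card`, else ""."""
    for selector, attribute in CANDIDATES:
        node = card.select_one(selector)   # None when nothing matches the selector
        if node is None:
            continue
        value = node.get(attribute)        # None or "" when the attribute is missing or empty
        if value:
            return value
    return ""
```

Inside the original loop this would be called as print(first_email(links)). Checking for None explicitly avoids relying on exceptions for control flow, which is what made the try:except:else version hard to reason about.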
