How to handle exceptions in Python

Time: 2019-11-11 01:48:49

Tags: python beautifulsoup

This answered my question; the issue is resolved. You can delete this post.

import requests
from bs4 import BeautifulSoup

RETRIES = 10

id = None
session = requests.Session()

for attempt in range(1, RETRIES + 1):
    response = session.get(url)
    soup = BeautifulSoup(response.text, "lxml")

    element = soup.find('a', class_="class", id=True)
    if element is None:
        print("Attempt {attempt}. Element not found".format(attempt=attempt))
        continue
    else:
        id = element["id"]
        break

print(id)


2 Answers:

Answer 0 (score: 0)

You can apply the "Look Before You Leap" (LBYL) principle and check the result of find() — it returns None if no element was found. You can then put the check in a loop, break out once you have a value, and protect yourself with a retry-count limit:

import requests
from bs4 import BeautifulSoup

RETRIES = 10

id = None
session = requests.Session()

for attempt in range(1, RETRIES + 1):
    response = session.get(url)
    soup = BeautifulSoup(response.text, "lxml")

    element = soup.find('a', class_="class", id=True)
    if element is None:
        print("Attempt {attempt}. Element not found".format(attempt=attempt))
        continue
    else:
        id = element["id"]
        break

print(id)

A couple of notes:

  • id=True restricts the search to elements that have an id attribute. You could also use the CSS selector soup.select_one("a.class[id]").
  • Session() helps performance when making multiple requests to the same host. See more in Session Objects.
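The select_one alternative from the first note can be sketched against a static snippet. The markup and id value below are made up purely for illustration, and the stdlib html.parser is used so the sketch runs even without lxml installed:

```python
from bs4 import BeautifulSoup

# Illustrative markup: two anchors with the target class, only one has an id.
html = '<a class="class">no id</a><a class="class" id="item-42">link</a>'
soup = BeautifulSoup(html, "html.parser")

# "a.class[id]" matches <a> tags with class "class" that carry an id
# attribute, mirroring soup.find('a', class_="class", id=True).
element = soup.select_one("a.class[id]")
print(element["id"])
```

select_one returns the first match or None, so the same None check from the loop above applies to it as well.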

Answer 1 (score: -1)

If you just want to issue the same request again, you can do something like this:

import requests
from bs4 import BeautifulSoup

def find_data(url):
    found_data = False
    while not found_data:
        r = requests.get(url)
        soup = BeautifulSoup(r.text, "lxml")
        try:
            id = soup.find('a', class_="class").get('id')
            found_data = True
        except AttributeError:  # find() returned None
            pass
    return id

This leaves you at risk of an infinite loop if the data really is not there. You can do this to avoid the infinite loop:

import requests
from bs4 import BeautifulSoup

def find_data(url, attempts_before_fail=3):
    found_data = False
    while not found_data:
        r = requests.get(url)
        soup = BeautifulSoup(r.text, "lxml")
        try:
            id = soup.find('a', class_="class").get('id')
            found_data = True
        except AttributeError:  # find() returned None
            attempts_before_fail -= 1
            if attempts_before_fail == 0:
                raise ValueError("couldn't find data after all.")
    return id
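The try/except approach works because find() returns None when no tag matches, so chaining .get('id') onto it raises AttributeError. A minimal sketch on static HTML (the markup is made up for illustration, and html.parser is used so it runs without lxml):

```python
from bs4 import BeautifulSoup

html = '<p>no matching anchor here</p>'
soup = BeautifulSoup(html, "html.parser")

try:
    # find() returns None here, so .get('id') raises AttributeError.
    id_value = soup.find('a', class_="class").get('id')
except AttributeError:
    id_value = None

print(id_value)
```

Note that catching AttributeError specifically (rather than a bare except) avoids silently swallowing unrelated errors such as network failures from requests.get().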