Question

Abonnenten = ["https://www.instagram.com/therock/",
          "https://www.instagram.com/selenagomez/",
          "https://www.instagram.com/wizkhalifa/",
          "https://www.instagram.com/kanyewest/",
          "https://www.instagram.com/lilmosey/"]

Here I have a few Instagram users and their URLs, and then I run a for loop over them.

for i in range(len(Abonnenten)):
    driver.implicitly_wait(5) #i made a wait so my browser can catch up
    driver.get(Abonnenten[i]) #that is what i thought would be correct
    # get the text from their instagram bio
    wait = WebDriverWait(driver, 10)
    bio = wait.until(EC.presence_of_element_located((By.XPATH, "//div[@class='-vDIg']/span"))).text

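When I start the program, it runs fine until it reaches this part of the code. It opens the first URL, waits about 3 seconds, then loads the second URL and stays on it. Then I get this error: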

"Traceback (most recent call last):
  File "C:/Users/xxx/PycharmProjects/Website_Instagram_Browser_Scrap/main.py", line 58, in <module>
    bio = wait.until(EC.presence_of_element_located((By.XPATH, "//div[@class='-vDIg']/span"))).text
  File "C:\Users\xxx\PycharmProjects\Website_Instagram_Browser_Scrap\venv\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: "

I thought adding a wait above the "bio = ..." line would help, but it changed nothing.

Answer 1

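Looking at your URLs, the error appears to occur on Wiz Khalifa's Instagram (https://www.instagram.com/wizkhalifa/), because he does not have a bio. That causes the TimeoutException: with no bio, the element we are searching for simply does not exist, so the wait runs out. We can add a check for whether the user has a bio, and if they do not, just move on to the next URL: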

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# move implicit wait outside of loop, we only need to set it once
driver.implicitly_wait(5)

for url in Abonnenten:
    driver.get(url)  # load the next profile page

    # get the text from their instagram bio
    try:
        wait = WebDriverWait(driver, 10)
        bio = wait.until(EC.presence_of_element_located((By.XPATH, "//div[@class='-vDIg']/span"))).text

    # case: the user does not have a bio, so just move on to the next one
    except TimeoutException:
        continue

I also moved your implicitly_wait statement outside the for loop, because we do not need to set it more than once.
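If you want to keep track of which profiles had no bio instead of silently skipping them, the same try/except pattern can fill a dictionary. The following is only a minimal sketch under some assumptions: collect_bios and fetch_bio are hypothetical names, and the TimeoutException class defined here is a stand-in for selenium.common.exceptions.TimeoutException so that the example runs without a browser.

```python
class TimeoutException(Exception):
    """Stand-in (assumption) for selenium.common.exceptions.TimeoutException,
    defined locally so this sketch runs without Selenium or a browser."""


def collect_bios(urls, fetch_bio):
    """Return {url: bio text or None}; None marks a profile with no bio found.

    fetch_bio is a hypothetical callable that loads the URL and returns the
    bio text, raising TimeoutException when the bio element never appears.
    """
    bios = {}
    for url in urls:
        try:
            bios[url] = fetch_bio(url)
        except TimeoutException:
            # The bio element did not show up in time: record it and move on.
            bios[url] = None
    return bios


def fake_fetch_bio(url):
    """Fake fetcher for demonstration: pretends one profile has no bio."""
    if "wizkhalifa" in url:
        raise TimeoutException()
    return "bio of " + url.rstrip("/").rsplit("/", 1)[-1]


urls = ["https://www.instagram.com/therock/",
        "https://www.instagram.com/wizkhalifa/",
        "https://www.instagram.com/lilmosey/"]
result = collect_bios(urls, fake_fetch_bio)
print(result)
```

With a real driver, fetch_bio would wrap the driver.get(...) and wait.until(...) calls from the loop above, and you would import the real TimeoutException from selenium.common.exceptions instead of defining a stand-in.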

TimeoutException when looping over a list of URLs
