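I am building a crawler that goes through a large number of URLs and checks several conditions for each one. If one of the conditions passes, I want to write a result and move straight on to the next URL, and so on until the list is finished.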
from time import sleep  # needed for the sleep() calls below

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument('--disable-gpu')
driver = webdriver.Chrome(chrome_options=options)

urls = ['https://www.google.com/', 'https://stackoverflow.com/']

for url in urls:
    driver.get(url)
    image_name = url.split(".")[1] + ".png"
    driver.save_screenshot(image_name)

    # First check: performance entries right after the page loads.
    performance_data = driver.execute_script('return window.performance.getEntries();')
    for single_data in performance_data:
        if "nav" in single_data["name"]:
            file.write(url + "adserv_1")
            break
    # Over here the loop should break and move on to the next URL
    # rather than continuing with the code below?

    # Second check: scroll down step by step, then read the entries again.
    driver.execute_script("window.scrollTo(0,1000);")
    sleep(2)
    driver.execute_script("window.scrollTo(0,5000);")
    sleep(2)
    driver.execute_script("window.scrollTo(0,10000);")
    sleep(2)
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    performance_data = driver.execute_script('return window.performance.getEntries();')
    for single_data in performance_data:
        if "nav" in single_data['name']:
            results.write(url + "3")
If the first condition passes, shouldn't the code move on to the next URL in the array?
Answer 0 (score: 1)
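Your "break" only exits the "for single_data" loop, not the "for url" loop. After the inner loop ends, execution carries on with the scrolling code for the same URL.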
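One possible way to get the intended behavior is to fold the inner loop into a yes/no check and then use "continue" in the outer loop. The sketch below is only an illustration: has_match, classify_urls, the example.com-style URLs and the fake entry data are all hypothetical stand-ins, and in the real crawler the two callables would presumably wrap the driver.execute_script(...) calls from the question.

```python
def has_match(entries, needle="nav"):
    """Return True if any performance entry's name contains `needle`."""
    return any(needle in entry["name"] for entry in entries)


def classify_urls(urls, entries_on_load, entries_after_scroll):
    """Run the two checks per URL; skip the second one if the first passes.

    `entries_on_load` and `entries_after_scroll` are hypothetical callables
    that take a URL and return a list of dicts with a "name" key.
    """
    written = []
    for url in urls:
        if has_match(entries_on_load(url)):
            written.append(url + "adserv_1")
            continue  # jump to the next URL in the outer loop
        if has_match(entries_after_scroll(url)):
            written.append(url + "3")
    return written


# Fake data standing in for window.performance.getEntries() (assumption).
ON_LOAD = {
    "https://a.example/": [{"name": "https://a.example/nav.js"}],
    "https://b.example/": [{"name": "https://b.example/app.js"}],
}
AFTER_SCROLL = {
    "https://a.example/": [{"name": "https://a.example/nav.js"}],
    "https://b.example/": [{"name": "https://b.example/nav-lazy.js"}],
}

scrolled = []  # records which URLs reached the second check


def fake_after_scroll(url):
    scrolled.append(url)
    return AFTER_SCROLL[url]


written = classify_urls(list(ON_LOAD), ON_LOAD.get, fake_after_scroll)
print(written)
print(scrolled)
```

Here the first URL matches on load, so the scroll step never runs for it; only the second URL reaches the second check. A for/else block with "continue" after the loop, or a boolean flag set before the "break", would be alternative ways to achieve the same thing.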