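I am currently trying to iterate over all the pages on this website:
When it reaches page 53 (the last page), it keeps looping even though there are no more pages. How can I make the loop stop? I noticed that class="disabled" appears on an element at that point.
This is my code so far: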
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.implicitly_wait(10)
driver.get('https://ephisahs.microsoftcrmportals.com/disclaimer/restaurantinspections/south-facilities/')
dfs = []
page_counter = 0
while True:
    wait = WebDriverWait(driver, 30)
    wait.until(EC.visibility_of_all_elements_located((By.XPATH, "//tr[@data-name]")))
    cards = driver.find_elements_by_xpath("//tr[@data-name]")
    facilities = []
    for card in cards:
        name = card.find_element_by_xpath(".//td[@data-th='Unit Name']").text
        street1 = card.find_element_by_xpath(".//td[@data-th='Site Street 1']").text
        street2 = card.find_element_by_xpath(".//td[@data-th='Site Street 2']").text
        site_city = card.find_element_by_xpath(".//td[@data-th='Site City']").text
        site_prov = card.find_element_by_xpath(".//td[@data-th='Site Province/State']").text
        site_code = card.find_element_by_xpath(".//td[@data-th='Site Postal Code/Zip Code']").text
        site_fac = card.find_element_by_xpath(".//td[@data-th='Facility Category']").text
        site_inspection = card.find_element_by_xpath(".//td[@data-th='Inspections Completed']").text
        ref_link = card.find_element_by_xpath(".//td//a").get_attribute("href")
        facilities.append([name, street1, street2, site_city, site_prov, site_code, site_fac, site_inspection, ref_link])
    df = pd.DataFrame(facilities)
    dfs.append(df)
    print(page_counter)
    page_counter += 1
    try:
        wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[aria-label='Next page']"))).click()
    except:
        break
driver.close()
driver.quit()
Answer 0 (score: 1)
You can simply check the class of the li element, as described in the documentation:
is_disabled = "disabled" in element.get_attribute("class")
if is_disabled:
    break
From the documentation:
is_active = "active" in target_element.get_attribute("class")
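Applied to the loop in the question, the check could look roughly like the sketch below. This is only an illustration under some assumptions: `has_class` and `scrape_all_pages` are hypothetical helper names, `scrape_page` stands for whatever per-page scraping you already do, and the XPath assumes the "Next page" link sits directly inside the li that receives the `disabled` class, which should be verified against the page's actual markup.

```python
def has_class(element, name):
    """Return True if `name` is one of the element's space-separated classes."""
    class_attr = element.get_attribute("class") or ""
    return name in class_attr.split()


# Assumed locator: the <li> that wraps the question's "Next page" link.
NEXT_LI_XPATH = "//a[@aria-label='Next page']/parent::li"


def scrape_all_pages(driver, scrape_page, next_li_xpath=NEXT_LI_XPATH):
    """Run scrape_page(driver) on every page; return the number of pages visited."""
    pages = 0
    while True:
        scrape_page(driver)  # e.g. wait for the rows and collect them
        pages += 1
        # "xpath" is the string value of selenium's By.XPATH constant.
        next_li = driver.find_element("xpath", next_li_xpath)
        if has_class(next_li, "disabled"):
            break  # last page: the Next item is disabled
        next_li.find_element("xpath", "./a").click()
    return pages
```

Splitting the class attribute into tokens avoids a false match on a class such as `not-disabled`, which a plain substring test would accept. The stop condition is checked before clicking, so the loop no longer depends on the click raising an exception.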