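I've written a script in Python using Selenium to fetch the names and prices of different products from the redmart website. My goal is to click each of the 10 categories at the top of the main page and parse all the products on the page each one leads to. However, once a category is clicked the browser is on the newly opened page, so it has to return to the main page before it can click another of the 10 category links. My scraper clicks a link, goes to the target page, parses the data there, returns to the main page, and then clicks the same link again, repeating this over and over instead of moving on to the remaining categories. This is the script I tried: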
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://redmart.com/bakery")
wait = WebDriverWait(driver, 10)

while True:
    try:
        wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "li.image-facets-pill")))
        driver.find_element_by_css_selector('img.image-facets-pill-image').click()
    except:
        break

    for elems in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "li.productPreview"))):
        name = elems.find_element_by_css_selector('h4[title] a').text
        price = elems.find_element_by_css_selector('span[class^="ProductPrice__"]').text
        print(name, price)

    driver.back()

driver.quit()
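By the way, I think the try/except block in this script needs some tweaking to get the desired output.
Answer 0 (score: 1)
You can implement a simple counter that lets you iterate through the list of categories, as follows: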
counter = 0
while True:
    try:
        wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "li.image-facets-pill")))
        # Look the links up again on every pass and pick the next one by index
        driver.find_elements_by_css_selector('img.image-facets-pill-image')[counter].click()
        counter += 1
    except IndexError:
        # No link left at this index: every category has been visited
        break

    for elems in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "li.productPreview"))):
        name = elems.find_element_by_css_selector('h4[title] a').text
        price = elems.find_element_by_css_selector('span[class^="ProductPrice__"]').text
        print(name, price)

    driver.back()

driver.quit()
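If you are on a recent Selenium release, where as far as I know the find_element_by_* helpers are no longer available, the same counter idea can be written against find_elements(by, value). The helper below is only a rough sketch: the name scrape_categories and its parameters are my own, the explicit waits from the script above are left out for brevity, and "css selector" is the string that By.CSS_SELECTOR stands for.

```python
def scrape_categories(driver, category_css, product_css, parse_product):
    """Visit each category link by index and collect the parsed products.

    Hypothetical helper: `driver` is assumed to offer Selenium's
    find_elements(by, value) and back(). The links are looked up again on
    every pass because references taken before a navigation go stale.
    """
    results = []
    index = 0
    while True:
        # "css selector" is the value of By.CSS_SELECTOR
        links = driver.find_elements("css selector", category_css)
        if index >= len(links):
            break  # every category has been visited
        links[index].click()
        for product in driver.find_elements("css selector", product_css):
            results.append(parse_product(product))
        driver.back()  # return to the main page for the next category
        index += 1
    return results
```

With the selectors from the question, a call might look like scrape_categories(driver, 'img.image-facets-pill-image', 'li.productPreview', lambda el: el.text). Checking the index against the length of the list replaces the IndexError handling, so an unrelated error is not mistaken for the end of the loop.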