Question

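My main goal is to visit this particular website, click each product, allow enough time to scrape the data from the clicked product, then go back and click another product on the page, until every product has been clicked and scraped (I have not included the scraping code).

My code opens Chrome, navigates to the site I want, and builds a list of links to click, located by class_name. This is the part I'm stuck on: I believe I need a for loop that iterates over the list of links, clicking each one and then returning to the original page. However, I can't work out why this doesn't work.

Here is my code: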

import csv
import time
from selenium import webdriver
import selenium.webdriver.chrome.service as service
import requests
from bs4 import BeautifulSoup


url = "https://www.vatainc.com/infusion/adult-infusion.html?limit=all"
service = service.Service('path to chromedriver')
service.start()
capabilities = {'chrome.binary': 'path to chrome'}
driver = webdriver.Remote(service.service_url, capabilities)
driver.get(url)
time.sleep(2)
links = driver.find_elements_by_class_name('product-name')


for link in links:
    link.click()
    driver.back()
    link.click()

Answer 1

I have a different approach to your problem.

When I tested your code, it behaved strangely. Using XPath fixed all the problems I ran into.

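# `driver` is the WebDriver session created in the question's code.
from selenium.webdriver.common.by import By

url = "https://www.vatainc.com/infusion/adult-infusion.html?limit=all"
driver.get(url)
# Read every product URL up front, as plain strings.
# find_elements(By.XPATH, ...) replaces find_elements_by_xpath, which newer Selenium releases removed.
links = [x.get_attribute('href') for x in driver.find_elements(By.XPATH, "//*[contains(@class, 'product-name')]/a")]
htmls = []
for link in links:
    # Open each product page directly and keep its HTML for scraping.
    driver.get(link)
    htmls.append(driver.page_source)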

Rather than navigating back and forth, I saved all the links (in a list called links) and iterated over that list.
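The same idea can be wrapped in a small reusable function. This is only a sketch under a few assumptions: the function name `scrape_product_pages`, the `pause` parameter, and the de-duplication step are mine and not part of the original answer. It also shows why the click-and-back loop tends to fail: element references found before a navigation generally go stale once the page changes, whereas href strings stay valid.

```python
import time

# Same value as selenium.webdriver.common.by.By.XPATH; written as a plain
# string here so the function itself does not need to import selenium.
XPATH = "xpath"


def scrape_product_pages(driver, listing_url, link_xpath, pause=1.0):
    """Visit every link matched by link_xpath and return {url: page HTML}.

    Hypothetical helper: `driver` is any started Selenium WebDriver.
    """
    driver.get(listing_url)

    # Convert the elements to href strings right away. The elements become
    # stale after the first navigation, but the strings remain usable.
    anchors = driver.find_elements(XPATH, link_xpath)
    hrefs = [a.get_attribute("href") for a in anchors]

    # Drop missing hrefs and duplicates while keeping the page order.
    hrefs = [h for h in dict.fromkeys(hrefs) if h]

    pages = {}
    for href in hrefs:
        driver.get(href)
        time.sleep(pause)  # crude wait so the product page can finish loading
        pages[href] = driver.page_source
    return pages
```

With the driver from the question, a call might look like `scrape_product_pages(driver, url, "//*[contains(@class, 'product-name')]/a")`, and each returned HTML string can then be passed to BeautifulSoup for the actual scraping. The fixed `time.sleep` is the simplest possible wait; an explicit wait on a specific element would probably be more reliable.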

Clicking multiple items on one page using Selenium
