PYTHON: How do I fix a loop that stops early?

Date: 2019-11-01 07:30:40

Tags: python selenium

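I'm trying to extract more than 24 product names from multiple pages, but my for loop returns only 1 product name. I need the script to go to an individual product page, extract the product name, return to the page that lists the product URLs, and then repeat the same steps.

The script below returns the first product name and then stops.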

from selenium import webdriver
import time

CHROMEDRIVER_PATH = '/Users/reezalaq/PycharmProjects/wholesale/driver/chromedriver'
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
driver = webdriver.Chrome(CHROMEDRIVER_PATH, options=chrome_options)
chrome_options.accept_untrusted_certs = True
chrome_options.assume_untrusted_cert_issuer = True
chrome_options.headless = False

home = "https://www.blibli.com/c/4/beli--mukena/MU-1000008/54912?page=1&start=0&category=MU-1000008&sort=7&intent=false"
driver.get(home)
for number in range(1, 24):
    elem = driver.find_elements_by_xpath('//*[@id="catalogProductListContentDiv"]/div[3]/div[' + str(number) + ']/div/div/a')
    for link in elem:
        producturl = link.get_attribute("href")
        time.sleep(24)
        driver.get(producturl)
        getproductname = driver.find_element_by_class_name("product__name-text")
        print(getproductname.text)
driver.close()

1 Answer:

Answer 0: (score: 1)

I did this by opening a new tab for each product and then switching to that tab.

First, wait until all the elements are visible.

You can use WebDriverWait instead of time.sleep(24).
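To illustrate why an explicit wait tends to be preferable to a fixed sleep, here is a minimal conceptual sketch of a polling wait. It is not Selenium's implementation; `wait_until`, `ready`, and the timing values are hypothetical names chosen for this example. The point is that the wait returns as soon as the condition holds, instead of always pausing for the full duration.

```python
import time


def wait_until(condition, timeout=20.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Hypothetical condition that becomes truthy on the third poll.
attempts = []


def ready():
    attempts.append(1)
    return "loaded" if len(attempts) >= 3 else None


print(wait_until(ready, timeout=5, poll=0.01))
```

With a fixed `time.sleep(24)` every product costs 24 seconds even when the page is ready almost immediately; a polling wait only costs as long as the page actually takes, up to the timeout.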

This is the XPath for the elements you want:

//div[@class="product__item"]/a

You can try the following code:

driver.get('https://www.blibli.com/c/4/beli--mukena/MU-1000008/54912?page=1&start=0&category=MU-1000008&sort=7&intent=false')

elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, '//div[@class="product__item"]/a')))
for element in elements:
    url = element.get_attribute('href')
    # open a new tab with the product URL
    driver.execute_script("window.open('" + url + "');")
    # switch to the new tab
    driver.switch_to.window(driver.window_handles[1])
    getproductname = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CLASS_NAME, 'product__name-text')))
    print(getproductname.text)
    # close the current (product) tab
    driver.close()
    # switch back to the first tab
    driver.switch_to.window(driver.window_handles[0])
driver.quit()

With the following imports:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
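
As an alternative to opening tabs, the following is a rough sketch (not part of the answer above) that stays in one tab: it copies every href string out of the listing page first and only then navigates. Once driver.get() loads a product page, the listing elements are no longer in the current document, which is why the links need to be read up front. The helper names `collect_hrefs`, `scrape_names`, `read_name`, and `main` are hypothetical, and `webdriver.Chrome()` without arguments assumes Selenium can locate chromedriver on its own.

```python
def collect_hrefs(link_elements):
    """Copy the href strings out while the listing page is still loaded."""
    hrefs = []
    for element in link_elements:
        href = element.get_attribute("href")
        if href:  # skip anchors without an href
            hrefs.append(href)
    return hrefs


def scrape_names(driver, hrefs, read_name):
    """Visit each URL in the same tab and collect read_name(driver)."""
    names = []
    for href in hrefs:
        driver.get(href)
        names.append(read_name(driver))
    return names


def main():
    # Selenium is imported here so the helpers above work without it.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    # Assumption: chromedriver is on PATH or can be found by Selenium.
    driver = webdriver.Chrome()
    try:
        driver.get("https://www.blibli.com/c/4/beli--mukena/MU-1000008/54912?page=1&start=0&category=MU-1000008&sort=7&intent=false")
        links = WebDriverWait(driver, 20).until(
            EC.visibility_of_all_elements_located(
                (By.XPATH, '//div[@class="product__item"]/a')
            )
        )
        hrefs = collect_hrefs(links)

        def read_name(d):
            # Wait for the product name on the page that was just loaded.
            element = WebDriverWait(d, 20).until(
                EC.visibility_of_element_located(
                    (By.CLASS_NAME, "product__name-text")
                )
            )
            return element.text

        for name in scrape_names(driver, hrefs, read_name):
            print(name)
    finally:
        driver.quit()
```

Calling main() would run the sketch against the real site. Keeping the href collection and the visiting loop in separate functions means the navigation logic can be checked with stand-in objects, without a browser.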