Question

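I wrote a Python program that uses XPath and a web driver to scrape two different links. I want to get the price, which appears under two different IDs. The program runs against two different pages, which is why the price has two IDs. I used try and except, but it doesn't work. I've attached the code. Right now I get IndexError: list index out of range. I'd appreciate any help. Feel free to ask me any questions.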

from selenium import webdriver
import csv

# set the proxies to hide actual IP

proxies = {
    'http': 'http://218.50.2.102:8080',
    'https': 'http://185.93.3.123:8080',
}

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=%s' % proxies)

driver = webdriver.Chrome(executable_path="C:\\Users\Andrei\Downloads\chromedriver_win32\chromedriver.exe",
                          chrome_options=chrome_options)
header = ['Product title', 'Product price']

with open('csv/products.csv', "w") as output:
    writer = csv.writer(output)
    writer.writerow(header)
links = ['https://www.amazon.com/Windsor-Glider-Ottoman-White-Cushion/dp/B017XRDV5S/ref=sr_1_1?s=home-garden&ie=UTF8&qid=1520265105&sr=1-1&keywords=-gggg&th=1',
         'https://www.amazon.com/Instant-Pot-Multi-Use-Programmable-Packaging/dp/B00FLYWNYQ/ref=sr_1_1?s=home-garden&ie=UTF8&qid=1520264922&sr=1-1&keywords=-gggh']
for i in range(len(links)):
    driver.get(links[i])
    product_title = driver.find_elements_by_xpath('//*[@id="productTitle"][1]')
    prod_title = [x.text for x in product_title]
    try:
        product_price = driver.find_elements_by_xpath('//*[@id="priceblock_ourprice"][1]')
        prod_price = [x.text for x in product_price]
    except:
        print('no price v1')
    try:
        product_price = driver.find_elements_by_xpath('//*[@id="_price"][1]')
        prod_price = [x.text for x in product_price]
    except:
        print('no price v2')
        
    csvfile = 'csv/products.csv'

    data = [prod_title[0], prod_price[0]]

    with open(csvfile, "a", newline="") as output:
        writer = csv.writer(output)
        writer.writerow(data)

Answer 1

OK, I think I've found your problem.

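You are searching for a list of elements with find_elements_by_xpath. In that case Selenium does not raise an exception when nothing is found; it returns an empty list. So the assignment prod_price = [x.text for x in product_price] succeeds in both try..except blocks, and neither except branch ever runs. In the end you can be left with an empty prod_price, and prod_price[0] then raises the IndexError.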
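To make the failure mode concrete, here is a minimal sketch that runs without a browser. `find_elements_stub` is a hypothetical stand-in for `driver.find_elements_by_xpath` on a page where the ID is missing; it is not part of Selenium.

```python
# Hypothetical stand-in for driver.find_elements_by_xpath when nothing
# matches: an empty list comes back and no exception is raised.
def find_elements_stub(xpath):
    return []

product_price = find_elements_stub('//*[@id="priceblock_ourprice"][1]')

try:
    # A comprehension over an empty list is perfectly valid,
    # so the except branch below is never reached.
    prod_price = [x.text for x in product_price]
    reached_except = False
except Exception:
    reached_except = True

# The error only surfaces later, when the empty list is indexed.
try:
    first_price = prod_price[0]
    index_error = None
except IndexError as exc:
    index_error = str(exc)

print(reached_except)  # False
print(index_error)     # list index out of range
```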

You need to check whether prod_price is empty, and only then search with the alternative XPath:

prod_price = [x.text for x in product_price]
if not prod_price:
    print('no price v')
    product_price = driver.find_elements_by_xpath(......
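The empty-list check can be generalized to any number of candidate XPaths. The following is only a sketch: `first_text` and `PRICE_XPATHS` are hypothetical names, and the two IDs are the ones from the question. The lookup function is passed in as an argument so the helper can be tried without a browser.

```python
from typing import Callable, Iterable, List, Optional

# Candidate locations of the price, tried in order (IDs taken from the question).
PRICE_XPATHS = [
    '(//*[@id="priceblock_ourprice"])[1]',
    '(//*[@id="_price"])[1]',
]

def first_text(find_elements: Callable[[str], List],
               xpaths: Iterable[str]) -> Optional[str]:
    """Return the text of the first element found by the first XPath that matches."""
    for xpath in xpaths:
        elements = find_elements(xpath)
        if elements:  # an empty list means "not found", so try the next XPath
            return elements[0].text
    return None  # nothing matched; the caller decides what to write to the CSV

# Assumed usage with the driver from the question:
#   price = first_text(driver.find_elements_by_xpath, PRICE_XPATHS)
#   data = [title, price if price is not None else '']
```

Returning None instead of indexing an empty list keeps the loop running when a page has neither ID.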

Or use find_element_by_xpath, which searches for a single element and raises an exception when nothing is found:

try:
    product_price = driver.find_element_by_xpath('(//*[@id="priceblock_ourprice"])[1]')
    prod_price = product_price.text
except:
.........
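The exception-based variant might be fleshed out as below. This is a sketch under assumptions: `text_or_fallback` is a hypothetical helper, and the exception type is passed in so the logic can be exercised without Selenium. With Selenium, the exception raised by a failed single-element lookup is `selenium.common.exceptions.NoSuchElementException`.

```python
def text_or_fallback(find_element, xpaths, not_found):
    """Try each XPath with a single-element lookup.

    Returns the text of the first element found, or None if every
    lookup raises the `not_found` exception type.
    """
    for xpath in xpaths:
        try:
            return find_element(xpath).text
        except not_found:
            continue  # this ID is not on the page; try the next one
    return None

# Assumed usage with the driver from the question:
#   from selenium.common.exceptions import NoSuchElementException
#   price = text_or_fallback(
#       driver.find_element_by_xpath,
#       ['(//*[@id="priceblock_ourprice"])[1]', '(//*[@id="_price"])[1]'],
#       NoSuchElementException,
#   )
```

Catching only the "not found" exception, rather than using a bare except, lets unrelated errors such as a crashed browser session still surface. Newer Selenium releases spell the lookup as driver.find_element(By.XPATH, ...).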

P.S. You can iterate over the links in a more Pythonic way:

for link in links:
    driver.get(link)
