Python + selenium throws an error when clicking the last button

Time: 2017-07-20 14:46:55

Tags: python python-3.x selenium selenium-webdriver web-scraping

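I've written some code in python with selenium to parse names from a site. The site has a "Next" button that leads to its next page. I've tried to handle this so that my script runs flawlessly. However, I'm facing two issues at the moment: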

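  1. On execution, the scraper goes straight to the next page and parses from there, leaving the starting page unscraped, and I can't work out how to fix the logic.
  2. It throws an error when it fails to find the greyed-out next button on the last page.

This is what I've tried so far: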

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.wait import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    
    driver = webdriver.Chrome()
    wait = WebDriverWait(driver, 10)
    
    driver.get("https://www.yellowpages.com/search?search_terms=pizza&geo_location_terms=San%20Francisco%2C%20CA&page=10")
    
    while True:
        wait.until(EC.visibility_of_element_located((By.XPATH, '//li/a[contains(@class,"next")]')))
    
        item = driver.find_element_by_xpath('//li/a[contains(@class,"next")]')
        if not driver.find_element_by_xpath('//li/a[contains(@class,"next")]'):
            break
        item.click()
    
        wait.until(EC.visibility_of_element_located((By.XPATH, '//div[@class="info"]')))
    
        for items in driver.find_elements_by_xpath('//div[@class="info"]'):
            name = items.find_element_by_xpath('.//span[@itemprop="name"]').text
            print(name)
    
    driver.quit()
    

Here is the element for the greyed-out next button:

    <div class="pagination"><p><span>Showing</span>361-388
    of 388<span>results</span></p><ul><li><a href="/search?search_terms=pizza&amp;geo_location_terms=San%20Francisco%2C%20CA&amp;page=12" data-page="12" data-analytics="{&quot;click_id&quot;:132}" data-remote="true" class="prev ajax-page" data-impressed="1">Previous</a></li><li><a href="/search?search_terms=pizza&amp;geo_location_terms=San%20Francisco%2C%20CA&amp;page=9" data-page="9" data-analytics="{&quot;click_id&quot;:132,&quot;module&quot;:1,&quot;listing_page&quot;:9}" data-remote="true" data-impressed="1">9</a></li><li><a href="/search?search_terms=pizza&amp;geo_location_terms=San%20Francisco%2C%20CA&amp;page=10" data-page="10" data-analytics="{&quot;click_id&quot;:132,&quot;module&quot;:1,&quot;listing_page&quot;:10}" data-remote="true" data-impressed="1">10</a></li><li><a href="/search?search_terms=pizza&amp;geo_location_terms=San%20Francisco%2C%20CA&amp;page=11" data-page="11" data-analytics="{&quot;click_id&quot;:132,&quot;module&quot;:1,&quot;listing_page&quot;:11}" data-remote="true" data-impressed="1">11</a></li><li><a href="/search?search_terms=pizza&amp;geo_location_terms=San%20Francisco%2C%20CA&amp;page=12" data-page="12" data-analytics="{&quot;click_id&quot;:132,&quot;module&quot;:1,&quot;listing_page&quot;:12}" data-remote="true" data-impressed="1">12</a></li><li><span class="disabled">13</span></li></ul></div>
    

1 Answer:

Answer 0 (score: 1)

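Apparently you should swap the order of the two steps: scrape the page first, then click the 'Next' button. You can also use try/except to keep the code from breaking on the last page: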

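    from selenium.common.exceptions import NoSuchElementException

    # Reuses driver, wait, By and EC from the setup in the question
    while True:
        # Scrape the required elements first
        items = wait.until(EC.visibility_of_all_elements_located((By.XPATH, '//div[@class="info"]')))
        for item in items:
            name = item.find_element(By.XPATH, './/span[@itemprop="name"]').text
            print(name)
        # ...and then try to click the 'Next' button
        try:
            driver.find_element(By.XPATH, '//li/a[contains(@class,"next")]').click()
        except NoSuchElementException:
            # No 'Next' link on the last page: stop paginating
            break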
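
To see why the order matters without opening a browser, here is a minimal sketch of the same "scrape first, then try to advance" loop run against an in-memory stand-in for the site. `FakeSite`, `NoNextPage`, `scrape_all_pages` and the sample names are all hypothetical and only there for illustration; `NoNextPage` plays the role that `NoSuchElementException` plays in the Selenium version.

```python
from typing import Callable, Iterator, List


class NoNextPage(Exception):
    """Hypothetical stand-in for selenium's NoSuchElementException."""


def scrape_all_pages(read_names: Callable[[], List[str]],
                     go_next: Callable[[], None]) -> Iterator[str]:
    """Yield names page by page: scrape first, then try to advance."""
    while True:
        # Scrape whatever page is currently loaded, including the very first one.
        yield from read_names()
        try:
            go_next()
        except NoNextPage:
            # No usable 'Next' link: this was the last page.
            break


class FakeSite:
    """Tiny in-memory 'site' (an assumption) so the loop runs without a browser."""

    def __init__(self, pages: List[List[str]]) -> None:
        self.pages = pages
        self.index = 0

    def read_names(self) -> List[str]:
        return list(self.pages[self.index])

    def go_next(self) -> None:
        if self.index + 1 >= len(self.pages):
            raise NoNextPage("next button is missing or disabled")
        self.index += 1


site = FakeSite([["Pizza A", "Pizza B"], ["Pizza C"], ["Pizza D"]])
names = list(scrape_all_pages(site.read_names, site.go_next))
print(names)
```

The starting page is included because reading happens before the first attempt to advance, and the loop ends cleanly because only the "no next page" error is caught. With a real driver, `read_names` would wrap the XPath lookup for the names and `go_next` would wrap the click on the 'Next' link.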