Unable to click the next-page button labeled ">"

Time: 2017-11-05 10:58:08

Tags: python python-3.x selenium selenium-webdriver web-scraping

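I have written some Python code that uses Selenium to scrape phone numbers from a website. To find listings for any state, you enter a city name in the corresponding search box and press the search button. I did this correctly, using "Orlando" as the city name. After the search button is pressed, a list of documents appears, spread across several pages through pagination. My script handles all of this except clicking the next-page button. How can I modify the script so that it keeps clicking the next-page button until there is none left? Thanks in advance.

The link I am using: the link

The script I tried: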

import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
driver.get("above link")

wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "input[name='city']"))).send_keys("Orlando")
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, ".btn-primary"))).click()
time.sleep(5)

while True:

    try:

        link = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".pagination a")))

        if link.text == ">":

            print(link.text)
            link.click()
            time.sleep(5)

    except:

        break

driver.quit()

The next-page element sits in this markup:

<ul class="pagination">
    <li class="active"><span>1</span></li>
        <li><a href="javascript:void(0);" onclick="(function(tgt){var rt={};rt=$.parseJSON(unescape('%7B%22fname%22%3A%22%22%2C%22lname%22%3A%22%22%2C%22city%22%3A%22Naples%22%2C%22tcustom11%22%3A%22%22%2C%22icustom12%22%3A%22%22%2C%22uat_1%22%3A%22%22%2C%22icustom43%22%3A%220%22%2C%22near%22%3A%22%22%2C%22dist%22%3A%2210%22%2C%22id%22%3A%2258%22%2C%22lat%22%3A%22%22%2C%22lon%22%3A%22%22%2C%22co%22%3A%22%22%7D'));rt.p=2;soc.ajax('cp','ld','ajax',rt);})(this);return false;">2</a></li>
        <li><a href="javascript:void(0);" onclick="(function(tgt){var rt={};rt=$.parseJSON(unescape('%7B%22fname%22%3A%22%22%2C%22lname%22%3A%22%22%2C%22city%22%3A%22Naples%22%2C%22tcustom11%22%3A%22%22%2C%22icustom12%22%3A%22%22%2C%22uat_1%22%3A%22%22%2C%22icustom43%22%3A%220%22%2C%22near%22%3A%22%22%2C%22dist%22%3A%2210%22%2C%22id%22%3A%2258%22%2C%22lat%22%3A%22%22%2C%22lon%22%3A%22%22%2C%22co%22%3A%22%22%7D'));rt.p=2;soc.ajax('cp','ld','ajax',rt);})(this);return false;">&gt;</a></li>
    <li><a href="javascript:void(0);" onclick="(function(tgt){var rt={};rt=$.parseJSON(unescape('%7B%22fname%22%3A%22%22%2C%22lname%22%3A%22%22%2C%22city%22%3A%22Naples%22%2C%22tcustom11%22%3A%22%22%2C%22icustom12%22%3A%22%22%2C%22uat_1%22%3A%22%22%2C%22icustom43%22%3A%220%22%2C%22near%22%3A%22%22%2C%22dist%22%3A%2210%22%2C%22id%22%3A%2258%22%2C%22lat%22%3A%22%22%2C%22lon%22%3A%22%22%2C%22co%22%3A%22%22%7D'));rt.p=2;soc.ajax('cp','ld','ajax',rt);})(this);return false;">»</a></li>
</ul>
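
One likely reason the loop in the question never advances: a single-element lookup such as ".pagination a" returns the first match in document order, and in the markup above that is the "2" link, not ">". The sketch below checks this with the standard-library HTML parser instead of a browser. The markup is a simplified copy (onclick handlers omitted), and the AnchorText class is a hypothetical helper written only for this illustration.

```python
from html.parser import HTMLParser

# Simplified copy of the pagination markup from the question
# (the long onclick handlers are omitted for readability).
PAGINATION = """
<ul class="pagination">
  <li class="active"><span>1</span></li>
  <li><a href="javascript:void(0);">2</a></li>
  <li><a href="javascript:void(0);">&gt;</a></li>
  <li><a href="javascript:void(0);">»</a></li>
</ul>
"""


class AnchorText(HTMLParser):
    """Collect the text of every <a> element, in document order."""

    def __init__(self):
        super().__init__()  # default convert_charrefs=True decodes &gt; to >
        self.in_anchor = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_anchor = True
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_anchor = False

    def handle_data(self, data):
        if self.in_anchor:
            self.texts[-1] += data


parser = AnchorText()
parser.feed(PAGINATION)
parser.close()
anchors = parser.texts

print(anchors)            # text of each pagination link, in order
print(anchors[0] == ">")  # is the first link the next-page button?
```

Because the first anchor is "2", the `if link.text == ">"` test in the question is never true, no exception is raised, and the `while True` loop keeps spinning without clicking anything. Locating the link by its text, as the answers do, avoids this.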

The next-page button looks like this:

">"

3 answers:

Answer 0 (score: 1)

You can do it like this:

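import time

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Assumes `driver` was created and the search page opened as in the question.
wait = WebDriverWait(driver, 10)

wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "input[name='city']"))).send_keys("Orlando")
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, ".btn-primary"))).click()

while True:
    try:
        # Match only the pagination anchor whose text contains ">".
        link = wait.until(EC.element_to_be_clickable((By.XPATH, "//div[@class='text-center']/ul[@class='pagination']/li/a[contains(text(), '>')]")))
        link.click()
        time.sleep(5)  # crude pause while the next page of results loads
    except TimeoutException:
        # No clickable ">" link appeared within 10 seconds: last page reached.
        print("finish!")
        break

driver.quit()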

Answer 1 (score: 1)

This should work:

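import time

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
driver.get("http://www.facdl.org/page/find-a-lawyer")

wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "input[name='city']"))).send_keys("Orlando")
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, ".btn-primary"))).click()
time.sleep(2)

while True:
    try:
        link = wait.until(EC.element_to_be_clickable((By.LINK_TEXT, ">")))
        link.click()
        # Block until the clicked link has been detached from the DOM.
        wait.until(EC.staleness_of(link))
    except TimeoutException:
        # Either no ">" link was found or the page did not refresh in time.
        break

driver.quit()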

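I added time.sleep(2) to wait for the page to finish scrolling and settle. In addition, wait.until(EC.staleness_of(link)) waits until the clicked link is no longer attached to the DOM, which is when the new page has rendered and a new instance of the button has been created.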
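
The click-then-wait-for-staleness pattern described above can be separated from the browser so that the loop logic is easy to check on its own. The following is only a sketch under stated assumptions: click_through_pages, selenium_callbacks, FakePager and the max_pages safety cap are hypothetical names introduced here, not part of Selenium or of the original answer. The Selenium calls inside selenium_callbacks are the same ones used in the answer.

```python
def click_through_pages(find_next, wait_for_reload, max_pages=1000):
    """Click the next-page control until it no longer appears.

    find_next: callable returning a clickable element, or None if there is none.
    wait_for_reload: callable that receives the clicked element and blocks
        until the page has refreshed.
    Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_pages:  # safety cap (an assumption) against endless loops
        link = find_next()
        if link is None:
            break
        link.click()
        wait_for_reload(link)
        clicks += 1
    return clicks


def selenium_callbacks(driver, timeout=10):
    """Build both callables from Selenium waits (hypothetical helper)."""
    # Imported lazily so the loop above can be exercised without a browser.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)

    def find_next():
        try:
            return wait.until(EC.element_to_be_clickable((By.LINK_TEXT, ">")))
        except TimeoutException:
            return None  # no ">" link showed up: treat as the last page

    def wait_for_reload(old_link):
        # Raises TimeoutException if the old link never leaves the DOM.
        wait.until(EC.staleness_of(old_link))

    return find_next, wait_for_reload


class FakePager:
    """Stand-in for a paginated site: offers a next link until the last page."""

    def __init__(self, pages):
        self.page = 1
        self.pages = pages

    def click(self):
        self.page += 1

    def find_next(self):
        return self if self.page < self.pages else None


# Exercise the loop with the stand-in instead of a real browser.
pager = FakePager(pages=4)
clicks = click_through_pages(pager.find_next, lambda link: None)
print(clicks, pager.page)
```

With a real browser, the equivalent call would be click_through_pages(*selenium_callbacks(driver)) after the search has been submitted. Waiting for staleness rather than sleeping a fixed time ties each iteration to the actual page refresh, at the cost of a timeout error if the site updates the list without replacing the link element.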

Answer 2 (score: 1)

Please try this code:

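import time

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
driver.get("http://www.facdl.org/page/find-a-lawyer")

wait.until(EC.presence_of_element_located((By.NAME, "city"))).send_keys("Orlando")
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, ".btn.btn-primary"))).click()

while True:
    try:
        # LINK_TEXT matches the anchor whose visible text is exactly ">".
        link = wait.until(EC.element_to_be_clickable((By.LINK_TEXT, ">")))
        link.click()
        time.sleep(2)  # give the next page of results time to load
    except TimeoutException:
        # No clickable ">" link within 10 seconds: last page reached.
        break

driver.quit()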