Selenium NoSuchElementException after using Wait and checking page_source

Asked: 2016-09-02 01:05:29

Tags: python selenium exception phantomjs

I am running this simple scraper. I am trying to scrape the search results for the letter q from sam.gov:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
import re
import sys  

reload(sys)  
sys.setdefaultencoding('utf8')
letter = 'q'

driver = webdriver.PhantomJS()
driver.set_window_size(1120, 550)

driver.get("http://sam.gov")

#element = WebDriverWait(driver, 10).until(
#                EC.presence_of_element_located((By.ID, "pbG220e071f_2de75f_2d417d_2d9c61_2d027d324c8fec:_viewRoot:j_id12:search1"))
#            )
#element.click()
driver.find_element_by_id('pbG220e071f_2de75f_2d417d_2d9c61_2d027d324c8fec:_viewRoot:j_id12:search1').click()

driver.find_element_by_id(letter).send_keys(letter)
driver.find_element_by_id('RegSearchButton').click()


def crawl():
    bsObj = BeautifulSoup(driver.page_source, "html.parser")
    tableList = bsObj.find_all("table", {"class":"width100 menu_header_top_emr"}) 
    tdList = bsObj.find_all("td", {"class":"menu_header width100"})

    for table in tableList:
        item = table.find_all("span", {"class":"results_body_text"})
        print item[0].get_text().strip() + ', ' + item[1].get_text().strip() 

if driver.find_element_by_id('anch_16'):
    crawl()
    driver.find_element_by_id('anch_16').click()
    print "Going to next page"
else:
    crawl()
    print "Done with last page" 

driver.quit()

When I run it, it produces a strange error that has me stumped:

Traceback (most recent call last):

  File "save.py", line 22, in <module>
    driver.find_element_by_id('pbG220e071f_2de75f_2d417d_2d9c61_2d027d324c8fec:_viewRoot:j_id12:search1').click()
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 269, in find_element_by_id
    return self.find_element(by=By.ID, value=id_)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 752, in find_element
    'value': value})['value']
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 236, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 192, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: {"errorMessage":"Unable to find element with id 'pbG220e071f_2de75f_2d417d_2d9c61_2d027d324c8fec:_viewRoot:j_id12:search1'","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"153","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:40423","User-Agent":"Python-urllib/2.7"},"httpVersion":"1.1","method":"POST","post":"{\"using\": \"id\", \"sessionId\": \"eb7dfa50-70a7-11e6-b125-9ff4e2dbd485\", \"value\": \"pbG220e071f_2de75f_2d417d_2d9c61_2d027d324c8fec:_viewRoot:j_id12:search1\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"element","directory":"/","path":"/element","relative":"/element","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/eb7dfa50-70a7-11e6-b125-9ff4e2dbd485/element"}}
Screenshot: available via screen

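I tried setting an implicit wait of 60 seconds right after initializing the browser. No luck.

I also tried WebDriverWait (the commented-out code below driver.get("http://sam.gov") in the script above), and it gave me a TimeoutException.

The strange part is that if I run print driver.page_source immediately after the get call, the source looks fine and contains the following markup, which does include the element with the id I am searching for. There are no frames or iframes either.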

<a id="pbG220e071f_2de75f_2d417d_2d9c61_2d027d324c8fec:_viewRoot:j_id12:search1" href="#" title="Search Records" onclick="if(typeof jsfcljs == 'function'){jsfcljs(document.getElementById('pbG220e071f_2de75f_2d417d_2d9c61_2d027d324c8fec:_viewRoot:j_id12'),{'pbG220e071f_2de75f_2d417d_2d9c61_2d027d324c8fec:_viewRoot:j_id12:search1':'pbG220e071f_2de75f_2d417d_2d9c61_2d027d324c8fec:_viewRoot:j_id12:search1'},'');}return false" class="button">

1 answer:

Answer 0 (score: 0)

The element's id looks dynamically generated, so it may change between page loads. You should try a different locator.

You can try a CSS selector, like this:

driver.find_element_by_css_selector("a.button[title='Search Records']").click()

Or use WebDriverWait with the same selector:

element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "a.button[title='Search Records']")))
element.click()
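As a quick offline sanity check of the selector above, the saved page_source can be scanned for anchors carrying the same class and title. This is only a sketch: the names AnchorFinder and count_matches are hypothetical, it uses Python 3's standard html.parser module (the question's script is Python 2, where the module is named HTMLParser), and it inspects the static source only, so it says nothing about what the browser's live DOM contains.

```python
from html.parser import HTMLParser


class AnchorFinder(HTMLParser):
    """Collect <a> tags whose class list and title match the wanted values."""

    def __init__(self, wanted_class, wanted_title):
        super().__init__()
        self.wanted_class = wanted_class
        self.wanted_title = wanted_title
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # The class attribute may hold several space-separated class names.
        classes = (attr.get("class") or "").split()
        if self.wanted_class in classes and attr.get("title") == self.wanted_title:
            self.matches.append(attr)


def count_matches(page_source, wanted_class="button", wanted_title="Search Records"):
    """Return how many anchors in page_source have the given class and title."""
    finder = AnchorFinder(wanted_class, wanted_title)
    finder.feed(page_source)
    return len(finder.matches)
```

Calling count_matches(driver.page_source) and getting exactly 1 would suggest the class and title pair identifies the link uniquely. A result of 0 means those attributes differ from what the selector assumes, and a result above 1 means the selector is ambiguous and needs another attribute.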

Note: Before locating the element, make sure it is not inside a frame/iframe. If it is, you need to switch to that frame/iframe before finding the element, using driver.switch_to_frame("frame/iframe id or name").
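The frame check in the note can be sketched as a small helper that looks in the top-level document first and then in each top-level frame. The function name find_in_page_or_frames is hypothetical, and the sketch makes a few assumptions: the driver exposes find_elements(by, value) and the switch_to.frame / switch_to.default_content calls (the spelling used by newer Selenium releases in place of switch_to_frame), and nested frames are not searched. The locator strategies are written as plain strings so the helper can be exercised without Selenium installed.

```python
# Locator strategy strings. These are the same values as By.CSS_SELECTOR and
# By.TAG_NAME in selenium.webdriver.common.by, spelled out here so the sketch
# has no import-time dependency on Selenium.
CSS_SELECTOR = "css selector"
TAG_NAME = "tag name"


def find_in_page_or_frames(driver, css):
    """Return the first element matching css in the top-level document or in
    any top-level frame/iframe, or None if nothing matches.

    When the match is inside a frame, the driver is left switched into that
    frame so the returned element can be used right away.
    """
    driver.switch_to.default_content()
    matches = driver.find_elements(CSS_SELECTOR, css)
    if matches:
        return matches[0]

    # Collect frame elements while still in the top-level document.
    frames = (driver.find_elements(TAG_NAME, "iframe")
              + driver.find_elements(TAG_NAME, "frame"))
    for frame in frames:
        driver.switch_to.frame(frame)
        matches = driver.find_elements(CSS_SELECTOR, css)
        if matches:
            return matches[0]
        # Nothing here: go back to the top before trying the next frame.
        driver.switch_to.default_content()
    return None
```

With a real driver this would be called as find_in_page_or_frames(driver, "a.button[title='Search Records']"). Using find_elements (plural) matters here because it returns an empty list when nothing matches, whereas find_element raises NoSuchElementException.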