I am extracting the first "Name" field from each results page at this URL: http://www.srlworld.com/content/65/find-a-lab.html
The for loop runs once and then throws an error:
File "srl.py", line 40, in <module>
print state.text
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webelement.py", line 66, in text
return self._execute(Command.GET_ELEMENT_TEXT)['value']
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webelement.py", line 404, in _execute
return self._parent.execute(command, params)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 195, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 170, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: Element not found in the cache - perhaps the page has changed since it was looked up
Stacktrace:
at fxdriver.cache.getElementAt (resource://fxdriver/modules/web-element-cache.js:8981)
at Utils.getElementAt (file:///tmp/tmpPEHToH/extensions/fxdriver@googlecode.com/components/command-processor.js:8574)
at WebElement.getElementText (file:///tmp/tmpPEHToH/extensions/fxdriver@googlecode.com/components/command-processor.js:11722)
at DelayedCommand.prototype.executeInternal_/h (file:///tmp/tmpPEHToH/extensions/fxdriver@googlecode.com/components/command-processor.js:12282)
at fxdriver.Timer.prototype.setTimeout/<.notify (file:///tmp/tmpPEHToH/extensions/fxdriver@googlecode.com/components/command-processor.js:603)
The code is:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import Select
driver = webdriver.Firefox()
driver.get("http://www.srlworld.com/content/65/find-a-lab.html")
#assert "http" in driver.title
elem = driver.find_element_by_id("country")
#driver.implicitly_wait(5)
all_countries = elem.find_elements_by_tag_name("option")
country = all_countries[1]
print "country value is %s" % country.get_attribute("value")
country.click()
driver.implicitly_wait(2)
state_elem = driver.find_element_by_id("state")
all_states = state_elem.find_elements_by_tag_name("option")
del all_states[0]
for state in all_states:
    print "start ",
    print state.text
    print "state value is %s" % state.get_attribute("value")
    state.click()
    driver.implicitly_wait(2)
    driver.find_element_by_name("go").click()
    name = driver.find_element_by_xpath("//div[span='Name'][1]/span/following-sibling::span[2]")
    print name.text
    print "end ",
    print state.text
When I run this script, the for loop runs only once and never prints the final state.text, even though I have not changed anything myself.
Answer 0 (score: 1)
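Judging by the exception text, here is what happens: every time you press the "Go" button, the page is refreshed (the new data arrives through an actual page reload rather than AJAX, and that matters). Selenium therefore sees that the page has changed and raises an exception when you try to access an element that was looked up on the previous page. I suggest the following approach to solve your problem: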
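To make the mechanism concrete, here is a toy model of it in plain Python. It does not use Selenium at all: `ToyPage`, `ToyElement`, `StaleReference` and the option names are hypothetical stand-ins, and the "generation" counter is only an assumption about how one might picture the element cache, not a description of Selenium's internals.

```python
class StaleReference(Exception):
    """Toy stand-in for Selenium's StaleElementReferenceException."""


class ToyElement(object):
    def __init__(self, page, generation, text):
        self._page = page
        self._generation = generation
        self._text = text

    @property
    def text(self):
        # A handle is only valid for the page load that produced it.
        if self._generation != self._page.generation:
            raise StaleReference("page was reloaded after this lookup")
        return self._text


class ToyPage(object):
    def __init__(self, options):
        self._options = list(options)
        self.generation = 0

    def find_options(self):
        # Every lookup returns handles tied to the current page load.
        return [ToyElement(self, self.generation, o) for o in self._options]

    def reload(self):
        # A full refresh invalidates every handle handed out so far.
        self.generation += 1


page = ToyPage(["state-a", "state-b"])  # placeholder option names
old = page.find_options()
page.reload()                # roughly what clicking "Go" does in the question
fresh = page.find_options()  # handles looked up after the reload are usable
```

Reading `old[0].text` after the reload raises `StaleReference`, while `fresh[0].text` works. Applied to the real page, looking the elements up again after each reload looks like this: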
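current_position = 1  # index 0 is skipped, as in the question
while True:
    # Look the options up again on every pass: the page reloads after "Go".
    state_elem = driver.find_element_by_id("state")
    all_states = state_elem.find_elements_by_tag_name("option")
    if current_position >= len(all_states):
        break  # no more states to visit
    state = all_states[current_position]
    state_name = state.text  # read it now; `state` goes stale after the reload
    print "start ",
    print state_name
    print "state value is %s" % state.get_attribute("value")
    state.click()
    driver.implicitly_wait(2)  # sets the element lookup timeout; it does not pause
    driver.find_element_by_name("go").click()
    name = driver.find_element_by_xpath("//div[span='Name'][1]/span/following-sibling::span[2]")
    print name.text
    print "end ",
    print state_name
    current_position += 1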
This way you select the next option on the freshly loaded page each time, and you should no longer get that exception.
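The same idea (track a position, and look the options up again before every step) can be factored into a small helper. This is only a sketch: `visit_each_option`, `find_options` and `handle_option` are hypothetical names introduced here, and the helper is written against plain callables so that it runs under Python 2 or 3 without a browser.

```python
def visit_each_option(find_options, handle_option, start=1):
    """Call handle_option(option) for every option from index `start` on.

    find_options is called again before each step, so the handle passed to
    handle_option always comes from the page that is currently loaded.
    Returns the number of options visited.
    """
    visited = 0
    index = start
    while True:
        options = find_options()  # fresh lookup on the current page
        if index >= len(options):
            break                 # ran past the last option
        handle_option(options[index])
        visited += 1
        index += 1
    return visited


# Possible wiring with the driver from the question (left as comments,
# since it needs a live browser session):
#
# def find_states():
#     state_elem = driver.find_element_by_id("state")
#     return state_elem.find_elements_by_tag_name("option")
#
# def select_and_go(state):
#     state.click()
#     driver.find_element_by_name("go").click()
#
# visit_each_option(find_states, select_and_go)
```

Anything that has to be printed after the "Go" click, such as the state's text, should still be read inside `handle_option` before the click, because the handle becomes invalid once the page reloads.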