My program fails after two iterations of an XPath loop using Selenium IDE and Python

Asked: 2015-03-13 06:18:38

Tags: python selenium selenium-webdriver

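After two iterations, it fails with an error.

If this happens because the XPath obtained through Selenium IDE cannot be found, why does it not already fail in the second iteration of the loop?

How can I get the output without any error and hit all 8 URLs one by one, whether or not the XPath is available on a given page?

Here is my code: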

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

#base Url
baseurl="http://www.incredibleindia.org"
driver = webdriver.Firefox()
driver.implicitly_wait(2)
driver.get(baseurl)

driver.implicitly_wait(2)
main_links_tabs=driver.find_elements_by_xpath("html/body/div[3]/div/div[1]/div[2]/ul/li/a")
all_tablength=len(main_links_tabs)
print all_tablength
main_link_list=[]
for i in range(all_tablength): 
    driver.implicitly_wait(3) 
    links=main_links_tabs[i].get_attribute('href')
    main_link_list.append(links) 
#all main_tab_link hit one by one  
for i in  main_link_list:
    print i
    driver.implicitly_wait(30) 
    driver.get(i) 

    #travel tabs data
    print "tabl links hit one by one"

    travel_tabs_sublinks=driver.find_elements_by_xpath(".//*[@id='left-inner-content']/div[2]/div/ul/li/a")

    travel_tabs_sublinks_len=len(travel_tabs_sublinks)
    print travel_tabs_sublinks_len

Output:

8
http://www.incredibleindia.org/en/travel
tabl links hit one by one
http://www.incredibleindia.org/en/trade
tabl links hit one by one
http://www.incredibleindia.org/en/#media
tabl links hit one by one
Traceback (most recent call last):
File "incredibleindia.py", line 27, in <module>
travel_tabs_sublinks=driver.find_elements_by_xpath(".//*[@id='left-inner-content']/div[2]/div/ul/li/a")
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 244, in find_elements_by_xpath
return self.find_elements(by=By.XPATH, value=xpath)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 679, in find_elements
{'using': by, 'value': value})['value']
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 175, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 166, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidSelectorException: Message: The given selector .//*[@id='left-inner-content']/div[2]/div/ul/li/a is either invalid or does not result in a WebElement. The following error occurred:
InvalidSelectorError: Unable to locate an element with the xpath expression .//*[@id='left-inner-content']/div[2]/div/ul/li/a because of the following error:
TypeError: Argument 1 of Document.createNSResolver is not an object.
Stacktrace:
at FirefoxDriver.annotateInvalidSelectorError_ (file:///tmp/tmpDolyM9/extensions/fxdriver@googlecode.com/components/driver-component.js:10245)
at FirefoxDriver.prototype.findElementsInternal_ (file:///tmp/tmpDolyM9/extensions/fxdriver@googlecode.com/components/driver-component.js:10303)
at fxdriver.Timer.prototype.setTimeout/<.notify (file:///tmp/tmpDolyM9/extensions/fxdriver@googlecode.com/components/driver-component.js:603)

1 Answer:

Answer 0 (score: 2)

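First of all, you are not using implicitly_wait() correctly. It does not sleep for N seconds; the call itself returns immediately. It tells the driver how long to wait each time it searches for an element: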

  

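"An implicit wait is to tell WebDriver to poll the DOM for a certain amount of time when trying to find an element or elements if they are not immediately available. The default setting is 0. Once set, the implicit wait is set for the life of the WebDriver object instance."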
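
To make the quoted behavior concrete, here is a toy model in plain Python. It is not Selenium's implementation: FakeDriver, ready_after and the 0.05 second polling interval are invented for illustration. It only mimics the two points above: setting the wait returns at once, and the polling happens later, during the element lookup.

```python
import time


class FakeDriver(object):
    """Toy stand-in for a WebDriver (hypothetical, not Selenium code)."""

    def __init__(self, ready_after):
        # The pretend element "appears" ready_after seconds from now.
        self._ready_at = time.time() + ready_after
        self._implicit_timeout = 0.0  # mirrors the documented default of 0

    def implicitly_wait(self, seconds):
        # Stores the setting and returns immediately; nothing sleeps here.
        self._implicit_timeout = seconds

    def find_elements(self):
        # Poll until the element shows up or the stored timeout runs out.
        deadline = time.time() + self._implicit_timeout
        while True:
            if time.time() >= self._ready_at:
                return ["<a>"]
            if time.time() >= deadline:
                return []
            time.sleep(0.05)  # assumed polling interval


driver = FakeDriver(ready_after=0.3)
print(driver.find_elements())  # timeout is still 0: one check, no waiting
driver.implicitly_wait(2)      # only records the timeout
print(driver.find_elements())  # polls until the element appears or 2 s pass
```

In this model, as in the quoted description, calling implicitly_wait() before driver.get() pauses nothing; it only changes how long later lookups keep polling.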

Instead, you need to use Explicit Waits. Here is an improved, working version of the code:

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


url = "http://www.incredibleindia.org"
driver = webdriver.Firefox()
driver.get(url)

# wait for the menu to load
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "div.menu li > a")))

links = [a.get_attribute('href') for a in driver.find_elements_by_css_selector('div.menu li > a')]
for link in links:
    driver.get(link)

    # wait for the sublinks to load
    try:
        WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "div#left-inner-content li > a")))
    except TimeoutException:
        print driver.title, "No sublinks"

    sublinks = driver.find_elements_by_css_selector("div#left-inner-content li > a")
    print driver.title, [sublink.text for sublink in sublinks]

Prints:

Incredible India - Travel [u'Rural Tourism', u'Mountain Trains & Luxury Trains', u'Eco Tourism', u'MICE', u'All Destinations']
Incredible India - Trade No sublinks
...
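
One detail from the question's output: one of the collected hrefs, http://www.incredibleindia.org/en/#media, only adds a fragment to a page URL. Whether that plays any part in the failure is not established in this thread. As an optional precaution, the list of hrefs could be normalized before the loop. The helper name unique_pages below is hypothetical, and the sketch uses only the standard library:

```python
try:
    from urllib.parse import urldefrag  # Python 3
except ImportError:
    from urlparse import urldefrag  # Python 2


def unique_pages(hrefs):
    """Strip #fragments and drop duplicates, keeping first-seen order."""
    seen = set()
    pages = []
    for href in hrefs:
        if not href:
            continue  # get_attribute('href') returns None when there is no href
        page = urldefrag(href)[0]  # the URL without its fragment
        if page not in seen:
            seen.add(page)
            pages.append(page)
    return pages
```

It would be applied to the list built from get_attribute('href'), for example `links = unique_pages(links)`, before the `for link in links:` loop.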