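I am trying to iterate over the list of companies on the linked page. The link for each company name is dynamic, for example http://ae.bizdirlib.com/node/946273, where the number 946273 at the end changes from company to company. I want to open every link on the page in the browser, and I am really confused about how to do it. This is what I have tried so far: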
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
import time
browser = webdriver.Firefox() # Get local session of firefox
#wait until the pages are loaded
browser.implicitly_wait(3)
browser.get("http://ae.bizdirlib.com/taxonomy/term/1493") # Load page
browser.refresh()
page_source = browser.page_source
for node in page_source:
    link = browser.find_element_by_link_text('node').click
Running this code produces the following error:
Traceback (most recent call last):
File "C:/Python27/automation scripts/ggulf/large data.py", line 29, in <module>
link = browser.find_element_by_link_text('node').click
File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 276, in find_element_by_link_text
return self.find_element(by=By.LINK_TEXT, value=link_text)
File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 684, in find_element
{'using': by, 'value': value})['value']
File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 195, in execute
self.error_handler.check_response(response)
File "C:\Python27\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 170, in check_response
raise exception_class(message, screen, stacktrace)
NoSuchElementException: Message: Unable to locate element: {"method":"link text","selector":"node"}
Stacktrace:
at FirefoxDriver.prototype.findElementInternal_ (file:///c:/users/akrakhan/appdata/local/temp/tmppveyk8/extensions/fxdriver@googlecode.com/components/driver-component.js:10299)
at fxdriver.Timer.prototype.setTimeout/<.notify (file:///c:/users/akrakhan/appdata/local/temp/tmppveyk8/extensions/fxdriver@googlecode.com/components/driver-component.js:603)
Answer 0 (score: 0)
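You are better off looking for something more specific than walking through the page source. All of the company links are links inside H2 tags. You can find them with the CSS selector h2 > a, which matches every A tag that is a direct child (>) of an h2 element.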
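browser.get("http://ae.bizdirlib.com/taxonomy/term/1493") # Load page
# Find every <a> that is a direct child of an <h2>
links = browser.find_elements_by_css_selector("h2 > a")
for link in links:
    link.click()  # click is a method, so it has to be called with ()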
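This is not the final solution, because clicking a link takes you away from the main page, but it parallels what you were trying to do. A better approach is probably to store the URLs of all the company links in an array of strings and then loop over that array, navigating to each URL... or something along those lines. Left as an exercise for the reader... :)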
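For what it's worth, here is a minimal sketch of that "collect the URLs first, then visit them" idea. The function names collect_link_urls and visit_each are made up for illustration, and the sketch assumes the company links really do match h2 > a on that page. It calls find_elements with the plain string "css selector", which is the value of selenium's By.CSS_SELECTOR, so it should not depend on the older find_elements_by_css_selector helper.

```python
def collect_link_urls(browser, selector="h2 > a"):
    """Return the href of every element matching `selector` on the current page.

    `browser` is assumed to be a selenium WebDriver that has already
    loaded the listing page.
    """
    # "css selector" is the string value of selenium's By.CSS_SELECTOR
    links = browser.find_elements("css selector", selector)
    urls = []
    for link in links:
        href = link.get_attribute("href")
        if href:  # skip anchors that have no href
            urls.append(href)
    return urls


def visit_each(browser, urls):
    """Navigate to each URL in turn and return the page titles seen."""
    titles = []
    for url in urls:
        # The URLs are plain strings, so leaving the listing page
        # does not invalidate anything we still need.
        browser.get(url)
        titles.append(browser.title)
    return titles
```

With a real session you would call browser.get("http://ae.bizdirlib.com/taxonomy/term/1493") first and then visit_each(browser, collect_link_urls(browser)). Because only strings are kept, none of the element references can go stale once the browser navigates away from the listing page.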