First of all, I have been trying to get the drop-down menu from this web page: http://solutions.3m.com/wps/portal/3M/en_US/Interconnect/Home/Products/ProductCatalog/Catalog/?PC_Z7_RJH9U5230O73D0ISNF9B3C3SI1000000_nid=RFCNF5FK7WitWK7G49LP38glNZJXPCDXLDbl
Here is my code:
import urllib2
from bs4 import BeautifulSoup
import re
from pprint import pprint
from selenium import webdriver

url = 'http://solutions.3m.com/wps/portal/3M/en_US/Interconnect/Home/Products/ProductCatalog/Catalog/?PC_Z7_RJH9U5230O73D0ISNF9B3C3SI1000000_nid=RFCNF5FK7WitWK7G49LP38glNZJXPCDXLDbl'
element_xpath = '//*[@id="Component1"]'
driver = webdriver.PhantomJS()
driver.get(url)
element = driver.find_element_by_xpath(element_xpath)
element_xpath = '/option[@value="02"]'
all_options = element.find_elements_by_tag_name("option")
for option in all_options:
    print("Value is: %s" % option.get_attribute("value"))
    option.click()
source = driver.page_source.encode('utf-8', 'ignore')
driver.quit()
source = str(source)
soup = BeautifulSoup(source, 'html.parser')
print soup
This prints:
Traceback (most recent call last):
  File "../../../../test.py", line 58, in <module>
Value is: XX
    main()
  File "../../../../test.py", line 46, in main
    option.click()
  File "/home/eric/dev/octocrawler-env/local/lib/python2.7/site-packages/selenium-2.33.0-py2.7.egg/selenium/webdriver/remote/webelement.py", line 54, in click
    self._execute(Command.CLICK_ELEMENT)
  File "/home/eric/dev/octocrawler-env/local/lib/python2.7/site-packages/selenium-2.33.0-py2.7.egg/selenium/webdriver/remote/webelement.py", line 228, in _execute
    return self._parent.execute(command, params)
  File "/home/eric/dev/octocrawler-env/local/lib/python2.7/site-packages/selenium-2.33.0-py2.7.egg/selenium/webdriver/remote/webdriver.py", line 165, in execute
    self.error_handler.check_response(response)
  File "/home/eric/dev/octocrawler-env/local/lib/python2.7/site-packages/selenium-2.33.0-py2.7.egg/selenium/webdriver/remote/errorhandler.py", line 158, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementNotVisibleException: Message: u'{"errorMessage":"Element is not currently visible and may not be manipulated","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"81","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:51413","User-Agent":"Python-urllib/2.7"},"httpVersion":"1.1","method":"POST","post":"{\\"sessionId\\": \\"30e4fd50-f0e4-11e3-8685-6983e831d856\\", \\"id\\": \\":wdc:1402434863875\\"}","url":"/click","urlParsed":{"anchor":"","query":"","file":"click","directory":"/","path":"/click","relative":"/click","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/click","queryKey":{},"chunks":["click"]},"urlOriginal":"/session/30e4fd50-f0e4-11e3-8685-6983e831d856/element/%3Awdc%3A1402434863875/click"}}' ; Screenshot: available via screen
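The most infuriating part is that sometimes it actually works. I have no idea what is going on here.
Update
I don't seem to have any visibility problems with drop-down forms on other websites, only this one. Is there something that could be making the form invisible (and if so, why only 95% of the time)? Could something go wrong while the page loads that prevents it from being displayed?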
Answer 0 (score: 3)
Here is how I did it using Selenium WebDriver and Python 3:
# Locate the <select> element, then collect its <option> children
select_element = self.driver.find_element_by_id("Component1")
options = select_element.find_elements_by_tag_name("option")
for each_option in options:
    print(each_option.get_attribute("value"))
If you are trying to select an item from the drop-down list, do the following:
from selenium.webdriver.support.ui import Select

select = Select(self.driver.find_element_by_id("Component1"))
select.select_by_visible_text("02")
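If only the option values are needed and nothing has to be clicked, they can also be read from the page source, which avoids the visibility check entirely. The following is a minimal sketch using only the standard library; `OptionCollector`, `option_values`, and the sample HTML are made up for illustration and are not taken from the 3M page. In practice the string would come from `driver.page_source`.

```python
from html.parser import HTMLParser


class OptionCollector(HTMLParser):
    """Collects the value attribute of every <option> inside one <select>."""

    def __init__(self, select_id):
        super().__init__()
        self.select_id = select_id
        self.in_select = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select" and attrs.get("id") == self.select_id:
            self.in_select = True
        elif tag == "option" and self.in_select:
            self.values.append(attrs.get("value"))

    def handle_endtag(self, tag):
        if tag == "select":
            self.in_select = False


def option_values(html, select_id):
    """Return the option values of the <select> with the given id."""
    parser = OptionCollector(select_id)
    parser.feed(html)
    return parser.values


# Hypothetical sample markup, standing in for driver.page_source
SAMPLE_HTML = (
    '<select id="Component1">'
    '<option value="XX">Select</option>'
    '<option value="02">02</option>'
    '</select>'
    '<select id="Other"><option value="zz">zz</option></select>'
)

print(option_values(SAMPLE_HTML, "Component1"))
```

This only reads values that are already present in the markup; it will not see options that a script adds after the page source was captured.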
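Answer 1 (score: 0)
Try using XPath, as follows:
Find the element with the XPath below and click it.
XPath: //select[@id='Component1']/option[text()='04']
If that does not work, first click the drop-down itself using this XPath:
XPath: //select[@id='Component1']
Then click the option.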
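The two steps above (click the drop-down, then click the option) can be sketched as a small helper. The names `option_xpath` and `click_option` are hypothetical, and the sketch assumes `driver` is a Selenium WebDriver whose `find_element(by, value)` accepts the locator strategy as the string "xpath", which is the value of `By.XPATH`. It does not address why the element is sometimes invisible; it only mirrors the click order described here.

```python
def option_xpath(select_id, text):
    """Build the XPath for an <option> with the given text inside a <select>."""
    return "//select[@id='%s']/option[text()='%s']" % (select_id, text)


def click_option(driver, select_id, text):
    """Click the drop-down first, then the option with the given text."""
    # Step 1: click the <select> itself so its options are shown
    driver.find_element("xpath", "//select[@id='%s']" % select_id).click()
    # Step 2: click the wanted <option>
    driver.find_element("xpath", option_xpath(select_id, text)).click()
```

A call such as `click_option(driver, "Component1", "04")` would then perform both clicks in order.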
Answer 2 (score: 0)
from selenium.webdriver.support.ui import WebDriverWait, Select
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# Wait up to 10 seconds for the drop-down to become visible before using it
element = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.ID, "Component1")))
select = Select(element)
select.select_by_visible_text("Text to look for")
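The idea behind an explicit wait is simple polling: check a condition repeatedly until it holds or a timeout expires. The sketch below illustrates that idea with the standard library only. It is not Selenium's implementation, and `wait_until` and its parameters are names invented for this example.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5, ignored=(Exception,)):
    """Call condition() until it returns a truthy value or the timeout elapses.

    Exceptions listed in `ignored` are treated as "not ready yet" and retried.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored:
            pass  # for example, the element is not in the page yet
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f seconds" % timeout)
        time.sleep(poll)
```

With a real driver, a call might look like `wait_until(lambda: driver.find_element("id", "Component1").is_displayed())`, which would explain why a script that clicks immediately after `driver.get(url)` fails on some runs and succeeds on others if the page shows the drop-down only after its scripts finish.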