Accessing two drop-down menus in Selenium

Time: 2016-04-06 01:12:57

Tags: python html selenium drop-down-menu web-scraping

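I am trying to web-scrape this site: http://surge.srcc.lsu.edu/s1.html, but it always selects the last value. The loop over the first drop-down runs correctly, but then it goes wrong and jumps to the last value of the first drop-down and the last value of the second drop-down. I think the error is that the second drop-down is only loaded after an option in the first one has been selected? I can't seem to fix it.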

# importing libraries
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Firefox()
driver.get("http://surge.srcc.lsu.edu/s1.html")

# definition for switching frames
def frame_switch(css_selector):
  driver.switch_to.frame(driver.find_element_by_css_selector(css_selector))  

frame_switch("iframe")

html_source = driver.page_source  
element = driver.find_element_by_xpath('//select[@id="storm_name"]')
options = element.find_elements_by_tag_name("option")

optionsList = []

for option in options: #iterate over the options, place attribute value in list
    optionsList.append(option.get_attribute("value"))
    driver.implicitly_wait(30)

for optionValue in optionsList: # looping through the first drop down
    print ("starting loop on option %s" % optionValue)
    option.click()
    html_source = driver.page_source  
    time.sleep(3)

    element_year = driver.find_element_by_xpath('//select[@id="year"]')
    options_year = element_year.find_elements_by_tag_name("option")
    optionsList2 = []

    for option in options_year: #iterate and make list for second drop down
        optionsList2.append(option.get_attribute("value"))

    for optionValue in optionsList2: # loop through Second drop down
        print ("starting loop on option %s" % optionValue)
        option.click()
        time.sleep(3) 

2 Answers:

Answer 0 (score: 3)

I suggest using Select instead and selecting each option by index. See the code below.

from selenium.webdriver.support.ui import Select  # needed for the Select wrapper used below

html_source = driver.page_source
nameSelect = Select(driver.find_element_by_xpath('//select[@id="storm_name"]'))
stormCount = len(nameSelect.options)

for i in range(1, stormCount):
    print("starting loop on option storm " + nameSelect.options[i].text)
    nameSelect.select_by_index(i)
    time.sleep(3)
    html_source = driver.page_source

    yearSelect = Select(driver.find_element_by_xpath('//select[@id="year"]'))
    yearCount = len(yearSelect.options)
    for j in range(1, yearCount):
        print("starting loop on option year " + yearSelect.options[j].text)
        yearSelect.select_by_index(j)
        time.sleep(3)
        html_source = driver.page_source
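
The fixed `time.sleep(3)` pauses above wait the same amount of time whether or not the second drop-down has been refilled yet. Here is a rough plain-Python sketch of the polling idea behind an explicit wait, so it runs without a browser. `wait_until` and `FakeYearDropdown` are hypothetical names made up for this illustration, and the year values are invented. With a real driver, Selenium's own `WebDriverWait(driver, timeout).until(condition)` is the usual tool for this.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


class FakeYearDropdown:
    """Hypothetical stand-in for the year <select>: it is filled on the third poll."""

    def __init__(self):
        self.polls = 0

    def option_values(self):
        self.polls += 1
        if self.polls < 3:
            return ["-1"]                  # only the placeholder option so far
        return ["-1", "2005", "2008"]      # invented year values


dropdown = FakeYearDropdown()


def years_loaded():
    # Truthy only once there is more than the placeholder option.
    values = dropdown.option_values()
    return values if len(values) > 1 else None


years = wait_until(years_loaded, timeout=5.0, poll=0.01)
print(years[1:])  # ['2005', '2008']
```

The point of the design is that the loop returns as soon as the options appear instead of always paying the full sleep, and it fails loudly with a timeout if they never do.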

Answer 1 (score: 0)

There are multiple problems in your code. I rewrote it and added a few things.

# importing libraries
from selenium import webdriver
import time
driver = webdriver.Firefox()
driver.get("http://surge.srcc.lsu.edu/s1.html")

# definition for switching frames
def frame_switch(css_selector):
  driver.switch_to.frame(driver.find_element_by_css_selector(css_selector))  

frame_switch("iframe")

html_source = driver.page_source  #Why do you even need this?
                                  #I am assuming you are following a tutorial, however this line could be omitted
element = driver.find_element_by_xpath('//select[@id="storm_name"]')
options = element.find_elements_by_tag_name("option")

for option in options[1:]: # The [1:] slice skips the first element in options, the placeholder with value -1 ("Choose a storm")
    print ("Currently looping over Storm {}.".format(option.get_attribute("value")))
    option.click()
    time.sleep(2) # An explicit or implicit wait would be much better than a fixed sleep, but that's for another day.

    element_year = driver.find_element_by_xpath('//select[@id="year"]')
    options_year = element_year.find_elements_by_tag_name("option")

    for option_year in options_year[1:]: # Same as above: skips the placeholder ("Choose a Year")
        print ("Currently looping over Year {}.".format(option_year.get_attribute("value")))
        time.sleep(2) # Same as above
        option_year.click()

This will do what you want. However, I recommend that you do several things:

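  1. Read up on Python list slicing.
  2. Use logging / print in your code to inspect the values and objects in your lists and dicts. The repeated -1 should have looked suspicious to you, as should the fact that option always pointed to Wilma.
  3. Clean up your imports and code regularly. If a variable is never referenced, it usually means it is not needed. (Names can also be imported from other namespaces, so watch out for that.)
  4. Good luck and keep learning.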
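
Point 2 can be reproduced without a browser. In the question's code, `option` is left over from the loop that builds the list of values, so every later `option.click()` acts on the last element. This is a minimal sketch with plain strings standing in for the `<option>` elements. Only "Wilma" comes from the thread; the other names are placeholders.

```python
# Plain strings stand in for the <option> elements; only "Wilma" comes from
# the answer above, the other names are placeholders.
options = ["Choose a storm", "Storm A", "Storm B", "Wilma"]

# First loop, as in the question: after it ends, `option` still refers to
# the last item that was iterated over.
values = []
for option in options:
    values.append(option.upper())

# Buggy second loop: it iterates over `values` but acts on the stale `option`.
clicked_buggy = []
for value in values:
    clicked_buggy.append(option)

# Fixed loop: skip the placeholder and act on the current item.
clicked_fixed = []
for option in options[1:]:
    clicked_fixed.append(option)

print(clicked_buggy)  # ['Wilma', 'Wilma', 'Wilma', 'Wilma']
print(clicked_fixed)  # ['Storm A', 'Storm B', 'Wilma']
```

Printing the list like this inside the loop would have shown right away that the same item was being clicked on every pass.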