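How do I extract information from HTML code using Python + Selenium?

Time: 2019-09-18 11:19:24

Tags: python selenium selenium-webdriver

I need to get the options that appear in this code and put them all into an array so I can display them in a GUI later, but I don't know how to do it.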

<select name="flt_technology" id="flt_technology" size="8" tabindex="1" multiple="" onchange="onChangeMessage('block','TechnicalReports');">
    <option value="3303">Aeroelastic Stability</option>
    <option value="3305">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Aeroelastic Model</option>             
    <option value="3304">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Aeroelastic Stability Criteria</option>
    <option value="3308">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Aeroservoelastic Analysis</option>
    <option value="3306">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Control Surfaces Reversal/Effectiveness</option>
    <option value="3311">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Flutter Flight Test</option>
    <option value="3307">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Flutter</option>
    <option value="3309">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Flutter: Failure Conditions</option>
    <option value="3310">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Ground Vibration Test</option>
    <option value="3710">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Qualification Equipment Test</option>
    <option value="3588">Weight and Balance</option>
    <option value="3589">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Center of Gravity Limits</option>
    <option value="3590">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leveling Means</option>
    <option value="3591">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Loads Distribution - Weight X Cg Envelope Definition</option>
    <option value="3592">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Weight Limits</option>
</select>

3 Answers:

Answer 0: (score: 0)

Use the Select class on the <select> tag:

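from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
# driver is assumed to be an already-created WebDriver instance
element = driver.find_element(By.ID, 'flt_technology')
select = Select(element)
options = select.options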

This returns a list of <option> WebElements.
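
If the goal is a plain list for a GUI, the <option> WebElements can be reduced to (value, label) pairs. A minimal sketch, assuming each element exposes .text and .get_attribute("value") as Selenium WebElements do; options_to_pairs and FakeOption are hypothetical names, and FakeOption exists only so the snippet runs without a browser:

```python
from typing import Iterable, List, Tuple


def options_to_pairs(options: Iterable) -> List[Tuple[str, str]]:
    """Return (value, label) pairs for WebElement-like <option> objects."""
    pairs = []
    for opt in options:
        # Normalise non-breaking spaces and trim the indentation padding.
        label = opt.text.replace("\u00a0", " ").strip()
        pairs.append((opt.get_attribute("value"), label))
    return pairs


class FakeOption:
    """Hypothetical stand-in for a Selenium WebElement (demo only)."""

    def __init__(self, value: str, text: str) -> None:
        self._value = value
        self.text = text

    def get_attribute(self, name: str) -> str:
        return self._value if name == "value" else ""


demo = [
    FakeOption("3303", "Aeroelastic Stability"),
    FakeOption("3305", "\u00a0\u00a0Aeroelastic Model"),
]
pairs = options_to_pairs(demo)
print(pairs)
```

With a real page you would pass select.options instead of demo.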

Answer 1: (score: -1)

To print the text of the options, you have to induce WebDriverWait for visibility_of_all_elements_located(). You can use either of the following Locator Strategies:

  • Using CSS_SELECTOR:

    select = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "select#flt_technology[name='flt_technology']"))))
    print([my_elem.text for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "select#flt_technology[name='flt_technology'] option")))])
    # or
    print([my_elem.text for my_elem in select.options])

  • Using XPATH:

    select = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//select[@id='flt_technology' and @name='flt_technology']"))))
    print([my_elem.text for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//select[@id='flt_technology' and @name='flt_technology']//option")))])
    # or
    print([my_elem.text for my_elem in select.options])

  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait, Select
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
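
For context, "inducing" WebDriverWait means polling a condition until it holds or a timeout expires. The following is a rough sketch of that idea in plain Python, not Selenium's actual implementation: wait_until, WaitTimeout, and options_ready are hypothetical names, and the real WebDriverWait.until() passes the driver to the condition and raises TimeoutException on expiry.

```python
import time


class WaitTimeout(Exception):
    """Hypothetical exception for this sketch (Selenium raises TimeoutException)."""


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll)


# Demo: a stand-in "page" whose options only become available on the third check.
calls = {"n": 0}


def options_ready():
    calls["n"] += 1
    if calls["n"] >= 3:
        return ["Aeroelastic Stability", "Weight and Balance"]
    return []


found = wait_until(options_ready, timeout=2.0, poll=0.01)
print(found, calls["n"])
```

The point of the pattern is that the caller gets the condition's result as soon as it is available, instead of sleeping for a fixed time and hoping the page has finished rendering.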
    

Answer 2: (score: -2)

Something like this:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('http://www.the_website_you_want_to_scrape.com')
select_elem = driver.find_element(By.ID, 'flt_technology')
# find_elements() already returns a list of <option> WebElements
options = select_elem.find_elements(By.TAG_NAME, "option")

If you need the values of these options:

for element in options:
    print(element.get_attribute("value"))
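
The leading &nbsp; runs in the question's markup look like a two-level hierarchy (unindented entries as groups, indented ones as their children). If the GUI should reflect that, one option is to parse the markup itself with the standard library, for example from driver.page_source. A sketch under that assumption; OptionCollector and group_by_indent are hypothetical helpers and the sample markup is shortened from the question:

```python
from html.parser import HTMLParser


class OptionCollector(HTMLParser):
    """Collect (value, raw_text) for every <option> in the markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # &nbsp; becomes "\u00a0"
        self.options = []
        self._in_option = False
        self._value = None
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            self._in_option = True
            self._value = dict(attrs).get("value")
            self._chunks = []

    def handle_data(self, data):
        if self._in_option:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "option" and self._in_option:
            self.options.append((self._value, "".join(self._chunks)))
            self._in_option = False


def group_by_indent(options):
    """Map each unindented label to the indented labels that follow it."""
    groups = {}
    current = None
    for _value, raw in options:
        label = raw.strip()  # str.strip() also removes non-breaking spaces
        if raw.startswith("\u00a0") and current is not None:
            groups[current].append(label)
        else:
            current = label
            groups[current] = []
    return groups


sample = """
<select id="flt_technology">
    <option value="3303">Aeroelastic Stability</option>
    <option value="3305">&nbsp;&nbsp;&nbsp;&nbsp;Aeroelastic Model</option>
    <option value="3307">&nbsp;&nbsp;&nbsp;&nbsp;Flutter</option>
    <option value="3588">Weight and Balance</option>
    <option value="3592">&nbsp;&nbsp;&nbsp;&nbsp;Weight Limits</option>
</select>
"""

parser = OptionCollector()
parser.feed(sample)
parser.close()
tree = group_by_indent(parser.options)
print(tree)
```

The resulting dict can feed a tree widget or a pair of dependent list boxes; if a flat list is enough, parser.options already holds the (value, text) pairs.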