How can I extract options from <div class="ui-helper-hidden-accessible"> with the Selenium driver?

Time: 2017-09-29 03:01:00

Tags: javascript python html selenium

I want to use Selenium with Python to scrape this website: https://ntrl.ntis.gov/NTRL

However, when I try to change the year in the dropdown list, it does not work.

Here is the HTML:

<div id="advSearchForm:FromYear" class="ui-selectonemenu ui-widget ui-state-default ui-corner-all" style="min-width: 63px;">
    <div class="ui-helper-hidden-accessible">
        <input id="advSearchForm:FromYear_focus" name="advSearchForm:FromYear_focus" type="text" autocomplete="off" role="combobox" aria-haspopup="true" aria-expanded="false" readonly="readonly" aria-autocomplete="list" aria-owns="advSearchForm:FromYear_items" aria-activedescendant="advSearchForm:FromYear_0" aria-describedby="advSearchForm:FromYear_0" aria-disabled="false">
    </div>
    <div class="ui-helper-hidden-accessible">
        <select id="advSearchForm:FromYear_input" name="advSearchForm:FromYear_input" tabindex="-1">
            <option value="*" selected="selected">&lt;1900</option>
            <option value="1900">1900</option>
            <option value="1901">1901</option>
            <option value="1902">1902</option>
            <option value="1903">1903</option>
        </select>
    </div>
    <label id="advSearchForm:FromYear_label" class="ui-selectonemenu-label ui-inputfield ui-corner-all">&lt;1900</label>
    <div class="ui-selectonemenu-trigger ui-state-default ui-corner-right">
        <span class="ui-icon ui-icon-triangle-1-s ui-c"/>
    </div>
</div>

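Here is my code:

select = Select(driver.find_element_by_xpath(".//div[@id='advSearchForm:FromYear']/div[2]/select"))
select.select_by_value("1902")

But it raises an exception:

Element is not currently visible and may not be manipulated

I also tried a JS script:

driver.execute_script("document.getElementById('advSearchForm:FromYear_input').options[2].selected = 'true'")

But that does not work either. I tested that select.select_by_value(xxx) works on other dropdown lists, so the trouble may come from the <div class="ui-helper-hidden-accessible"> wrapper. How should I handle this?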

1 Answer:

Answer 0 (score: 0)

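I suggest first clicking the element (the select element whose ID is "advSearchForm:FromYear_input") with a click event, then using an explicit wait until the element is visible. After that you should be able to change the year with the select_by_value method.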

I would also avoid XPath and use CSS selectors instead. Better still, create a Page Object Model to reduce the work needed to keep the tool running through future updates.

Sorry I cannot be of more help; I am not familiar with Python.

You can also refer to the Page Object Model.
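To make the Page Object Model suggestion concrete, here is a minimal Python sketch. The class name AdvancedSearchPage and its method names are hypothetical, the selectors are the ones used in the code further down in this answer, and the literal "css selector" is assumed to match the value of Selenium's By.CSS_SELECTOR constant. It has no waits, so treat it as an outline of the structure rather than a finished implementation.

```python
class AdvancedSearchPage:
    """Page object for the NTRL search form (hypothetical name)."""

    URL = "https://ntrl.ntis.gov/NTRL/"
    # String value of selenium.webdriver.common.by.By.CSS_SELECTOR (assumption
    # made so this sketch needs no imports).
    CSS = "css selector"
    # ':' is special in CSS selectors, so it is escaped with a backslash.
    FROM_YEAR_TRIGGER = "div#advSearchForm\\:datePublPanel div.ui-selectonemenu-trigger"
    FROM_YEAR_ITEM = "#advSearchForm\\:FromYear_items li[data-label='{0}']"

    def __init__(self, driver):
        self.driver = driver

    def open(self):
        """Load the search page and return the page object for chaining."""
        self.driver.get(self.URL)
        return self

    def select_from_year(self, year):
        """Open the "From" dropdown, then click the generated list item."""
        self.driver.find_element(self.CSS, self.FROM_YEAR_TRIGGER).click()
        self.driver.find_element(self.CSS, self.FROM_YEAR_ITEM.format(year)).click()
        return self
```

With a real driver this would be used as AdvancedSearchPage(driver).open().select_from_year("1902"). If the page markup changes, only the selector constants in this one class need updating.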

Edit

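It looks like the page uses the option items inside the select as the master list, while the actual selection happens inside another element further down the page. That element is built dynamically with JavaScript, so my suggestion in the comments will not work.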
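Because the hidden select still holds the master list, its options can be read from the markup even though Selenium refuses to interact with the element. The following is a minimal sketch using only the Python standard library. It assumes the page source is already available as a string (for example from driver.page_source), and OptionCollector and extract_options are hypothetical names.

```python
from html.parser import HTMLParser


class OptionCollector(HTMLParser):
    """Collect (value, label) pairs from the <select> with a given id."""

    def __init__(self, select_id):
        super().__init__()
        self.select_id = select_id
        self.in_select = False
        self.current_value = None
        self.options = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select" and attrs.get("id") == self.select_id:
            self.in_select = True
        elif tag == "option" and self.in_select:
            self.current_value = attrs.get("value")

    def handle_data(self, data):
        # Character references such as &lt; arrive already decoded.
        if self.current_value is not None and data.strip():
            self.options.append((self.current_value, data.strip()))
            self.current_value = None

    def handle_endtag(self, tag):
        if tag == "select":
            self.in_select = False
        elif tag == "option":
            self.current_value = None


def extract_options(html, select_id):
    """Return the options of the matching <select> as (value, label) tuples."""
    parser = OptionCollector(select_id)
    parser.feed(html)
    parser.close()
    return parser.options


# Shortened copy of the markup from the question.
sample = """
<select id="advSearchForm:FromYear_input" name="advSearchForm:FromYear_input" tabindex="-1">
    <option value="*" selected="selected">&lt;1900</option>
    <option value="1900">1900</option>
    <option value="1901">1901</option>
</select>
"""
print(extract_options(sample, "advSearchForm:FromYear_input"))
```

This only reads the list of years. Changing the selected year still has to go through the visible, JavaScript-built list.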

I have hacked together a working application in C# to give you an idea of what you need to do:

private static void Main(string[] args)
{
    // ':' has a special meaning in CSS selectors so we need to escape it using \\
    const string dropdownButtonSelector = "div#advSearchForm\\:datePublPanel div.ui-selectonemenu-trigger";
    // {0} is a placeholder which is used to insert text during runtime
    const string dynamicallyBuiltListItemSelectorTemplate = "ul#advSearchForm\\:FromYear_items li[data-label=\"{0}\"]";
    // Rather than being a constant this value will be determined at runtime
    const string valueToSelect = "1902";

    // Setup driver and wait
    ChromeDriver driver = new ChromeDriver();
    WebDriverWait wait = new WebDriverWait(driver, TimeSpan.FromSeconds(5));

    // Load page
    driver.Navigate().GoToUrl("https://ntrl.ntis.gov/NTRL/");
    // Wait until the first (index 0) dropdown list button inside the publication date div is deemed "clickable"
    wait.Until(ExpectedConditions.ElementToBeClickable(driver.FindElementsByCssSelector(dropdownButtonSelector)[0]));

    Console.WriteLine("Element is visible");

    // Open the dropdown list
    driver.FindElementsByCssSelector(dropdownButtonSelector)[0].Click();

    Console.WriteLine("Dropdown should be open");

    // Select the element from the dynamic Javascript built list
    string desiredValueListItemSelector = string.Format(dynamicallyBuiltListItemSelectorTemplate, valueToSelect);
    driver.FindElementByCssSelector(desiredValueListItemSelector).Click();

    Console.WriteLine($"Selected value {valueToSelect} using selector: {desiredValueListItemSelector}");
    Console.ReadLine();

    driver.Close();
}

=============================================== ===========================

Edit2

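Here is a Python answer. I had never written Python before, but this seems to work. I strongly recommend looking at the links posted above about the Page Object Model and explicit waits, and avoiding XPath selectors.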

from selenium import webdriver
from time import sleep

# Set the year to select
fromYearToSelect = "1902"

# Create the driver and load the page
# Raw string so the backslashes in the Windows path are not read as escapes
driver = webdriver.Chrome(r"C:\chromedriver_win32\chromedriver.exe")
driver.get("https://ntrl.ntis.gov/NTRL/")

# Find and click the "From" dropdown; elems[1] is the "To" dropdown
elems = driver.find_elements_by_css_selector("div#advSearchForm\\:datePublPanel div.ui-selectonemenu-trigger")
elems[0].click()

# Select the year
driver.find_element_by_css_selector("#advSearchForm\\:FromYear_items li[data-label='{0}']".format(fromYearToSelect)).click()

# Wait to see the results (we should be using an Explicit Wait  here)
sleep(2)

# Close the driver
driver.close()
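
As the comment in the script admits, the fixed sleep stands in for an explicit wait. Below is a hedged sketch of the same two clicks driven by WebDriverWait instead. The function names css_escape_id, year_item_selector and select_from_year are hypothetical, the 5-second timeout is borrowed from the C# version, and the Selenium imports sit inside the function only so the selector helpers can be used without Selenium installed.

```python
def css_escape_id(element_id):
    """Escape ':' so a JSF-style id can be used in a CSS selector."""
    return element_id.replace(":", "\\:")


def year_item_selector(year, list_id="advSearchForm:FromYear_items"):
    """Build the selector for one item of the JavaScript-built year list."""
    return "#{0} li[data-label='{1}']".format(css_escape_id(list_id), year)


def select_from_year(driver, year, timeout=5):
    """Open the "From" dropdown and click the given year, waiting for each step."""
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    trigger = "div#advSearchForm\\:datePublPanel div.ui-selectonemenu-trigger"

    # With a locator, element_to_be_clickable resolves the first match,
    # which corresponds to elems[0] (the "From" dropdown) in the script above.
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, trigger))).click()

    # Wait for the dynamically built list item instead of sleeping.
    item = year_item_selector(year)
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, item))).click()
```

After driver.get("https://ntrl.ntis.gov/NTRL/"), calling select_from_year(driver, "1902") would replace the two click lines in the script. Each wait raises a TimeoutException if the element never becomes clickable within the timeout.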