Question

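I am new to Ruby and Selenium.

I'm trying to write a script that opens a web page (not one that I run), looks through a list of items there, and clicks the "Details" link of the item that meets certain criteria. A stripped-down version of the page would be: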

<div class="list">

<div class="item">
    <div class="description">Cat</div>
    <div class="price">$3.00</div>
    <div class="detailslink">
        <a href="http://a.htm">Details</a>
    </div>
</div>

<div class="item">
    <div class="description">Dog</div>
    <div class="price">$4.00</div>
    <div class="detailslink">
        <a href="http://b.htm">Details</a>
    </div>
</div>

<div class="item">
    <div class="description">Cat</div>
    <div class="price">$4.00</div>
    <div class="detailslink">
        <a href="http://c.htm">Details</a>
    </div>
</div>

<div class="item">
    <div class="description">Bird</div>
    <div class="price">$3.00</div>
    <div class="detailslink">
        <a href="http://d.htm">Details</a>
    </div>
</div>

</div>

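An example of what I want to do is click the "Details" link for the most expensive non-dog animal. I'm guessing I would use find_elements to build an array of all the "item" class elements that don't contain the word "dog", find the index of the highest price in that array, and click the link in the corresponding "detailslink", but I don't know how to write that in Ruby.

Ideally it would also refresh every 30 seconds if no list items meet the criteria (there are no "item" divs within the "list" div, or all of them contain Cat). Here is what I have so far (I know it's missing a lot!):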

require "selenium-webdriver"
browser = Selenium::WebDriver.for :chrome
browser.get "http://list.htm"
for i in 0..1
    items = browser.find_elements(:class=>"item")
    #Do testing here. If there are non-cats, get the index of the max.
        break
    end
    sleep(30)
    browser.get "http://list.htm"
    redo
end
#find the nth element based on the test above
browser.find_element(:class, "detailslink")[index].click

Any help is greatly appreciated!

Answer 1

For those of us using Ruby, there is a nice cheat sheet covering how to use find_element and find_elements.

https://gist.github.com/huangzhichong/3284966#file-selenium-webdriver-cheatsheet-md

Answer 2

I don't think there is a general solution, but for your specific example:

require 'selenium-webdriver'

browser = Selenium::WebDriver.for :firefox
browser.navigate.to 'C:\Scripts\Misc\Programming\Selenium-Webdriver\test.htm'

# Refresh the page until there is at least 1 dog
items = browser.find_elements(:class=> 'item')
dog_items = items.find_all{ |item| item.find_element(:class => 'description').text == 'Dog' }   
while dog_items.length == 0
  sleep(30)
  browser.navigate.refresh
  items = browser.find_elements(:class=> 'item')
  dog_items = items.find_all{ |item| item.find_element(:class => 'description').text == 'Dog' }     
end

# Select the dog with the greatest price
most_expensive = dog_items.sort_by{ |dog| dog.find_element(:class => 'price').text.delete('$').to_f }.last

# Click the selected dog
most_expensive.find_element(:css => '.detailslink a').click
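
The question asks for the priciest non-dog item and for a reload every 30 seconds when nothing qualifies, whereas the snippet above waits for dogs. A minimal sketch of that variant follows. The helper names `most_expensive_non_dog` and `poll_until` are hypothetical, and the code assumes only that each item responds to `find_element` and that the result responds to `text`, as Selenium elements do.

```ruby
# Hypothetical helper: returns the priciest item whose description is not "Dog",
# or nil when no item qualifies. `items` is expected to be the array returned
# by browser.find_elements(:class => 'item').
def most_expensive_non_dog(items)
  candidates = items.reject do |item|
    item.find_element(:class => 'description').text.strip.casecmp?('dog')
  end
  candidates.max_by do |item|
    item.find_element(:class => 'price').text.delete('$').to_f
  end
end

# Hypothetical helper: runs the block until it returns a truthy value,
# sleeping `interval` seconds between attempts. Returns nil if `max_attempts`
# is set and that many attempts have all failed.
def poll_until(interval: 30, max_attempts: nil)
  attempts = 0
  loop do
    result = yield
    return result if result
    attempts += 1
    return nil if max_attempts && attempts >= max_attempts
    sleep(interval)
  end
end
```

With a real driver this could be wired up as `winner = poll_until { browser.navigate.refresh; most_expensive_non_dog(browser.find_elements(:class => 'item')) }`, followed by `winner.find_element(:css => '.detailslink a').click`.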

Answer 3

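I have never actually tried using Selenium, but with nokogiri it would look something like this (I made it more verbose for clarity; obviously some of the methods could be chained):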

require 'open-uri'
require 'nokogiri'
# URI.open is the open-uri entry point; Kernel#open no longer accepts URLs as of Ruby 3.0
doc = Nokogiri::HTML(URI.open("http://list.htm"))
items = doc.css(".item")
non_dog_items = items.reject{|item| item.at_css(".description").text == "Dog"}
most_expensive_non_dog_item = non_dog_items.max_by{|item| item.at_css(".price").text.gsub("$",'').to_f}
# the class in the page is "detailslink" (all lowercase), and the selector must match that case
link_to_most_expensive_non_dog_item = most_expensive_non_dog_item.at_css(".detailslink a")["href"]
#=> "http://c.htm"

The only catch is that if two items share the highest price, max_by returns only the first of them.
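
If ties matter, one option is to find the top price first and then keep every item at that price. This is a minimal sketch that uses plain hashes in place of parsed nodes; the name `all_most_expensive` and the sample data (including `http://e.htm`) are illustrative assumptions, not part of the original page.

```ruby
# Hypothetical helper: returns every item whose :price equals the maximum,
# in their original order. Returns [] for an empty list.
def all_most_expensive(items)
  top = items.map { |item| item[:price] }.max
  items.select { |item| item[:price] == top }
end

# Illustrative data with two items tied at the highest price.
sample = [
  { description: "Cat",  price: 4.0, link: "http://c.htm" },
  { description: "Bird", price: 4.0, link: "http://e.htm" },
  { description: "Cat",  price: 3.0, link: "http://a.htm" }
]

puts all_most_expensive(sample).map { |item| item[:link] }.inspect
# prints ["http://c.htm", "http://e.htm"]
```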

You could also return all the items as hashes and then work only with the hashes:

require 'open-uri'
require 'nokogiri'

doc = Nokogiri::HTML(open('/scripts/test.html'))
items = doc.css(".item").reject{|item| item.css(".description").text == "Dog"}
items_hash = items.map do |item|
  {description: item.css(".description").text,
   price: item.css(".price").text.gsub("$",'').to_f,
   # the class in the page is "detailslink" (all lowercase)
   link: item.at_css(".detailslink a")["href"]
  }
end
#=> [{:description=>"Cat", :price=>3.0, :link=>"http://a.htm"},{:description=>"Cat", :price=>4.0, :link=>"http://c.htm"},{:description=>"Bird", :price=>3.0, :link=>"http://d.htm"}]
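
Once the items are plain hashes, picking the link to follow needs no further parsing. A minimal sketch, assuming hashes shaped like the output above; the helper name `link_of_most_expensive` is hypothetical.

```ruby
# Hypothetical helper: returns the :link of the highest-priced hash,
# or nil when the list is empty.
def link_of_most_expensive(items_hash)
  best = items_hash.max_by { |item| item[:price] }
  best && best[:link]
end

# Same data as the output shown above.
items_hash = [
  { description: "Cat",  price: 3.0, link: "http://a.htm" },
  { description: "Cat",  price: 4.0, link: "http://c.htm" },
  { description: "Bird", price: 3.0, link: "http://d.htm" }
]

puts link_of_most_expensive(items_hash)
# prints http://c.htm
```

The returned string could then be passed to something like `browser.get` if the page has to be opened in a real browser.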

Using Ruby and Selenium WebDriver find_elements to select and click a specific link based on criteria

3 answers: