Scraping data inside span tags with Nokogiri

Time: 2013-09-24 19:45:51

Tags: javascript ruby web-scraping screen-scraping nokogiri

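How can I use Nokogiri to scrape the name and price of each product on the page http://www.tradus.com/t-shirts-tees-reebok-puma-fifa-teesort/t/7682?Type=polo+neck, and how can I scrape every product in this category across all of the paginated result pages? Below is the code I use to get the price, but it returns the price wrapped in HTML tags, and it only works for one page.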

require 'nokogiri'
require 'open-uri'

url = "http://www.tradus.com/t-shirts-tees-reebok-puma-fifa-teesort/t/7682?Type=polo+neck"
doc = Nokogiri::HTML(open(url))
doc.css(".prodListing-item").each do |dv|
  product_name = dv.at_css('.prod-name').text unless dv.at_css(".prod-name").nil?
  product_price = dv.at_css('.price-info span span:nth-child(2)').to_s
  puts product_name + product_price
end

1 answer:

Answer 0 (score: 1)

The following code resolved the issue:

require 'nokogiri'
require 'open-uri'

number = 1
while true
  url = "http://www.tradus.com/t-shirts-tees-reebok-puma-fifa-teesort/t/7682?Type=polo+neck&page=#{number}"
  # URI.open needs Ruby 2.5+; plain open(url) no longer accepts URLs as of Ruby 3.0
  doc = Nokogiri::HTML(URI.open(url))
  products = doc.css(".prodListing-item")
  # Stop once a page comes back with no products
  break if products.empty?
  products.each do |item|
    # .text returns the text content, whereas .to_s returns the HTML markup;
    # &. leaves the value nil when the element is missing
    product_name = item.at_css('.prod-name')&.text
    product_price = item.at_css('.price-info span span:nth-child(2)')&.text
    # Interpolation prints nil as an empty string instead of raising on String + nil
    puts "#{product_name}<==========>#{product_price}"
  end
  puts "page#{number}"
  number += 1
end
puts "exit of the while loop"