Web Scraping with Ruby - If Statement

Date: 2017-02-04 19:35:59

Tags: ruby-on-rails ruby web-scraping

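I have built a web scraper. I need it to scrape the price and bedrooms for a specific neighborhood. Sometimes span.first_detail_cell returns Furnished; the rest of the time it returns the price. I need to write something that ignores span.first_detail_cell when it contains Furnished and looks at the next cell for the price instead. I think I need an if statement, but I'm not sure what the condition should be. Any help would be great!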

require 'open-uri'
require 'nokogiri'
require 'csv'

url = "https://streeteasy.com/for-rent/bushwick"
page = Nokogiri::HTML(URI.open(url))  # Kernel#open no longer opens URLs on Ruby 3.0+

page_numbers = []
page.css("nav.pagination span.page a").each do |line|
  page_numbers << line.text.to_i  # integers, so that 10 sorts above 9
end

max_page = page_numbers.max

beds = []
price = []

max_page.to_i.times do |i|

  url = "https://streeteasy.com/for-rent/bushwick?page=#{i+1}"
  page = Nokogiri::HTML(URI.open(url))

  page.css('span.first_detail_cell').each do |line|
    beds << line.text
  end

  page.css('span.price').each do |line|
    price << line.text
  end

end

CSV.open("bushwick_rentals.csv", "w") do |file|
  file << ["Beds", "Price"]

  beds.length.times do |i|
    file << [beds[i], price[i]]
  end
end

1 Answer:

Answer 0 (score: 1)

  page.css('span.first_detail_cell').each do |line|
    if line.text.include?("Furnished")
      # do something here
    else
      beds << line.text
    end
  end
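
One way to fill in the "do something here" branch is to gather each listing's cell texts in order and take the first one that is not the Furnished flag. The following is only a sketch: `first_real_detail` is a hypothetical helper name, and the sample arrays are made-up stand-ins for scraped text, since the actual markup of the page is not shown in the question.

```ruby
# Hypothetical helper (not part of the original scraper): given the texts of
# one listing's detail cells, in page order, return the first cell that is
# not the "Furnished" flag. Returns nil when no such cell exists.
def first_real_detail(cell_texts)
  cell_texts
    .map(&:strip)
    .find { |text| !text.empty? && !text.include?("Furnished") }
end

# Made-up sample data standing in for three scraped listings.
listings = [
  ["Furnished", "2 beds"],  # flag first, real value in the next cell
  ["1 bed", "1 bath"],      # no flag, first cell is already the value
  ["Furnished"]             # flag only, nothing usable follows
]

# One entry per listing, so this array stays aligned with a price array
# that is built the same way.
beds = listings.map { |cells| first_real_detail(cells) }
p beds
```

In the scraper itself, the array for each listing could be built from `line.text` plus the text of the following sibling, for example via Nokogiri's `next_element`, assuming the value you want really is the next sibling element of `span.first_detail_cell`. Pushing `nil` instead of skipping the listing keeps `beds[i]` and `price[i]` referring to the same row in the CSV.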