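I've built a web scraper. I need it to scrape the price and number of bedrooms for a specific neighborhood. Sometimes span.first_detail_cell returns Furnished, and the rest of the time it returns the price. I need to write something that ignores span.first_detail_cell when it contains Furnished and looks at the next cell for the price instead. I think I need an if statement, but I'm not sure what the condition should be. Any help would be great!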
require 'open-uri'
require 'nokogiri'
require 'csv'

url = "https://streeteasy.com/for-rent/bushwick"
# URI.open is needed on Ruby 3.0+, where Kernel#open no longer fetches URLs
page = Nokogiri::HTML(URI.open(url))

# Collect the page numbers as integers so that 10 sorts above 9
page_numbers = []
page.css("nav.pagination span.page a").each do |line|
  page_numbers << line.text.to_i
end
max_page = page_numbers.max

beds = []
price = []

max_page.to_i.times do |i|
  url = "https://streeteasy.com/for-rent/bushwick?page=#{i+1}"
  page = Nokogiri::HTML(URI.open(url))

  page.css('span.first_detail_cell').each do |line|
    beds << line.text
  end

  page.css('span.price').each do |line|
    price << line.text
  end
end

CSV.open("bushwick_rentals.csv", "w") do |file|
  file << ["Beds", "Price"]
  beds.length.times do |i|
    file << [beds[i], price[i]]
  end
end
Answer 0 (score: 1)
page.css('span.first_detail_cell').each do |line|
  if line.text.include?("Furnished")
    # do something here (e.g. skip this cell)
  else
    beds << line.text
  end
end
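One way to fill in the "do something" branch is to collect the text of every detail cell in a listing and take the first one that isn't the Furnished badge. The question doesn't show the markup of the cells that follow span.first_detail_cell, so this is only a sketch: `first_real_detail` is a hypothetical helper name, and the `span.detail_cell` selector in the comment is an assumption you would need to check against the real page.

```ruby
# Hypothetical helper: given the texts of one listing's detail cells, in page
# order, return the first one that is not the "Furnished" badge.
# Returns nil when every cell is empty or says Furnished.
def first_real_detail(cell_texts)
  cell_texts
    .map(&:strip)
    .find { |text| !text.empty? && !text.include?("Furnished") }
end

# With Nokogiri this might be fed once per listing, for example
# (the selector below is an assumption, not taken from the site):
#   texts = listing.css('span.detail_cell').map(&:text)
#   beds << first_real_detail(texts)

puts first_real_detail(["Furnished", "2 beds", "1 bath"])  # prints "2 beds"
puts first_real_detail(["  1 bed ", "1 bath"])             # prints "1 bed"
```

Doing this once per listing, rather than once per page, also keeps each row of the CSV lined up with the right price when a cell is skipped.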