I am working through the following Nokogiri tutorial: http://hunterpowers.com/data-scraping-and-more-with-ruby-nokogiri-sinatra-and-heroku/
So I am trying to run this script from the terminal:
require 'nokogiri'
require 'open-uri'

url = "http://www.930.com/concerts/#/930/"
data = Nokogiri::HTML(open(url))

# Here is where we use the new method to create an object that holds all the
# concert listings. Think of it as an array that we can loop through. It's
# not an array, but it does respond very similarly.
concerts = data.css('.concert_listing')

concerts.each do |concert|
  # name of the show
  puts concert.at_css('.event').text

  # date of the show
  puts concert.at_css('.date').text

  # time of the show
  puts concert.at_css('.doors').text

  # show price or sold out
  # Remember, when a show is sold out, there is no div with the selector .price
  # What we are doing here is setting price equal to that selector. We then test
  # whether it is nil or not, which lets us know if the show is SOLD OUT.
  price = concert.at_css('.price')
  if !price.nil?
    puts price.text
  else
    puts "SOLD OUT"
  end

  # blank line to make results prettier
  puts ""
end
$ ruby interesting.rb
But nothing happens:
alex@alex-K43U:~/rails/nokogiri$ ruby interesting.rb
alex@alex-K43U:~/rails/nokogiri$
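I have always used Rails for everything, so starting from an empty folder feels a bit confusing to me.
How do I install gems in this folder, and how do I run the script correctly?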
Answer 0 (score: 1)
Looks fine to me! Are you sure this line:
concerts = data.css('.concert_listing')
actually leaves concerts with anything to enumerate? Have you tried printing it?
puts concerts
Answer 1 (score: 1)
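If you visit the site, open the browser console, and inspect the page, you will see that they changed the CSS class for the concerts, so it is no longer .concert_listing.
Analyze the site to see what you can grab and how to grab it with Nokogiri.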