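I have a controller that receives a URL as a parameter, and I'm trying to scrape the entire page at that URL. But when I try to read the URL, I get the following error: No such file or directory @ rb_sysopen - www.google.com
Controller: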
class PageScraperController < ApplicationController
  require 'nokogiri'
  require 'open-uri'
  require 'diffy'
  require 'htmlentities'

  def scrape
    require 'open-uri'
    @url = watched_link_params.to_s
    @url = @url.slice(9..@url.length-3)
    puts "LOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOG#{@url}"
    page = Nokogiri::HTML(open(@url))
    # coder = HTMLEntities.new
    # encodedHTML = coder.encode(page)
    puts page
  end

  def watched_link_params
    params.require(:default).permit(:url)
  end
end
Answer 0 (score: 1)
Try this:
def scrape
  @url = watched_link_params[:url]
  page = Nokogiri::HTML(open(@url))
  puts page
end
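You need to pass the full URL, including the protocol; that is, use http://www.google.com instead of www.google.com. Without a scheme, open treats the string as a local file path, which is why the error says "No such file or directory":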
>> params = ActionController::Parameters.new(default: {url: 'http://www.google.com'})
>> watched_link_params = params.require(:default).permit(:url)
>> @url = watched_link_params[:url]
"http://www.google.com"
>> page = Nokogiri::HTML(open(@url))
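
If users may submit addresses without a protocol, one option is to add a default scheme before opening the URL. Below is a minimal sketch of that idea; `normalize_url` and its `default_scheme` parameter are hypothetical names that do not appear in the original post, and the choice of `http` as the default is an assumption.

```ruby
# Hypothetical helper (not from the original post): prepend a default
# scheme when the input has none, so open-uri sees a URL rather than
# something that looks like a local file path.
def normalize_url(raw, default_scheme: 'http')
  candidate = raw.to_s.strip
  # Anything that already starts with "scheme://" is returned unchanged.
  return candidate if candidate.match?(%r{\A[a-z][a-z0-9+.\-]*://}i)

  "#{default_scheme}://#{candidate}"
end

# Possible use inside the controller (needs 'open-uri' and 'nokogiri'):
#   page = Nokogiri::HTML(URI.open(normalize_url(watched_link_params[:url])))
```

Note that the commented usage line calls `URI.open` rather than the bare `open` used above: on Ruby 3.0 and later, `Kernel#open` no longer fetches URLs even with open-uri loaded, so `URI.open` is the form that works across versions.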