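I'm trying to request the JSON pages for several subreddits and pull the title and link out of each page for a college project. Here is the code in question: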
require 'rufus-scheduler'
require 'json'
require 'httparty'
ENV['TZ'] = 'Europe/Dublin'
scheduler = Rufus::Scheduler::singleton
scheduler.every '12h00m', :first_at => Time.now + 10 do
  array_of_subreddits = ["pics", "memes", "funny", "aww", "memes",
                         "birdswitharms"]
  array_of_subreddits.each do |category|
    sleep 10 # wait 10 seconds between each request
    @response = JSON.parse(HTTParty.get("http://reddit.com/r/#{category}/.json?limit=25").body)
    @response['data']['children'].each do |data|
      @link = data['data']['url']
      @title = data['data']['title']
      @category = category
      Pic.create([{:title => "#{@title}", :link => "#{@link}", :category => "#{@category}"}])
    end
  end
end
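Sometimes this works perfectly: it runs through every subreddit and finishes as it should. However, after one or two passes it gives me this message: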
NoMethodError (undefined method `[]' for nil:NilClass):
app/controllers/home_controller.rb:17:in `block in index'
app/controllers/home_controller.rb:9:in `each'
app/controllers/home_controller.rb:9:in `index'
Rendered /Users/conorbreen/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/actionpack-4.2.6/lib/action_dispatch/middleware/templates/rescues/_source.erb (4.8ms)
Rendered /Users/conorbreen/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/actionpack-4.2.6/lib/action_dispatch/middleware/templates/rescues/_trace.html.erb (2.2ms)
Rendered /Users/conorbreen/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/actionpack-4.2.6/lib/action_dispatch/middleware/templates/rescues/_request_and_response.html.erb (1.2ms)
Rendered /Users/conorbreen/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/actionpack-4.2.6/lib/action_dispatch/middleware/templates/rescues/diagnostics.html.erb within rescues/layout (66.2ms)
Rendered /Users/conorbreen/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/web-console-2.3.0/lib/web_console/templates/_markup.html.erb (0.4ms)
Rendered /Users/conorbreen/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/web-console-2.3.0/lib/web_console/templates/_inner_console_markup.html.erb within layouts/inlined_string (0.3ms)
Rendered /Users/conorbreen/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/web-console-2.3.0/lib/web_console/templates/_prompt_box_markup.html.erb within layouts/inlined_string (0.3ms)
Rendered /Users/conorbreen/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/web-console-2.3.0/lib/web_console/templates/style.css.erb within layouts/inlined_string (0.5ms)
Rendered /Users/conorbreen/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/web-console-2.3.0/lib/web_console/templates/console.js.erb within layouts/javascript (51.6ms)
Rendered /Users/conorbreen/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/web-console-2.3.0/lib/web_console/templates/main.js.erb within layouts/javascript (0.3ms)
Rendered /Users/conorbreen/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/web-console-2.3.0/lib/web_console/templates/error_page.js.erb within layouts/javascript (0.5ms)
Rendered /Users/conorbreen/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/web-console-2.3.0/lib/web_console/templates/index.html.erb (124.8ms)

Answer 0 (score: 2)
Creating a client class is a better way to use HTTParty:
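class RedditClient
  include HTTParty
  format :json
  base_uri "http://reddit.com/r/"

  # Fetches the listing for one subreddit; opts are sent as query parameters.
  def self.get_category(category, opts = {})
    query = { limit: 25 }.merge(opts) # default to 25 posts unless overridden
    get("/#{category}.json", query: query)
  end
end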
This way HTTParty handles the JSON parsing for us and won't try to convert an empty response. It is also much easier to test in isolation.
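To see why the empty-response case matters when you call JSON.parse yourself: it raises on an empty or non-JSON body. Here is a minimal sketch of a guard, using only the standard library. The helper name parse_json_or_nil is made up for illustration and is not part of HTTParty.

```ruby
require 'json'

# Hypothetical helper: returns the parsed JSON, or nil when the body is
# missing, blank, or not valid JSON (for example an HTML error page).
def parse_json_or_nil(body)
  return nil if body.nil? || body.strip.empty?
  JSON.parse(body)
rescue JSON::ParserError
  nil
end

parsed = parse_json_or_nil('{"data":{"children":[]}}')
puts parsed.inspect
puts parse_json_or_nil('').inspect
```

The caller then only has to handle a nil result instead of rescuing a parser exception in the middle of the loop.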
However, you should still check that the response was successful before trying to use it:
@response = RedditClient.get_category(category)
if @response.success?
  attrs = @response['data']['children'].map do |child|
    {
      category: category,
      link: child['data']['url'],
      title: child['data']['title']
    }
  end
  Pic.create!(attrs)
else
  # log it or raise some sort of error
end
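Note that you were passing an array containing a single hash to .create. You can instead pass an array of hashes and create all the records with one call; ActiveRecord still runs a separate INSERT for each record.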
Answer 1 (score: 1)
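When you hit an error like this, you should always dump the actual response so that you can inspect it. The fact that the error is raised on nil while running something like ['data']['children'] means, I would guess, that you did get a JSON response but that it was missing one of the first keys (e.g. ['data'] returned nil).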
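Don't just assume that every request succeeds; many things can make an HTTP request fail. You may also get a valid JSON response that is simply not the one you expected, e.g. an error message that tells you what the problem is.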
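One way to act on this is a sketch like the following, which uses Hash#dig (Ruby 2.3+) so that a missing key yields nil instead of raising, and logs the payload when it is not what you expected. The helper name extract_posts and both sample payloads are made up for illustration; the second one only stands in for whatever error body the server might send.

```ruby
require 'json'

# Hypothetical helper: returns an array of attribute hashes, or [] (after
# logging the payload) when the expected 'data' -> 'children' keys are absent.
def extract_posts(payload, category)
  children = payload.is_a?(Hash) ? payload.dig('data', 'children') : nil
  unless children.is_a?(Array)
    warn "Unexpected response for #{category}: #{payload.inspect}"
    return []
  end
  children.map do |child|
    { category: category, link: child.dig('data', 'url'), title: child.dig('data', 'title') }
  end
end

# Made-up sample payloads: one well-formed listing, one error message.
listing = JSON.parse('{"data":{"children":[{"data":{"url":"http://example.com/a.jpg","title":"A"}}]}}')
failure = JSON.parse('{"message":"Too Many Requests","error":429}')

p extract_posts(listing, 'pics')
p extract_posts(failure, 'pics')
```

The warn line is the "dump the actual response" step: when a pass fails, the log shows what came back instead of a bare NoMethodError.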
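Even with the 10-second delay you may be hitting a rate limit (I have never tested Reddit's myself), so read the rules:
"Many default User-Agents (like "Python/urllib" or "Java") are drastically limited to encourage unique and descriptive user-agent strings."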
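As a rough illustration of sending a descriptive User-Agent, here is a sketch that builds the request with the standard library's Net::HTTP rather than HTTParty, so it runs without any gems and without touching the network. The user-agent string and the helper name build_listing_request are placeholders I made up; use your own app name and contact.

```ruby
require 'net/http'
require 'uri'

# Placeholder value (an assumption): identify your own app and account here.
USER_AGENT = 'college-pics-project/0.1 (by /u/your_username)'

# Hypothetical helper: builds, but does not send, a GET request for one
# subreddit listing with the custom User-Agent header set.
def build_listing_request(category, limit = 25)
  uri = URI("http://reddit.com/r/#{category}/.json?limit=#{limit}")
  request = Net::HTTP::Get.new(uri)
  request['User-Agent'] = USER_AGENT
  request
end

request = build_listing_request('pics')
puts request['User-Agent']
puts request.path
```

If you stay with HTTParty, the same header can be passed through the headers: option of HTTParty.get.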
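Answer 2 (score: 0)
This kind of error is very common in Ruby and Rails, and it can be handled in several ways. As @Stefan suggested, you can use any of the following.
Most people prefer this: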
# Double quotes are required for #{category} to be interpolated.
response = HTTParty.get("http://reddit.com/r/#{category}/.json?limit=25")
if response.success?
  response_body = response.body
  # continue
end
Or:
response = HTTParty.get("http://reddit.com/r/#{category}/.json?limit=25")
case response.code
when 200
  puts "Good!"
  # Continue your parsing
when 404
  puts "NOT FOUND!"
when 500...600
  puts "ERROR #{response.code}"
end
Or:
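begin
  response = HTTParty.get("http://reddit.com/r/#{category}/.json?limit=25")
rescue HTTParty::Error
  # HTTParty-specific errors, e.g. too many redirects
  # (a 404 does not raise by default, so still check response.code)
rescue StandardError
  # other errors, e.g. timeouts or refused connections
else
  # continue
end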
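Putting the status check and the rescue together, here is a rough sketch of a loop that skips a failing subreddit instead of aborting the whole pass. To keep it runnable without the network, the HTTP call is injected as a lambda and faked with a small Struct. FakeResponse, fetch_all and the sample status codes are all made up for illustration.

```ruby
require 'timeout'

# Hypothetical stand-in for an HTTP response; it only has the two readers
# this sketch uses (code and body).
FakeResponse = Struct.new(:code, :body)

# Hypothetical helper: calls fetcher for each category and collects the
# bodies of the 200 responses, logging and skipping everything else.
def fetch_all(categories, fetcher)
  categories.each_with_object({}) do |category, results|
    begin
      response = fetcher.call(category)
      if response.code == 200
        results[category] = response.body
      else
        warn "#{category}: HTTP #{response.code}, skipping"
      end
    rescue StandardError => e
      warn "#{category}: #{e.class}: #{e.message}, skipping"
    end
  end
end

# Fake fetcher: one success, one non-200 status, one raised timeout.
fake_fetcher = lambda do |category|
  case category
  when 'pics'  then FakeResponse.new(200, '{"data":{"children":[]}}')
  when 'memes' then FakeResponse.new(429, '')
  else raise Timeout::Error, 'execution expired'
  end
end

p fetch_all(%w[pics memes funny], fake_fetcher).keys
```

In the real job the lambda would wrap the HTTParty.get call from the snippets above, since its response also answers code and body.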