I hope you can help me with my problem.
I wrote a Ruby script that works when I run it manually.
But when it runs from crontab with a line like
*/1 * * * * cd /Users/diogo/workspace/outros/crawler_trf/ && /Users/diogo/.rvm/wrappers/ruby-2.3.0@crawler/ruby get_news.rb >> /tmp/crawler_trf.out
I get this error:
/Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/net/http.rb:882:in `rescue in block in connect': Failed to open TCP connection to www.folhadirigida.com.br:80 (getaddrinfo: nodename nor servname provided, or not known) (SocketError)
from /Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/net/http.rb:879:in `block in connect'
from /Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/timeout.rb:91:in `block in timeout'
from /Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/timeout.rb:101:in `timeout'
from /Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/net/http.rb:878:in `connect'
from /Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/net/http.rb:863:in `do_start'
from /Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/net/http.rb:852:in `start'
from /Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/open-uri.rb:319:in `open_http'
from /Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/open-uri.rb:737:in `buffer_open'
from /Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/open-uri.rb:212:in `block in open_loop'
from /Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/open-uri.rb:210:in `catch'
from /Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/open-uri.rb:210:in `open_loop'
from /Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/open-uri.rb:151:in `open_uri'
from /Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/open-uri.rb:717:in `open'
from /Users/diogo/.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/open-uri.rb:35:in `open'
from get_news.rb:6:in `read_data'
from get_news.rb:44:in `<main>'
The script file contains the following code:
require 'nokogiri'
require 'open-uri'
require 'mail'

def read_data
  page = Nokogiri::HTML(open('http://www.folhadirigida.com.br/fd/Satellite/concursos/noticias-TRFRJES-area-de-apoio-2016-2000153335216/'))
  title = page.css('#tblResult').css('tr')[1].css('a').first.text rescue nil
  message = page.css('#tblResult').css('tr')[1].text rescue nil
  file_name = 'trf'
  last_title = File.open("/tmp/#{file_name}", 'r') { |file| file.read } rescue nil
  if title && last_title != title
    send_email(title, message)
    File.open("/tmp/#{file_name}", 'w') { |file| file.write(title) }
  else
    send_email("Nada novo", "")
  end
end

def send_email(title, message)
  Mail.defaults do
    delivery_method :smtp, {
      :address => 'smtp.gmail.com',
      :port => 587,
      :domain => 'gmail.com',
      :user_name => 'myemail',
      :password => 'mypass',
      :authentication => :plain,
      :enable_starttls_auto => true
    }
  end
  Mail.deliver do
    to 'myemail'
    from 'Concursos - Novidades TRF <myemail>'
    subject 'Novidade sobre o TRF'
    content_type 'text/html; charset=UTF-8'
    body "<h1>#{title}</h1>#{message}"
  end
end

read_data
If I run curl with the same URL, it works perfectly... I don't know what is going on. I need an explanation and a solution.
Answer 0 (score: 0)
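Hmm, that sounds like a DNS problem that only shows up when the script runs under cron.
I would try adding require "resolv-replace.rb" at the top of your script, as described here.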
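A minimal sketch of what that could look like, assuming the failure really is in name resolution. The require line is the suggestion itself; the `with_dns_retry` helper and its `attempts` and `wait` parameters are hypothetical additions, not part of the original script, and only show one way to keep a single failed lookup from aborting the run.

```ruby
# Loaded at the top, as suggested. It switches Ruby's socket classes over to
# the pure-Ruby Resolv resolver and has to run before the first connection.
require 'resolv-replace'
require 'open-uri'

# Hypothetical helper (not in the original script): runs the block again when
# name resolution raises SocketError, up to `attempts` tries in total.
def with_dns_retry(attempts: 3, wait: 2)
  tries = 0
  begin
    yield
  rescue SocketError => e
    tries += 1
    raise if tries >= attempts # give up and let the original error through
    warn "DNS lookup failed (#{e.message}), retry #{tries} of #{attempts - 1}"
    sleep wait
    retry
  end
end
```

In `read_data` the fetch would then become something like `with_dns_retry { open(url) }` on Ruby 2.3. On Ruby 3.0 and later the call has to be `URI.open(url)`, because `open` no longer accepts URLs there.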
I'll try it when I get home later and test it on my Mac.
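For that kind of test, a small diagnostic along these lines might help narrow things down. It is only a sketch: `RESOLVERS` and `resolution_report` are hypothetical names, and the idea is simply to look up the same host through the system resolver and through Ruby's Resolv library, then compare what a terminal run and a cron run print.

```ruby
require 'socket'
require 'resolv'

# Hypothetical diagnostic (not part of the original script).
RESOLVERS = {
  # getaddrinfo(3), the call named in the error message of the question
  'system' => ->(host) { Addrinfo.getaddrinfo(host, nil).map(&:ip_address).uniq },
  # pure-Ruby resolver, the one resolv-replace switches the sockets to
  'resolv' => ->(host) { Resolv.getaddresses(host) }
}.freeze

# Returns one line per resolver: the addresses found, or the failure message.
def resolution_report(host, resolvers = RESOLVERS)
  resolvers.map do |name, lookup|
    begin
      addresses = lookup.call(host)
      "#{name}: #{addresses.empty? ? 'no addresses' : addresses.join(', ')}"
    rescue SocketError, Resolv::ResolvError => e
      "#{name}: FAILED (#{e.message})"
    end
  end
end
```

Running `puts resolution_report('www.folhadirigida.com.br')` once from the terminal and once from the crontab entry should show whether only the system lookup fails under cron, which is the case where resolv-replace would be expected to help.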