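I need to stream tweets and store them in MongoDB for processing. I have installed Ruby along with the mongo and tweetstream gems.
I run the following code to fetch tweets and store them in a collection named "users" in the "tweet" database in MongoDB. This is the program, rawks.rb: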
require "tweetstream"
require "mongo"
require "time"
db = Mongo::Connection.new("localhost", 27017).db("tweet")
tweets = db.collection("users")
TweetStream::Daemon.new("username","password","scrapedaemon").on_error do |message|
  # Log your error message somewhere
end.filter({"locations" => "-12.72216796875, 49.76707407366789, 1.977539, 61.068917"}) do |status|
  # Do things when nothing's wrong
  data = {
    "created_at" => Time.parse(status.created_at),
    "text" => status.text,
    "geo" => status.geo,
    "coordinates" => status.coordinates,
    "id" => status.id,
    "id_str" => status.id_str
  }
  tweets.insert({"data" => data})
end
When I run this file, I get the following error: from rawks.rb:8:in 'new', from rawks.rb:8:in ''
in daemon.rb:40:in 'initialize': wrong number of arguments (3 for 2) (ArgumentError)
This is the daemon.rb file:
require 'daemons'
# A daemonized TweetStream client that will allow you to
# create backgroundable scripts for application specific
# processes. For instance, if you create a script called
# <tt>tracker.rb</tt> and fill it with this:
#
#     require 'rubygems'
#     require 'tweetstream'
#
#     TweetStream.configure do |config|
#       config.consumer_key = 'abcdefghijklmnopqrstuvwxyz'
#       config.consumer_secret = '0123456789'
#       config.oauth_token = 'abcdefghijklmnopqrstuvwxyz'
#       config.oauth_token_secret = '0123456789'
#       config.auth_method = :oauth
#     end
#
#     TweetStream::Daemon.new('tracker').track('intridea') do |status|
#       # do something here
#     end
#
# And then you call this from the shell:
#
#     ruby tracker.rb start
#
# A daemon process will spawn that will automatically
# run the code in the passed block whenever a new tweet
# matching your search term ('intridea' in this case)
# is posted.
#
class TweetStream::Daemon < TweetStream::Client
  DEFAULT_NAME = 'tweetstream'.freeze
  DEFAULT_OPTIONS = {:multiple => true}

  attr_accessor :app_name, :daemon_options

  # The daemon has an optional process name for use when querying
  # running processes. You can also pass daemon options.
  def initialize(name = DEFAULT_NAME, options = DEFAULT_OPTIONS)
    @app_name = name
    @daemon_options = options
    super({})
  end

  def start(path, query_parameters = {}, &block) #:nodoc:
    Daemons.run_proc(@app_name, @daemon_options) do
      super(path, query_parameters, &block)
    end
  end
end
Answer 0 (score: 0)
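You call
TweetStream::Daemon.new("username","password","scrapedaemon")
with three arguments, but the constructor accepts at most two, and the second is an options hash:
initialize(name = DEFAULT_NAME, options = DEFAULT_OPTIONS)
(There seem to be different versions of the documentation: the one on ruby-doc.org shows the usage you tried, but the source you are using looks more like what is described here: http://rdoc.info/github/intridea/tweetstream/TweetStream/Daemon)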
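To see the arity problem in isolation, here is a minimal sketch that runs without the tweetstream gem. `StandInDaemon` and `arity_error_message` are hypothetical names made up for this illustration; the class copies only the constructor signature quoted above and is not the real `TweetStream::Daemon`.

```ruby
# Hypothetical stand-in: mirrors only the initialize signature quoted from
# daemon.rb. It does not connect to Twitter or daemonize anything.
class StandInDaemon
  DEFAULT_NAME = 'tweetstream'.freeze
  DEFAULT_OPTIONS = { :multiple => true }.freeze

  attr_reader :app_name, :daemon_options

  def initialize(name = DEFAULT_NAME, options = DEFAULT_OPTIONS)
    @app_name = name
    @daemon_options = options
  end
end

# Hypothetical helper: returns the ArgumentError message raised by the
# block, or nil if the block finishes without raising one.
def arity_error_message
  yield
  nil
rescue ArgumentError => e
  e.message
end

# Three positional arguments, as in rawks.rb: rejected by Ruby.
three_args_error = arity_error_message do
  StandInDaemon.new("username", "password", "scrapedaemon")
end

# One argument (the process name): accepted.
named = StandInDaemon.new("scrapedaemon")

# Process name plus an options hash: also accepted.
with_options = StandInDaemon.new("scrapedaemon", { :multiple => false })

puts "three arguments: #{three_args_error.inspect}"
puts "one argument: app_name=#{named.app_name}"
puts "two arguments: daemon_options=#{with_options.daemon_options.inspect}"
```

The exact wording of the error message depends on the Ruby version. With the version of the gem whose source you posted, the credentials would presumably go into a `TweetStream.configure` block, as the comment at the top of daemon.rb shows, and the constructor call would become `TweetStream::Daemon.new("scrapedaemon")`.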