Question

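I use Dashing to monitor trends and website statistics. I created a job that checks Google News trends and Twitter trends.

The data displays fine, but it only appears on the first load and is never updated afterwards. Here is the code for twitter_trends.rb: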

require 'nokogiri'
require 'open-uri'

url = 'http://trends24.in/france/~cloud'

data = Nokogiri::HTML(open(url))
list = data.xpath('//ol/li')

tags = list.collect do |tag|
  tag.xpath('a').text
end

tags = tags.take(10)
tag_counts = Hash.new({value: 0})

SCHEDULER.every '10s' do
  tag = tags.sample
  tag_counts[tag] = {label: tag}

  send_event('twitter_trends', {items: tag_counts.values})
end

I think I am misusing rufus-scheduler to schedule my jobs: https://gist.github.com/pushmatrix/3978821#file-sample_job-rb

How can I update the data periodically?

Answer 1

Your scheduler looks fine, but it looks like you're making one call to the website:

data = Nokogiri::HTML(open(url))

but you never call it again. Is your intent to check that site only once, during the initial processing?

I assume you really want to move more of your logic into the scheduler block, since only the code inside it is rerun each time the scheduled job fires.

Answer 2

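When you wrap everything in the scheduler, you take only one sample every 10 seconds (http://ruby-doc.org/core-2.2.0/Array.html#method-i-sample) and add it to tag_counts. Because tag_counts is re-created inside the block, the tags are cleared on every run. Keep in mind that the scheduler block basically starts from a clean slate each time it runs. I would suggest looping over the tags and adding each one to tag_counts instead of sampling. Sampling is unnecessary anyway, since you already reduce the list to 10 on every run.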
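A minimal sketch of that suggestion, not a definitive implementation. It assumes the widget bound to twitter_trends is a list-style widget that accepts items with a label key, as in the question. build_trend_items is a hypothetical helper name, not part of Dashing. URI.open is used here because Ruby 3.0 and later no longer let Kernel#open fetch URLs.

```ruby
# Hypothetical helper (not part of Dashing): turn the first `limit`
# trend names into the item hashes sent to the widget.
def build_trend_items(tags, limit = 10)
  tags.take(limit).map { |tag| { label: tag } }
end

# SCHEDULER and send_event are provided by Dashing, so this part only
# runs when the file is loaded as a Dashing job.
if defined?(SCHEDULER)
  require 'nokogiri'
  require 'open-uri'

  url = 'http://trends24.in/france/~cloud'

  SCHEDULER.every '10s' do
    # Fetch and parse inside the block so every run sees fresh data.
    page = Nokogiri::HTML(URI.open(url))
    tags = page.xpath('//ol/li').map { |li| li.xpath('a').text }

    # Send all items at once instead of one random sample.
    send_event('twitter_trends', { items: build_trend_items(tags) })
  end
end
```

With this shape the whole list is rebuilt and sent on every run, so there is no need for tag_counts or sample at all.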

Answer 3

If I move SCHEDULER like this (right after the url at the top), it works, but only one random item appears every 10 seconds.

require 'nokogiri'
require 'open-uri'

url = 'http://trends24.in/france/~cloud'

SCHEDULER.every '10s' do
  data = Nokogiri::HTML(open(url))
  list = data.xpath('//ol/li')

  tags = list.collect do |tag|
    tag.xpath('a').text
  end

  tags = tags.take(10)
  tag_counts = Hash.new({value: 0})

  tag = tags.sample
  tag_counts[tag] = {label: tag}

  send_event('twitter_trends', {items: tag_counts.values})
end

How can I display a list of 10 items that is updated periodically?

Updating jobs with Dashing and Ruby
