How to slice and parse data received through a streaming API?

Time: 2016-06-27 13:22:53

Tags: ruby regex streaming meetup

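I'm trying to connect to the Meetup streaming HTTP API and parse the received events into separate records. I'm using Ruby with Sinatra. I chose the 'em-http-request' gem to handle the connection and 'thin' as the server. When I searched for information on how to handle API streams, everything I found was about 3 to 6 years old; this is the best example that I found.

In that example, the author uses a regular expression to find the end of each tweet and split the stream into separate records. In my case, I haven't found a way to split the stream of Meetup events.

This is my code: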

get '/' do
  STREAMING_URL = 'http://stream.meetup.com/2/open_events'
  http = EM::HttpRequest.new(STREAMING_URL).get
  buffer = ""
  http.stream do |chunk|
    buffer += chunk
    while event = buffer.slice!(/{\"utc_offset\"+.../)
      eventRecord = event
      puts eventRecord
    end
  end
end

I open a connection to stream.meetup.com/2/open_events and start receiving a stream of strings that are cut at random points:

{"utc_offset":-14400000,"venue":{"country":"us","city":"Novi","address_1":"43155 Main St Suite 2300N","name":"Game of Clues Escape Room","lon":-83.470833,"state":"MI","lat":42.478107},"rsvp_limit":0,"venue_visibility":"public","visibility":"public","maybe_rsvp_count":0,"description":"<p>Search the Emerald City for clues to help you solve riddles and puzzles to escape the room before the 60 minute timer is up. Work together to complete missions that will bring your group closer together in order to get a clue card.<\/p> \n<p>25.00 per person. Book online at www.gameofclues.com<\/p>","mtime":1467030326494,"event_url":"http:\/\/www.meetup.com\/Escape-Room-Lovers\/events\/232071150\/","yes_rsvp_count":1,"duration":3600000,"payment_required":"0","name":"Game of Clues Escape Room Novi,Mi","id":"232071150","time":1467507600000,"group":{"join_mode":"open","country":"us","city":"Novi","name":"Escape Room Lovers","group_lon":-83.52,"id":20101745,"state":"MI","urlname":"Escape-Room-Lovers","category":{"name":"games","id":11,"shortname":"games"},"group_lat":42.47},"status":"upcoming"}{"utc_offset":14400000,"venue":{"country":"ae","city":"Dubai","address_1":"Jumeirah Lake Towers, outside Dubai Marina Metro","name":"Illuminations Well-Being Center, 409, Fortune Executive Towers, Cluster T, Plot T1, ","lon":55.311668,"lat":25.264444},"rsvp_limit":0,"venue_visibility":"public","visibility":"public","maybe_rsvp_count":0,"description":"<p>Facilitator: Dr. Beryl Bazley<\/p> \n<p>Investment: Free!<\/p> \n<p>For more information call

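I tried calling .slice! on the buffer with {"utc_offset" as the parameter, since that substring appears at the start of every event, but I can't figure out how to write a regular expression that captures everything between one occurrence of that substring and the next, that is, one whole event.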
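One way such a pattern could be written, as a minimal sketch: match from the start of the buffer, lazily, up to a closing brace that is directly followed by the next {"utc_offset". This assumes every event begins with that literal (true in the sample above, but not something the API is known to guarantee) and that the literal never occurs inside a field such as the description. EVENT_PATTERN and take_events are illustrative names, not part of any library.

```ruby
# Assumption: every event starts with the literal {"utc_offset", as in the
# sample above. The API is not known to guarantee this key order.
# Matches one event at the start of the buffer: lazily up to a "}" that is
# followed (after optional whitespace) by the start of the next event.
EVENT_PATTERN = /\A\s*\{"utc_offset".*?\}(?=\s*\{"utc_offset")/m

# Illustrative helper: removes every complete event from the buffer and
# returns them as strings. An incomplete tail stays in the buffer.
def take_events(buffer)
  events = []
  while (event = buffer.slice!(EVENT_PATTERN))
    events << event.strip
  end
  events
end

# Simulate two chunks that cut an event boundary at a random point.
buffer = +''
['{"utc_offset":1,"name":"a"}{"utc_off', 'set":2,"name":"b"}{"utc_offset":3'].each do |chunk|
  buffer << chunk
  take_events(buffer).each { |event| puts event }
end
```

With this pattern, the last event in the buffer is only released once the start of the following event has arrived, because the lookahead is what marks an event as complete.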

Also, I'm not sure that appending each chunk to a buffer variable and then calling .slice! on it is the best way to extract each event.
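An alternative to a regular expression that still uses a buffer is to track brace depth and treat an event as complete when the depth returns to zero. The following is a sketch under the assumption that each event is a single JSON object, as in the sample; EventSplitter and feed are hypothetical names, not part of em-http-request or the Meetup API. It works on bytes so that a multibyte character cut in half by a chunk boundary does not matter.

```ruby
require 'json'

# Illustrative class (not part of em-http-request or the Meetup API).
# Splits a stream of concatenated JSON objects by tracking brace depth,
# ignoring braces that appear inside JSON strings.
class EventSplitter
  OPEN = '{'.ord
  CLOSE = '}'.ord
  QUOTE = '"'.ord
  BACKSLASH = '\\'.ord

  def initialize
    @current = String.new # binary buffer for the event being assembled
    @depth = 0
    @in_string = false
    @escaped = false
  end

  # Accepts one chunk and returns the events (as hashes) completed by it.
  def feed(chunk)
    events = []
    chunk.each_byte do |byte|
      if @in_string
        @current << byte
        if @escaped
          @escaped = false
        elsif byte == BACKSLASH
          @escaped = true
        elsif byte == QUOTE
          @in_string = false
        end
      elsif byte == OPEN
        @depth += 1
        @current << byte
      elsif @depth.zero?
        next # skip whitespace or separators between events
      else
        @current << byte
        if byte == QUOTE
          @in_string = true
        elsif byte == CLOSE
          @depth -= 1
          events << finish_event if @depth.zero?
        end
      end
    end
    events
  end

  private

  # Parses the assembled bytes as UTF-8 JSON and resets the buffer.
  def finish_event
    event = JSON.parse(@current.force_encoding(Encoding::UTF_8))
    @current = String.new
    event
  end
end

# Simulate two chunks that cut the first event in the middle of a string.
splitter = EventSplitter.new
['{"name":"a","venue":{"city":"No', 'vi"}}{"name":"b {"}'].each do |chunk|
  splitter.feed(chunk).each { |event| puts event['name'] }
end
```

In the handler from the question, the stream callback would presumably pass each chunk to feed and iterate over the returned hashes, for example http.stream { |chunk| splitter.feed(chunk).each { |event| puts event['name'] } }. Unlike the regular expression, this does not depend on which key comes first in an event.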

What is the best way to handle this situation?


Here is my implementation of the solution that @jordan proposed in the comments:

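require 'yajl'
require 'uri'
require 'yajl/http_stream'

# Note: this parser instance is not used by the Yajl::HttpStream.get call below.
@parser = Yajl::Parser.new(:symbolize_keys => true)

STREAMING_URL = 'http://stream.meetup.com/2/open_events'
# Each event reaches the block as an already parsed hash with symbol keys.
Yajl::HttpStream.get(STREAMING_URL, :symbolize_keys => true) do |hash|
  puts hash.inspect
  hash.each { |key, value| puts "#{key} is #{value}" }
end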

0 answers:

No answers