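Find elements in an array that differ from the previous element

Asked: 2014-05-09 07:56:38

Tags: ruby arrays

I have a list of events that describe the internal state of a system. The system can be either playing or buffering. Only the "data" part changes. Say I have these events: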

events = [
{ timestamp: 1399621649624, data: "buffering" }, 
{ timestamp: 1399621649912, data: "playing" }, 
{ timestamp: 1399621655253, data: "buffering" }, 
{ timestamp: 1399621655536, data: "playing" }, 
{ timestamp: 1399621661537, data: "playing" }, 
{ timestamp: 1399621662404, data: "buffering" }, 
{ timestamp: 1399621662745, data: "playing" }, 
{ timestamp: 1399621674306, data: "buffering" }, 
{ timestamp: 1399621674540, data: "playing" }, 
]

I want to find the timestamps (start and end) of each buffering period, i.e. from the data above I need to find:

from 1399621649624 to 1399621649912
from 1399621655253 to 1399621655536
from 1399621662404 to 1399621662745
from 1399621674306 to 1399621674540

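I have this code, and it works fine, but is there a more straightforward (Rubyesque, even?) way to do this? I think it boils down to starting with one type and finding the element that differs from the previous type (because there can be consecutive "playing" states).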

state = nil
buffer_start_time = nil
buffer_end_time = nil

events.each do |event|
  if event[:data] == "buffering"
    if state == "playing" or state.nil? # if we didn't buffer already
      buffer_start_time = event[:timestamp]
    end
    state = "buffering"
    next
  end

  if state == "buffering" and event[:data] == "playing"
    state = "playing"
    buffer_end_time = event[:timestamp]
    puts "Buffering from #{buffer_start_time} to #{buffer_end_time}"
  end
end

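Note that buffering - buffering - playing - playing is also possible, in which case I am of course interested in the first "buffering" event and the first "playing" event. My code catches this by testing whether the state is playing; only then does it update the timestamp.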
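
To make that case concrete, here is a minimal sketch that wraps the loop above in a method and collects the pairs instead of printing them. The method name buffering_periods and the small timestamps are made up for illustration; the logic is meant to mirror the loop, not replace it.

```ruby
# Hypothetical helper: same logic as the loop above, but returns
# [start, end] pairs instead of printing them.
def buffering_periods(events)
  periods = []
  state = nil
  buffer_start_time = nil

  events.each do |event|
    if event[:data] == "buffering"
      # Only remember the start time if we were not buffering already.
      buffer_start_time = event[:timestamp] if state != "buffering"
      state = "buffering"
    elsif state == "buffering" && event[:data] == "playing"
      state = "playing"
      periods << [buffer_start_time, event[:timestamp]]
    end
  end

  periods
end

# buffering - buffering - playing - playing (made-up timestamps)
sample = [
  { timestamp: 1, data: "buffering" },
  { timestamp: 2, data: "buffering" },
  { timestamp: 3, data: "playing" },
  { timestamp: 4, data: "playing" }
]

result = buffering_periods(sample)
# The period runs from the first "buffering" to the first "playing" event.
```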

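3 answers:

Answer 0 (score: 3)

This should work for all of the cases described, although the output you want is undefined when the last event is a buffering event. Obviously it has no end time yet, so we can't output from X to Y. However, the last event (or the first buffering event in a trailing run of buffering events) will be the return value of the .reduce call, so you can capture that value and, if it is a buffering event, handle it however you like.

Basically, this is a simple approach: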

  1. A transition from buffering to playing => output from X to Y.

  2. Update the previous event to the current event, unless both are buffering events. In that case keep the first buffering event, because it marks the start of the buffering period and we need it later.

    events.reduce do |prev, cur|
      if prev[:data] == 'buffering' and cur[:data] == 'playing'
        puts 'from %d to %d' % [prev[:timestamp], cur[:timestamp]]
      end

      # With 2+ buffers in a row, keep the starting buffer, we need its timestamp as start
      (prev[:data] == 'buffering' and cur[:data] == 'buffering') ? prev : cur
    end
    
    
    # Output 
    # from 1399621649624 to 1399621649912
    # from 1399621655253 to 1399621655536
    # from 1399621662404 to 1399621662745
    # from 1399621674306 to 1399621674540
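
The remark about capturing the return value of .reduce can be sketched as follows. This is only an illustration under assumptions: the event list is shortened with made-up timestamps, and the names periods, last and open_since are not part of the original answer.

```ruby
# Shortened, made-up event list that ends while still buffering.
events = [
  { timestamp: 100, data: "buffering" },
  { timestamp: 200, data: "playing" },
  { timestamp: 300, data: "buffering" },
  { timestamp: 350, data: "buffering" }
]

periods = []

# Same reduce as above, but collecting pairs and keeping the return value.
last = events.reduce do |prev, cur|
  if prev[:data] == "buffering" && cur[:data] == "playing"
    periods << [prev[:timestamp], cur[:timestamp]]
  end

  # Keep the first buffering event of a run; otherwise move on to cur.
  (prev[:data] == "buffering" && cur[:data] == "buffering") ? prev : cur
end

# reduce returns nil for an empty list. Otherwise `last` is the final event,
# or the first event of a trailing buffering run.
open_since = last[:timestamp] if last && last[:data] == "buffering"
```

Here open_since holds the start of the unfinished buffering period, and it stays nil when the list ends in a playing event.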
    

Answer 1 (score: 2)

events.chunk { |event| event[:data] }.each_cons(2).select do |(type, _), _|
  type == 'buffering'
end.each do |(_, (buffer, _)), (_, (other, _))|
  puts "Buffering from #{buffer[:timestamp]} to #{other[:timestamp]}"
end

What this code does:

  1. chunk groups consecutive items by their :data value, so you get something like this:

    [["buffering", [{ timestamp: 1399621649624, data: "buffering" }]], 
     ["playing", [{ timestamp: 1399621649912, data: "playing" }]], 
     ["buffering", [{ timestamp: 1399621655253, data: "buffering" }]], 
     ["playing", [{ timestamp: 1399621655536, data: "playing" }, 
                  { timestamp: 1399621661537, data: "playing" }]], 
      ...
    ]
    
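  2. each_cons(2) takes every pair of consecutive elements in the resulting array.
  3. select { |(type, _), _| ... } keeps only the pairs whose first element is 'buffering'.
  4. each takes the first buffering event and the first event of the other kind, and prints their timestamps.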
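
The steps above can be traced on a shortened event list. This is a minimal sketch rather than the answer's code: the sample timestamps and the names runs, pairs and periods are made up for illustration, and it collects the pairs into an array instead of printing them.

```ruby
# Made-up events: two buffering events in a row, then two playing events, etc.
events = [
  { timestamp: 1, data: "buffering" },
  { timestamp: 2, data: "buffering" },
  { timestamp: 3, data: "playing" },
  { timestamp: 4, data: "playing" },
  { timestamp: 5, data: "buffering" },
  { timestamp: 6, data: "playing" }
]

# Step 1: group consecutive events that share the same :data value.
runs = events.chunk { |event| event[:data] }.to_a

# Steps 2 and 3: look at neighbouring runs and keep the pairs that
# start with a buffering run.
pairs = runs.each_cons(2).select { |(type, _), _| type == "buffering" }

# Step 4: first event of the buffering run, first event of the next run.
periods = pairs.map do |(_, buffering_run), (_, next_run)|
  [buffering_run.first[:timestamp], next_run.first[:timestamp]]
end
```

A trailing buffering run has no following run, so each_cons(2) never yields it as the first element of a pair and it is silently skipped.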

Answer 2 (score: 1)

Here is another way to do this using Enumerable#chunk.

Code

def pair_em(events)
  return [] if events.empty?
  chunks = events.chunk { |e| e[:data] == 'b' }.to_a
  chunks.shift unless chunks.first.first # drop any leading p's
  return [] if chunks.empty?
  chunks.pop if chunks.last.first         # drop any trailing b's
  chunks.each_slice(2).map { |(_,b),(_,p)| [b.last[:ts],p.first[:ts]] }
end

Examples

events = [
  { ts:  0, data: "p" },
  { ts:  1, data: "p" },
  { ts:  2, data: "b" },
  { ts:  3, data: "b" },
  { ts:  4, data: "p" },
  { ts:  5, data: "b" },
  { ts:  6, data: "p" },
  { ts:  7, data: "p" },
  { ts:  8, data: "b" },
  { ts:  9, data: "p" },
  { ts: 10, data: "b" }
]

pair_em(events)                 #=> [[3, 4], [5, 6], [8, 9]]
pair_em([])                     #=> []
pair_em([{ ts: 3, data: "b" }]) #=> []

Explanation

Assume the array events is as given above.

chunks = events.chunk { |e| e[:data] == 'b' }.to_a
  #=> [[false, [{:ts=>0, :data=>"p"}, {:ts=>1, :data=>"p"}]],
  #    [true,  [{:ts=>2, :data=>"b"}, {:ts=>3, :data=>"b"}]],
  #    [false, [{:ts=>4, :data=>"p"}]],
  #    [true,  [{:ts=>5, :data=>"b"}]],
  #    [false, [{:ts=>6, :data=>"p"}, {:ts=>7, :data=>"p"}]],
  #    [true,  [{:ts=>8, :data=>"b"}]],
  #    [false, [{:ts=>9, :data=>"p"}]],
  #    [true, [{:ts=>10, :data=>"b"}]]]

Remove the first element of chunks if it corresponds to p's:

chunks.shift unless chunks.first.first
  #=> [false, [{:ts=>0, :data=>"p"}, {:ts=>1, :data=>"p"}]]
chunks
  #=> [[true,  [{:ts=>2, :data=>"b"}, {:ts=>3, :data=>"b"}]],
  #    [false, [{:ts=>4, :data=>"p"}]],
  #    [true,  [{:ts=>5, :data=>"b"}]],
  #    [false, [{:ts=>6, :data=>"p"}, {:ts=>7, :data=>"p"}]],
  #    [true,  [{:ts=>8, :data=>"b"}]],
  #    [false, [{:ts=>9, :data=>"p"}]],
  #    [true, [{:ts=>10, :data=>"b"}]]]

chunks is not empty, so continue:

return [] if chunks.empty?

Remove the last element of chunks if it corresponds to b's:

chunks.pop if chunks.last.first
  #=> [true, [{:ts=>10, :data=>"b"}]]
chunks
  #=> [[true,  [{:ts=>2, :data=>"b"}, {:ts=>3, :data=>"b"}]],
  #    [false, [{:ts=>4, :data=>"p"}]],
  #    [true,  [{:ts=>5, :data=>"b"}]],
  #    [false, [{:ts=>6, :data=>"p"}, {:ts=>7, :data=>"p"}]],
  #    [true,  [{:ts=>8, :data=>"b"}]],
  #    [false, [{:ts=>9, :data=>"p"}]]]

enum = chunks.each_slice(2)
  #=> #<Enumerator: ... :each_slice(2)>

Convert enum to an array to see the values it will pass to its block:

enum.to_a
  #=> [[[true,  [{:ts=>2, :data=>"b"}, {:ts=>3, :data=>"b"}]],
  #     [false, [{:ts=>4, :data=>"p"}]]],
  #    [[true,  [{:ts=>5, :data=>"b"}]],
  #     [false, [{:ts=>6, :data=>"p"}, {:ts=>7, :data=>"p"}]]],
  #    [[true, [{:ts=>8, :data=>"b"}]],
  #     [false, [{:ts=>9, :data=>"p"}]]]]

Map the elements of enum to the desired values:

enum.map { |(_,b),(_,p)| [b.last[:ts],p.first[:ts]] }
  #=> [[3, 4], [5, 6], [8, 9]]

The first value that enum passes to its block,

[[true,  [{:ts=>2, :data=>"b"}, {:ts=>3, :data=>"b"}]],
 [false, [{:ts=>4, :data=>"p"}]]]

assigns the following values to the block variables:

_ => true (had a variable been used instead of the placeholder)
b => [{:ts=>2, :data=>"b"}, {:ts=>3, :data=>"b"}]
_ => false (had a variable been used instead of the placeholder)
p => [{:ts=>4, :data=>"p"}]

The first value that enum passes to its block is therefore mapped to:

[b.last[:ts],p.first[:ts]] #=> [3, 4]
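
Note that pair_em pairs the last buffering event of each run with the first playing event, whereas the question asks for the first buffering event of a run. A possible variant, sketched under that reading of the question, takes b.first instead. The name pair_em_first is made up here, and the emptiness checks are folded into the shift and pop lines.

```ruby
# Hypothetical variant of pair_em: uses the FIRST buffering event of
# each run as the start of the period.
def pair_em_first(events)
  chunks = events.chunk { |e| e[:data] == 'b' }.to_a
  chunks.shift if chunks.any? && !chunks.first.first # drop any leading p's
  chunks.pop   if chunks.any? && chunks.last.first   # drop any trailing b's
  chunks.each_slice(2).map { |(_, b), (_, p)| [b.first[:ts], p.first[:ts]] }
end

events = [
  { ts:  0, data: "p" },
  { ts:  1, data: "p" },
  { ts:  2, data: "b" },
  { ts:  3, data: "b" },
  { ts:  4, data: "p" },
  { ts:  5, data: "b" },
  { ts:  6, data: "p" },
  { ts:  7, data: "p" },
  { ts:  8, data: "b" },
  { ts:  9, data: "p" },
  { ts: 10, data: "b" }
]

pair_em_first(events) #=> [[2, 4], [5, 6], [8, 9]]
```

Only the first pair differs from the pair_em result, because that is the only buffering run with more than one event.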