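What is the fastest way to calculate a per-second rolling average of an array in Ruby?
I have two arrays of data from a bike ride. time holds the moment during the ride at which the corresponding speed value was read. You'll notice that there isn't a reading for every second. For that reason, I don't believe I can simply advance the rolling window by one array element at a time.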
speed = [0, 15, 17, 19, 18, 22, 24, 28, 22, 17, 16, 14, 15, 15, 15, 0, 15, 19, 21, 25, 26, 24, 24]
time = [0, 1, 2, 3, 5, 6, 7, 8, 10, 11, 12, 13, 15, 16, 17, 18, 20, 21, 22, 23, 25, 26, 27]
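I tried something like the following (it calculates the rolling 5-second average and puts it into an array), but it is quite slow for large arrays and multiple intervals (it takes 8 minutes to calculate all the intervals, 1..3600, for a 1-hour bike ride):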
duration = time.max
interval_average = []
time_hash = Hash[time.map.with_index.to_a]  # maps each timestamp to its index in speed
roll_start = 0
roll_stop = 5
for i in 1..(duration+1) do
  # time_hash returns nil for any second that has no reading
  start = time_hash[roll_start]
  stop = time_hash[roll_stop]
  rolling_array = speed[start..stop]
  avg_value = mean(rolling_array)  # mean: averaging helper, not shown
  interval_average.push(avg_value)
  roll_start += 1
  roll_stop += 1
end
In my own code I ignore the exceptions and push 0 instead, because I'm really only interested in finding the maximum of the x-second averages.
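For reference, a self-contained version of the loop above might look like the sketch below. It assumes mean is a plain arithmetic mean (the helper is not shown above) and replaces the ignored exceptions with an explicit nil check that pushes 0. The names window and time_index are introduced only for this sketch.

```ruby
speed = [0, 15, 17, 19, 18, 22, 24, 28, 22, 17, 16, 14, 15, 15, 15, 0, 15, 19, 21, 25, 26, 24, 24]
time = [0, 1, 2, 3, 5, 6, 7, 8, 10, 11, 12, 13, 15, 16, 17, 18, 20, 21, 22, 23, 25, 26, 27]

# Assumed helper: arithmetic mean of an array of numbers.
def mean(values)
  values.sum.to_f / values.size
end

window = 5 # seconds
time_index = time.each_with_index.to_h # timestamp => index into speed

interval_average = (0..time.max).map do |roll_start|
  start = time_index[roll_start]
  stop = time_index[roll_start + window]
  # No reading at one edge of the window: push 0 instead of raising.
  next 0 if start.nil? || stop.nil?
  mean(speed[start..stop])
end

puts interval_average.max
```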
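Answer 0 (score: 0)
I'm not sure how it performs, but here is another approach you could test for finding the maximum rolling average over a fixed length of time.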
speed = [0, 15, 17, 19, 18, 22, 24, 28, 22, 17, 16, 14, 15, 15, 15, 0, 15, 19, 21, 25, 26, 24, 24]
time = [0, 1, 2, 3, 5, 6, 7, 8, 10, 11, 12, 13, 15, 16, 17, 18, 20, 21, 22, 23, 25, 26, 27]
interval_length = 5 # seconds
speed.zip(time) # 1
  .each_cons(interval_length) # 2
  .select { |i| i.last.last - i.first.last == interval_length } # 3
  .map { |i| i.map(&:first).reduce(:+) / interval_length.to_f } # 4
  .max # 5
Breaking it down into its components, with intermediate results:
1) Pair each speed reading with the time at which it was taken.
# => [[0, 0], [15, 1], [17, 2], [19, 3], [18, 5], [22, 6], [24, 7], [28, 8], [22, 10], [17, 11], [16, 12], [14, 13], [15, 15], [15, 16], [15, 17], [0, 18], [15, 20], [19, 21], [21, 22], [25, 23], [26, 25], [24, 26], [24, 27]]
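2) Split the above into consecutive runs of interval_length elements, in this case 5. This gives you an Enumerator object, but with to_a we can see that the intermediate result looks like this: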
# => [[[0, 0], [15, 1], [17, 2], [19, 3], [18, 5]], [[15, 1], [17, 2], [19, 3], [18, 5], [22, 6]], [[17, 2], [19, 3], [18, 5], [22, 6], [24, 7]], [[19, 3], [18, 5], [22, 6], [24, 7], [28, 8]], [[18, 5], [22, 6], [24, 7], [28, 8], [22, 10]], [[22, 6], [24, 7], [28, 8], [22, 10], [17, 11]], [[24, 7], [28, 8], [22, 10], [17, 11], [16, 12]], [[28, 8], [22, 10], [17, 11], [16, 12], [14, 13]], [[22, 10], [17, 11], [16, 12], [14, 13], [15, 15]], [[17, 11], [16, 12], [14, 13], [15, 15], [15, 16]], [[16, 12], [14, 13], [15, 15], [15, 16], [15, 17]], [[14, 13], [15, 15], [15, 16], [15, 17], [0, 18]], [[15, 15], [15, 16], [15, 17], [0, 18], [15, 20]], [[15, 16], [15, 17], [0, 18], [15, 20], [19, 21]], [[15, 17], [0, 18], [15, 20], [19, 21], [21, 22]], [[0, 18], [15, 20], [19, 21], [21, 22], [25, 23]], [[15, 20], [19, 21], [21, 22], [25, 23], [26, 25]], [[19, 21], [21, 22], [25, 23], [26, 25], [24, 26]], [[21, 22], [25, 23], [26, 25], [24, 26], [24, 27]]]
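3) Since you don't have a reading for every second, some of these runs of speed values may span time intervals that aren't actually interval_length seconds long, so we restrict the calculation to the runs that are. For 5 seconds it just so happens that no data needs to be discarded, and the intermediate result is the same as in step 2.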
4) Now we can take the rolling averages:
# => [13.8, 18.2, 20.0, 22.2, 22.8, 22.6, 21.4, 19.4, 16.8, 15.4, 15.0, 11.8, 12.0, 12.8, 14.0, 16.0, 21.2, 23.0, 24.0]
5) And the maximum:
# => 24.0
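To see step 3 actually discarding runs, here is a small sketch that reruns steps 1 to 3 with a 3-second interval instead of 5. The names runs and kept are introduced here for illustration only.

```ruby
speed = [0, 15, 17, 19, 18, 22, 24, 28, 22, 17, 16, 14, 15, 15, 15, 0, 15, 19, 21, 25, 26, 24, 24]
time = [0, 1, 2, 3, 5, 6, 7, 8, 10, 11, 12, 13, 15, 16, 17, 18, 20, 21, 22, 23, 25, 26, 27]
interval_length = 3 # seconds

# Steps 1 and 2: pair readings with timestamps, then take runs of 3 readings.
runs = speed.zip(time).each_cons(interval_length).to_a

# Step 3: keep only the runs whose first and last timestamps are exactly 3 seconds apart.
kept = runs.select { |run| run.last.last - run.first.last == interval_length }

puts "#{kept.size} of #{runs.size} runs kept" # prints "10 of 21 runs kept"
p kept.first                                  # prints [[17, 2], [19, 3], [18, 5]]
```

With this data, runs that cover only 2 seconds (for example the readings at 0, 1 and 2) are dropped, while runs that straddle a missing second (2, 3 and 5) are kept.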
Again, I'm not sure how this will perform on a very large dataset, but it may be worth a try.
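If the chained version turns out to be too slow, one alternative worth testing is a single pass with two indices over the timestamps. The sketch below is not the approach above. It assumes time is sorted in ascending order, treats a window as every reading taken in the window seconds up to and including the current reading (both ends inclusive, like speed[start..stop] in the question), and skips readings taken before a full window has elapsed. max_rolling_average is a name made up for this example.

```ruby
# Maximum mean speed over any window of `window` seconds that ends at a reading.
# Assumes `time` is sorted ascending and has the same length as `speed`.
def max_rolling_average(speed, time, window)
  left = 0   # index of the oldest reading still inside the window
  sum = 0.0  # running sum of the speeds inside the window
  best = nil

  time.each_index do |right|
    sum += speed[right]
    # Drop readings that are more than `window` seconds older than the current one.
    while time[right] - time[left] > window
      sum -= speed[left]
      left += 1
    end
    # Skip the start of the ride, where less than a full window has elapsed.
    next if time[right] - time.first < window

    avg = sum / (right - left + 1)
    best = avg if best.nil? || avg > best
  end

  best
end

speed = [0, 15, 17, 19, 18, 22, 24, 28, 22, 17, 16, 14, 15, 15, 15, 0, 15, 19, 21, 25, 26, 24, 24]
time = [0, 1, 2, 3, 5, 6, 7, 8, 10, 11, 12, 13, 15, 16, 17, 18, 20, 21, 22, 23, 25, 26, 27]

puts max_rolling_average(speed, time, 5) # prints 24.0

# One result per interval length, as in the question's 1..3600 case.
best_by_interval = (1..time.max).map { |w| [w, max_rolling_average(speed, time, w)] }.to_h
```

Each call walks the readings once, so repeating it for every interval length does work proportional to the number of readings times the number of intervals. On the sample data the 5-second result matches the 24.0 found above, but that agreement is not guaranteed for data with different gaps, because the window here is defined by elapsed time rather than by a fixed count of readings.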