Question

I have an array of hashes whose keys are Date and whose values are Integer. Here is some test code that emulates it.

# Note: i.days comes from ActiveSupport (Rails); outside Rails, require 'date' is also needed.
hashes = 2000.times.map do |i|
  [Date.new(2017) - i.days, rand(100)]
end.to_h

I want to get the values for a specific period. At first I wrote it with Range#include?, but it is slow.

Benchmark.measure do
  hashes.select{|k,v| (Date.new(2012,3,3)..Date.new(2012,6,10)).include?(k)}
end

#<Benchmark::Tms:0x007fd16479bed0 @label="", @real=2.9242447479628026, @cstime=0.0, @cutime=0.0, @stime=0.0, @utime=2.920000000000016, @total=2.920000000000016>

Using simple greater-than and less-than operators, it became about 60 times faster.

Benchmark.measure do
  hashes.select{|k,v| k >= Date.new(2012,3,3) && k <= Date.new(2012,6,10)}
end

#<Benchmark::Tms:0x007fd162b61670 @label="", @real=0.05436371313408017, @cstime=0.0, @cutime=0.0, @stime=0.0, @utime=0.05000000000001137, @total=0.05000000000001137>

I thought these two expressions were basically the same.

Why is there such a big difference?

Answer 1

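You need to use Range#cover? instead of Range#include?, and to build the range just once rather than once for each element of hashes. cover? compares the block variable k with the end points of the range. include? (for non-numeric objects, such as dates) compares each element of the range with the block variable until it finds a match or runs out of elements without one (similar to Array#include?).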
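To make the difference concrete, here is a minimal sketch in plain Ruby. The constant PERIOD and the two helper methods are names made up for this illustration, not part of the question's code. The string case at the end is the one the Ruby documentation uses to show that the two methods are not always interchangeable.

```ruby
require 'date'

# Hypothetical period, used only for this illustration.
PERIOD = Date.new(2012, 3, 3)..Date.new(2012, 6, 10)

# cover? only compares the argument with the two end points.
def in_period_by_cover?(date)
  PERIOD.cover?(date)
end

# include? on a Date range walks the range one day at a time
# until it finds an equal element or runs out of elements.
def in_period_by_include?(date)
  PERIOD.include?(date)
end

inside  = Date.new(2012, 4, 1)
outside = Date.new(2013, 1, 1)

puts in_period_by_cover?(inside)     # true
puts in_period_by_include?(inside)   # true
puts in_period_by_cover?(outside)    # false
puts in_period_by_include?(outside)  # false

# "cc" sorts between "a" and "z", but it is not one of the
# elements "a", "b", ..., "z" that the range iterates over.
puts(('a'..'z').cover?('cc'))    # true
puts(('a'..'z').include?('cc'))  # false
```

For whole dates the two methods agree, so swapping include? for cover? changes only how much work is done per call.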

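Additionally, you want to look at the first and only key of each element of hashes (each element being a hash), so if that hash is h, its first key-value pair is h.first and the key of that pair is h.first.first.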

In terms of execution speed, this should be nearly the same as your second method.

Example

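require 'date'
require 'benchmark'

# Assumes hashes is an array of single-pair hashes, as the question describes.
Benchmark.measure do
  r = Date.new(2012,3,3)..Date.new(2012,6,10)  # build the range once
  hashes.select{|h| r.cover? h.first.first }
end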
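The question's test code actually builds a single Hash keyed by Date (because of to_h) rather than an array of hashes. For that shape, a sketch along the following lines may be closer to what is needed. The method names build_hashes, select_by_cover and select_by_comparison are hypothetical, and the data is rebuilt without ActiveSupport by relying on Date minus an Integer subtracting days.

```ruby
require 'date'
require 'benchmark'

# Rebuild the test data in plain Ruby: Date - Integer subtracts days.
def build_hashes(count = 2000)
  count.times.map { |i| [Date.new(2017) - i, rand(100)] }.to_h
end

# The range is built once by the caller; each key is tested with cover?.
def select_by_cover(hashes, range)
  hashes.select { |date, _value| range.cover?(date) }
end

# The comparison-operator version from the question, with the
# end points created once instead of inside the block.
def select_by_comparison(hashes, first, last)
  hashes.select { |date, _value| date >= first && date <= last }
end

hashes = build_hashes
first  = Date.new(2012, 3, 3)
last   = Date.new(2012, 6, 10)

by_cover      = select_by_cover(hashes, first..last)
by_comparison = select_by_comparison(hashes, first, last)

# Both approaches should pick exactly the same entries.
puts by_cover == by_comparison

# Timing for the cover? version; the number depends on the machine.
puts Benchmark.realtime { select_by_cover(hashes, first..last) }
```

Creating the end points outside the block also spares the comparison version two Date allocations per entry, which the original benchmark included in its timing.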

Why is Range#include? so much slower than the greater-than and less-than operators?
