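I have two huge arrays of sentences, one in German and one in English. I search the German sentences for those containing a given word, and for each hit I check whether there is a corresponding English sentence (using a hash that holds the link information). However, if the user is looking for a very common word, I don't want to return every sentence that contains it, only the first x matches, and then stop searching.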
If I use german_sentences.index { |sentence| sentence.include?(word) }, I only get one match at a time.
If I use german_sentences.keep_if { |sentence| sentence.include?(word) }, I get all matches, but I also lose the index information, which is essential here.
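I'm currently using a custom loop with each_with_index that breaks once the maximum is reached, but I really feel I must be missing some existing solution, at least one that returns a limited number of matches (even if not their indices)...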
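For reference, the loop described above might look roughly like the sketch below. The method name first_match_indices and the sample sentences are made up for illustration and are not from my actual code.

```ruby
# Hypothetical helper sketching the each_with_index + break approach.
# Returns the indices of at most `max` sentences that contain `word`.
def first_match_indices(sentences, word, max)
  indices = []
  return indices if max <= 0
  sentences.each_with_index do |sentence, i|
    next unless sentence.include?(word)
    indices << i
    break if indices.size >= max # stop scanning once we have enough
  end
  indices
end

# Made-up sample data.
german_sentences = ["Das ist ein Haus", "Ich bin hier", "Das Haus ist alt", "Ein Haus am See"]
first_match_indices(german_sentences, "Haus", 2) # => [0, 2]
```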
Answer 0 (score: 4)
german_sentences
  .each_index
  .lazy
  .select { |i| german_sentences[i].include?(word) }
  .first(n)
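A small sketch of how this could be used, assuming made-up sample data and a hypothetical links hash that maps a German index to an English index (the question does not show the real structure). The checked counter is only there to illustrate that the lazy chain stops scanning once n matches have been found.

```ruby
# Made-up sample data; `links` is a hypothetical German-index => English-index table.
german_sentences  = ["Das ist ein Haus", "Ich bin hier", "Das Haus ist alt", "Ein Haus am See"]
english_sentences = ["That is a house", "I am here", "The house is old", "A house by the lake"]
links = { 0 => 0, 1 => 1, 2 => 2, 3 => 3 }

word = "Haus"
n = 2
checked = 0 # counts how many sentences were actually examined

indices = german_sentences
  .each_index
  .lazy
  .select { |i| checked += 1; german_sentences[i].include?(word) }
  .first(n)

# Use the retained indices to pair each German hit with its English counterpart.
pairs = indices.map { |i| [german_sentences[i], english_sentences[links[i]]] }

indices # => [0, 2]
checked # => 3 (the fourth sentence is never inspected)
```

Without .lazy, select would scan the whole array before first(n) trims the result, so the early stop comes from the lazy enumerator.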
Answer 1 (score: 1)
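If this isn't a one-off need, you could use Module#refine (rather than monkeypatching Array). refine was added experimentally in v2.0 and then changed considerably in v2.1. One restriction on the use of refinements is: "You may only activate refinements at top-level...", which apparently prevents testing in Pry and IRB.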
module M
  refine Array do
    # Indices of the first n elements for which the block returns true.
    def select_indices_first(n)
      i = 0
      k = 0
      a = []
      return a if n == 0
      each { |x| (a << i; k += 1) if yield(x); break if k == n; i += 1 }
      a
    end
    def select_first(n) # if you wanted this also...
      k = 0
      a = []
      return a if n == 0
      each { |x| (a << x; k += 1) if yield(x); break if k == n }
      a
    end
  end
end
using M
sentences = ["How now brown", "Cat", "How to guide", "How to shop"]
sentences.select_indices_first(0) {|s| s.include?("How")} # => []
sentences.select_indices_first(1) {|s| s.include?("How")} # => [0]
sentences.select_indices_first(2) {|s| s.include?("How")} # => [0, 2]
sentences.select_indices_first(3) {|s| s.include?("How")} # => [0, 2, 3]
sentences.select_indices_first(99) {|s| s.include?("How")} # => [0, 2, 3]
sentences.select_first(2) {|s| s.include?("How")}
# => ["How now brown", "How to guide"]