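I am looking for a Ruby collection that supports left-to-right (left -> right) indexing on strings, similar to the string index of any database. The goal is to quickly retrieve strings by their prefix. I know this can be done by hand with a tree, but I am looking for a built-in Ruby way...
For example, given a collection containing the word "tomato", a search for "tom" would yield that word without a full scan of the collection.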
Answer 0 (score: 1)
Well, there is abbrev:
require 'abbrev'
wordlist = [
"smooth", "snail", "sneak", "snooze", "snore", "snow", "snowball",
"snowflake", "snowman", "soak", "soap", "sofa", "soil", "someone", "somewhere"
].abbrev
This results in the hash:
{"smoot"=>"smooth", "smoo"=>"smooth", "smo"=>"smooth", "sm"=>"smooth",
"snai"=>"snail", "sna"=>"snail", "snea"=>"sneak", "sne"=>"sneak",
"snooz"=>"snooze", "snoo"=>"snooze", "snor"=>"snore", "snowbal"=>"snowball",
"snowba"=>"snowball", "snowb"=>"snowball", "snowflak"=>"snowflake",
"snowfla"=>"snowflake", "snowfl"=>"snowflake", "snowf"=>"snowflake",
"snowma"=>"snowman", "snowm"=>"snowman", "sof"=>"sofa", "soi"=>"soil",
"someon"=>"someone", "someo"=>"someone", "somewher"=>"somewhere",
"somewhe"=>"somewhere", "somewh"=>"somewhere", "somew"=>"somewhere",
"smooth"=>"smooth", "snail"=>"snail", "sneak"=>"sneak", "snooze"=>"snooze",
"snore"=>"snore", "snow"=>"snow", "snowball"=>"snowball", "snowflake"=>"snowflake",
"snowman"=>"snowman", "soak"=>"soak", "soap"=>"soap", "sofa"=>"sofa",
"soil"=>"soil", "someone"=>"someone", "somewhere"=>"somewhere"}
Answer 1 (score: 0)
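How about keeping a presorted list, so that the starting point for the comparison can be found efficiently, and then doing a regex match on just a few words?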
# Using steenslag's list
$list = %w[
smooth snail sneak snooze snore snow snowball
snowflake snowman soak soap sofa soil someone somewhere
].sort!
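def left_match str
  # Find the first word that sorts at or after str; give up if there is none.
  return [] unless i = $list.index{|w| str <= w}
  matches = []
  # Escape str so that regex metacharacters in it are matched literally.
  re = /\A#{Regexp.escape(str)}/
  # Collect consecutive words for as long as they start with str.
  while w = $list[i] and w =~ re
    matches.push(w)
    i += 1
  end
  matches
end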
For example:
p left_match("snow")
will return
["snow", "snowball", "snowflake", "snowman"]
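Here, index is used to locate "snow" in the sorted list, and only five regex matches are attempted (four succeed, one fails), which should not be much of a load. The number of regex matches depends only on how many words share the prefix, not on the size of the list. The index call itself, however, still scans the list from the beginning until it reaches the first candidate.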
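As a possible variation (an assumption on my part, not the code from this answer), the starting point in the sorted list can be found by binary search with Array#bsearch_index (available since Ruby 2.3), and String#start_with? can stand in for the regex. The names left_match_bsearch and SORTED_WORDS below are hypothetical.

```ruby
# The list must be sorted for binary search to work.
SORTED_WORDS = %w[
  smooth snail sneak snooze snore snow snowball
  snowflake snowman soak soap sofa soil someone somewhere
].sort.freeze

# Hypothetical variant of left_match: binary search for the first word
# that sorts at or after the prefix, then walk forward while words match.
def left_match_bsearch(prefix, sorted = SORTED_WORDS)
  start = sorted.bsearch_index { |word| word >= prefix }
  return [] if start.nil?

  matches = []
  i = start
  while i < sorted.length && sorted[i].start_with?(prefix)
    matches << sorted[i]
    i += 1
  end
  matches
end

p left_match_bsearch("snow")  # => ["snow", "snowball", "snowflake", "snowman"]
p left_match_bsearch("z")     # => []
```

This works because, in a sorted list, all words sharing a prefix sit next to each other, starting at the first word that is not less than the prefix.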