Question

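I am looking for a Ruby collection that supports left->right indexing on strings, similar to what any database string index provides. The goal is to retrieve strings quickly by their prefix. I know this can be done by hand with a tree, but I am looking for a built-in Ruby way...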

For example, given a collection that contains the word "tomato", a search for "tom" would yield that word without a full scan of the collection.

Answer 1

Well, there is abbrev:

require 'abbrev'
wordlist = [
"smooth", "snail", "sneak", "snooze", "snore", "snow", "snowball",
"snowflake", "snowman", "soak", "soap", "sofa", "soil", "someone", "somewhere"
].abbrev

which results in this hash:

{"smoot"=>"smooth", "smoo"=>"smooth", "smo"=>"smooth", "sm"=>"smooth",
"snai"=>"snail", "sna"=>"snail", "snea"=>"sneak", "sne"=>"sneak",
"snooz"=>"snooze", "snoo"=>"snooze", "snor"=>"snore", "snowbal"=>"snowball",
"snowba"=>"snowball", "snowb"=>"snowball", "snowflak"=>"snowflake",
"snowfla"=>"snowflake", "snowfl"=>"snowflake", "snowf"=>"snowflake",
"snowma"=>"snowman", "snowm"=>"snowman", "sof"=>"sofa", "soi"=>"soil",
"someon"=>"someone", "someo"=>"someone", "somewher"=>"somewhere",
"somewhe"=>"somewhere", "somewh"=>"somewhere", "somew"=>"somewhere",
"smooth"=>"smooth", "snail"=>"snail", "sneak"=>"sneak", "snooze"=>"snooze",
"snore"=>"snore", "snow"=>"snow", "snowball"=>"snowball", "snowflake"=>"snowflake",
"snowman"=>"snowman", "soak"=>"soak", "soap"=>"soap", "sofa"=>"sofa", 
"soil"=>"soil", "someone"=>"someone", "somewhere"=>"somewhere"}
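
As the output shows, abbrev keeps only unambiguous abbreviations, so a shared prefix such as "sno" or "so" is not a key. If every word for a given prefix is wanted, a similar hash can be built by hand. A minimal sketch, assuming the extra memory for storing every prefix is acceptable (build_prefix_index is a hypothetical helper name, not part of abbrev):

```ruby
# Hypothetical helper (not part of abbrev): maps every prefix of every word
# to the list of words that start with that prefix.
def build_prefix_index(words)
  index = Hash.new { |hash, key| hash[key] = [] }
  words.each do |word|
    (1..word.length).each { |len| index[word[0, len]] << word }
  end
  index
end

prefix_index = build_prefix_index(%w[snow snowball snowman soak soap tomato])

# fetch with a default avoids creating empty entries for unknown prefixes.
p prefix_index.fetch("tom", [])   # => ["tomato"]
p prefix_index.fetch("snow", [])  # => ["snow", "snowball", "snowman"]
p prefix_index.fetch("xyz", [])   # => []
```

Lookups are then a single hash access, at the cost of one entry per distinct prefix.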

Answer 2

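How about sorting the list up front, so that the starting point for the comparison can be found efficiently, and then running a regex match against just a few words?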

# Using steenslag's list
$list = %w[
    smooth snail sneak snooze snore snow snowball
    snowflake snowman soak soap sofa soil someone somewhere
].sort!

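def left_match str
    # Find the first word that sorts at or after the prefix; give up if there is none.
    return [] unless i = $list.index{|w| str <= w}
    matches = []
    # Escape the prefix so characters such as "." or "*" are matched literally.
    re = /\A#{Regexp.escape(str)}/
    # Matching words are contiguous in the sorted list, so stop at the first miss.
    while w = $list[i] and w =~ re
        matches.push(w)
        i += 1
    end
    matches
end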

This example:

p left_match("snow")

will return

["snow", "snowball", "snowflake", "snowman"]

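Here, index is used to locate "snow" in the sorted list, and only five regex matches are attempted (four succeed, one fails), which should not be much of a load. The number of regex matches does not depend on the size of the list.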
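
One caveat: Array#index with a block walks the array from the start, so locating the starting point is still linear in the list size. On a sorted array, a binary search can do that step in logarithmic time. A rough sketch of that variation, assuming Ruby 2.3 or later for Array#bsearch_index (prefix_matches and WORDS are names made up for this illustration, and start_with? is swapped in for the regex):

```ruby
# Sorted word list; the binary search below relies on this ordering.
WORDS = %w[
  smooth snail sneak snooze snore snow snowball
  snowflake snowman soak soap sofa soil someone somewhere
].sort.freeze

# Hypothetical helper: returns all words in sorted_words that start with prefix.
def prefix_matches(sorted_words, prefix)
  # Binary search for the first word that sorts at or after the prefix.
  i = sorted_words.bsearch_index { |word| word >= prefix }
  return [] unless i

  matches = []
  # Words sharing the prefix are contiguous, so stop at the first miss.
  while i < sorted_words.length && sorted_words[i].start_with?(prefix)
    matches << sorted_words[i]
    i += 1
  end
  matches
end

p prefix_matches(WORDS, "snow")  # => ["snow", "snowball", "snowflake", "snowman"]
p prefix_matches(WORDS, "tom")   # => []
```

The work after the search is proportional to the number of matches plus one failed comparison, as in the regex version above.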
