Ruby: fast and fuzzy search of an array with a large number of hashes

Date: 2018-04-20 19:13:14

Tags: arrays ruby sorting search hash

I have an array of hashes like this:

@t = [{"id"=>"819827", "nm"=>"Razvilka", "countryCode"=>"RU"}, 
{"id"=>"524901", "nm"=>"Moscow", "countryCode"=>"RU"}, 
{"id"=>"1271881", "nm"=>"Firozpur Jhirka", "countryCode"=>"IN"}, 
{"id"=>"1283240", "nm"=>"Kathmandu", "countryCode"=>"NP"}] # ... + 100,000 more

I can look up a specific hash by key when the spelling is exact, using something like:
@t.find {|x| x["nm"] == "Moscow"}

and it returns the hash very quickly.

But this does not account for casing, syntax, or approximate matches. How can I do that?

1 answer:

Answer 0 (score: 2)

Try the levenshtein gem: https://rubygems.org/gems/levenshtein

gem install levenshtein

Then, in your code:

require 'levenshtein'

#Levenshtein.distance(a, b) < 5 # some fuzzy level

def find_levenshtein(hash, key, str)
  hash.select do |h|
    Levenshtein.distance(h[key], str) < 5
  end
end

puts find_levenshtein(@t, 'nm', 'moscw').inspect
#=> [{"id"=>"524901", "nm"=>"Moscow", "countryCode"=>"RU"}]
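
The question also asks about casing, which a raw distance check treats as an ordinary edit. A minimal sketch of one way to handle it: downcase both sides before comparing, and return the closest matches first. To keep the example runnable without the gem, it uses a small pure-Ruby distance function; `edit_distance`, `fuzzy_find`, and the `max_distance` parameter are names made up for this illustration, not part of the levenshtein gem.

```ruby
# Pure-Ruby Levenshtein distance (two-row dynamic programming).
# Illustrative stand-in so the sketch runs without the gem.
def edit_distance(a, b)
  return b.length if a.empty?
  return a.length if b.empty?

  prev = (0..b.length).to_a            # distances from "" to each prefix of b
  a.each_char.with_index(1) do |ca, i|
    curr = [i]                         # distance from a[0, i] to ""
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1,            # deletion
               curr[j - 1] + 1,        # insertion
               prev[j - 1] + cost].min # substitution (or match)
    end
    prev = curr
  end
  prev.last
end

# Hypothetical helper: case-insensitive fuzzy search, closest matches first.
# max_distance is an assumed tuning knob; pick it to suit your data.
def fuzzy_find(records, key, query, max_distance: 2)
  needle = query.downcase
  scored = records.map do |h|
    d = edit_distance(h[key].to_s.downcase, needle)
    [d, h] if d <= max_distance
  end
  scored.compact.sort_by { |d, _| d }.map { |_, h| h }
end

cities = [
  { "id" => "819827",  "nm" => "Razvilka",        "countryCode" => "RU" },
  { "id" => "524901",  "nm" => "Moscow",          "countryCode" => "RU" },
  { "id" => "1271881", "nm" => "Firozpur Jhirka", "countryCode" => "IN" },
  { "id" => "1283240", "nm" => "Kathmandu",       "countryCode" => "NP" }
]

p fuzzy_find(cities, "nm", "MOSCW").map { |h| h["nm"] }
# prints ["Moscow"]
```

A tighter threshold than `< 5` is used here on purpose: with 100,000 short names, a distance of 4 is likely to let through many unrelated entries.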

For more information, see https://en.wikipedia.org/wiki/Levenshtein_distance
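
One property of the distance helps with the "fast" part of the question: two strings can never be closer than the difference in their lengths, because every insertion or deletion changes the length by exactly one. That allows a cheap prefilter before any distance is computed. The sketch below shows only that prefilter; `length_candidates` and `max_distance` are hypothetical names for this illustration, and the surviving records would still be passed to `Levenshtein.distance` (or any equivalent) for the real check.

```ruby
# Hypothetical prefilter: keep records whose name length is within
# max_distance of the query length. Anything outside that window
# cannot be within max_distance edits, so it is skipped cheaply.
def length_candidates(records, key, query, max_distance)
  records.select do |h|
    (h[key].to_s.length - query.length).abs <= max_distance
  end
end

cities = [
  { "id" => "819827",  "nm" => "Razvilka",        "countryCode" => "RU" },
  { "id" => "524901",  "nm" => "Moscow",          "countryCode" => "RU" },
  { "id" => "1271881", "nm" => "Firozpur Jhirka", "countryCode" => "IN" },
  { "id" => "1283240", "nm" => "Kathmandu",       "countryCode" => "NP" }
]

survivors = length_candidates(cities, "nm", "moscw", 2)
p survivors.map { |h| h["nm"] }
# prints ["Moscow"]
```

How much this saves depends on how varied the name lengths are and on the threshold; with the answer's `< 5` check (a maximum distance of 4) the window is wide and fewer records are skipped.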