Transliteration and fuzzy search, like Google Suggest

Time: 2013-05-27 11:03:26

Tags: search fuzzy-search transliteration magicsuggest google-suggest

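I need to run a fuzzy search on transliterated input. For example:

I have an ASP.NET application whose database has a table containing a list of Spanish words (200,000 entries), and a page with an input field. The catch is that I don't speak Spanish: I don't know how a word is spelled, only how it sounds. So in the text box I type the search term, for example the word for "beautiful", misspelled by ear as "prekieso", and I need to get the correct form from the database: "precioso".

How can I implement this? In other words, I need something like Google Suggest...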

2 answers:

Answer 0: (score: 0)

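I think what you need is a spell-checking feature, for example: http://www.codeproject.com/KB/string/netspell.aspx

Google-like functionality is more advanced, though, and not easy to implement: How does the Google "Did you mean?" Algorithm work?

Hope this helps.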
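As a rough illustration of the spell-check idea, here is a minimal sketch in Python using the standard library's `difflib.get_close_matches`. The `WORDS` list is a hypothetical stand-in for the database table, and the `suggest` helper and its `cutoff` value are assumptions made for this example, not part of NetSpell or of Google's algorithm.

```python
import difflib

# Hypothetical stand-in for the 200,000-row word table.
WORDS = ["precioso", "preciso", "precio", "queso", "hermoso"]

def suggest(query, words=WORDS, limit=3, cutoff=0.6):
    """Return up to `limit` words whose similarity ratio to `query` is at least `cutoff`."""
    return difflib.get_close_matches(query.lower(), words, n=limit, cutoff=cutoff)

suggestions = suggest("prekieso")
print(suggestions)  # "precioso" is among the suggestions, though not necessarily first
```

Pure string similarity returns the intended word together with other near misses, so a real implementation would probably still need ranking by word frequency or by some notion of pronunciation.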

Answer 1: (score: 0)

A user-defined function (T-SQL) that computes a Levenshtein-style edit distance:

USE [dbname] -- placeholder: replace with your database name
GO
/****** Object:  UserDefinedFunction [dbo].[levenshtein]    Script Date: 05/27/2013 17:54:05 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- Use ALTER FUNCTION instead if dbo.levenshtein already exists.
CREATE FUNCTION [dbo].[levenshtein](@left varchar(100), @right varchar(100)) 
   returns int
as
BEGIN
   DECLARE @difference int, @lenRight int, @lenLeft int, @leftIndex int, @rightIndex int, @left_char char(1), @right_char char(1), @compareLength int 
   SET @lenLeft = LEN(@left) 
   SET @lenRight = LEN(@right) 
   SET @difference = 0  
   IF @lenLeft = 0 
   BEGIN
      SET @difference = @lenRight 
      GOTO done 
   END
   If @lenRight = 0 
   BEGIN
      SET @difference = @lenLeft 
      GOTO done 
   END 
   GOTO comparison  

   comparison: 
   IF (@lenLeft >= @lenRight) 
      SET @compareLength = @lenLeft 
   Else
      SET @compareLength = @lenRight  
   SET @rightIndex = 1 
   SET @leftIndex = 1 
   WHILE @leftIndex <= @compareLength 
   BEGIN
      SET @left_char = substring(@left, @leftIndex, 1)
      SET @right_char = substring(@right, @rightIndex, 1)
      IF @left_char <> @right_char 
      BEGIN
         -- Would an insertion make them re-align? 
         IF (@left_char = substring(@right, @rightIndex+1, 1))
            SET @rightIndex = @rightIndex + 1 
         -- Would a deletion make them re-align? 
         ELSE IF (substring(@left, @leftIndex+1, 1) = @right_char)
            SET @leftIndex = @leftIndex + 1
         -- Every mismatch counts as one edit, whichever branch ran above.
         SET @difference = @difference + 1 
      END
      SET @leftIndex = @leftIndex + 1 
      SET @rightIndex = @rightIndex + 1 
   END 
   GOTO done  

   done: 
      RETURN @difference 
END
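Note that the function above walks both strings once and tries to re-align after each mismatch, so it appears to approximate the distance rather than compute it exactly. For comparison, here is a minimal sketch of the textbook dynamic-programming Levenshtein distance in Python. It is an illustration only, not a translation of the T-SQL, and the name `levenshtein` is reused purely for readability.

```python
def levenshtein(left: str, right: str) -> int:
    """Classic edit distance: insert, delete and substitute each cost 1."""
    if len(left) < len(right):
        left, right = right, left  # keep the shorter string in the inner loop
    # previous[j] = distance between the processed prefix of left and right[:j]
    previous = list(range(len(right) + 1))
    for i, left_char in enumerate(left, start=1):
        current = [i]  # distance from left[:i] to the empty string
        for j, right_char in enumerate(right, start=1):
            cost = 0 if left_char == right_char else 1
            current.append(min(
                previous[j] + 1,         # delete left_char
                current[j - 1] + 1,      # insert right_char
                previous[j - 1] + cost,  # substitute, or match for free
            ))
        previous = current
    return previous[-1]

print(levenshtein("prekieso", "precioso"))  # 2 (k -> c, e -> o)
print(levenshtein("fuzzy", "fuzy"))         # 1 (one deletion)
```

The same recurrence can be written in T-SQL with a table variable or a string used as the row buffer, at the cost of more code than the single-pass version.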

Usage:

SELECT
 dbo.levenshtein('Fuzzy String Match','fuzzy string match'),
 dbo.levenshtein('fuzzy','fuzy'),
 dbo.levenshtein('Fuzzy String Match','fuzy string match'),
 dbo.levenshtein('levenshtein distance sql','levenshtein sql server'),
 dbo.levenshtein('distance','server')

SELECT [Name]
FROM [tempdb].[dbo].[Names]
WHERE dbo.levenshtein([Name],'bozhestvennia') <= 3
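The last query keeps every row within distance 3 but does not rank them, and it calls the function once per row. The sketch below, in Python, shows one way the same filter could be turned into ranked suggestions. It skips candidates whose length differs from the query by more than the threshold (two strings within distance k can differ in length by at most k) and sorts the rest by distance. The `suggest` and `edit_distance` names, the `limit` parameter and the sample word list are all assumptions made for this example.

```python
def edit_distance(a, b):
    """Textbook Levenshtein distance, computed one row at a time."""
    row = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        new_row = [i]
        for j, cb in enumerate(b, start=1):
            new_row.append(min(row[j] + 1, new_row[j - 1] + 1, row[j - 1] + (ca != cb)))
        row = new_row
    return row[-1]

def suggest(query, words, max_distance=3, limit=5):
    """Return up to `limit` words within `max_distance` edits of `query`, closest first."""
    query = query.lower()
    scored = []
    for word in words:
        # Cheap prefilter: a length gap above max_distance rules the word out.
        if abs(len(word) - len(query)) > max_distance:
            continue
        distance = edit_distance(query, word.lower())
        if distance <= max_distance:
            scored.append((distance, word))
    scored.sort()  # by distance, then alphabetically
    return [word for _, word in scored[:limit]]

# Hypothetical sample data standing in for the Names table.
names = ["precioso", "precio", "queso", "preciosamente"]
print(suggest("prekieso", names))  # ['precioso', 'precio']
```

In SQL the equivalent would be an extra length condition in the `WHERE` clause and an `ORDER BY` on the distance.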