How to implement ranking based on where-clause matches (without a full-text index)

Time: 2012-01-30 20:00:26

Tags: sql-server ranking

I have a search query that is built dynamically using NHibernate's criteria query mechanism. The generated SQL might look like this:

select 
    *
from
    sometable
where
(
    (
        firstname like 'chris%' or
        lastname like 'chris%'
    )
    and
    (
        firstname like 'vann%' or
        lastname like 'vann%'
    )    
)

The data in the table might look like this:

FirstName         LastName
------------------------------
Chris             Smith
John              Vann
Chris             Vann

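I want to order the results so that rows matching both sub-clauses of the where clause (i.e. firstname = Chris and lastname = Vann) rank higher than rows matching only one of them. Is this possible in standard SQL?

Edit: I have greatly simplified the problem to get at the core of the issue.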

2 answers:

Answer 0: (score: 1)

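This is only a starting point. You can create a computed priority column and sort the rows by it. The column indicates how well each row matches. Here is some sample code: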

create table #t (f varchar(10), l varchar(10) );

insert into #t values ('aa','ee'),('aa','ii'),('oo','ee');

select 
   *,
   -- one point for each (search term, column) match
   case when f like 'aa%' then 1 else 0 end +
   case when l like 'aa%' then 1 else 0 end +
   case when f like 'ii%' then 1 else 0 end + 
   case when l like 'ii%' then 1 else 0 end 
   as priority
from #t
order by 
   priority desc

Result:

f  l  priority 
-- -- -------- 
aa ii 2        
aa ee 1        
oo ee 0 

For your schema it might look something like this:

select 
    *,
    case when firstname like 'chris%' and lastname like 'vann%' then 4 else 0 end +
    case when firstname like 'chris%' and lastname not like 'vann%' then 3 else 0 end +
    case when firstname not like 'chris%' and lastname like 'vann%' then 3 else 0 end
    -- + ... (further weighted cases, elided in the original answer)
    as priority
from
    sometable
where
(
    (
        firstname like 'chris%' or
        lastname like 'chris%'
    )
    and
    (
        firstname like 'vann%' or
        lastname like 'vann%'
    )    
)
order by priority desc
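
As a rough sketch of the same idea against the question's sample rows: the table variable @people and the one-point-per-match weights below are assumptions made for illustration, not part of the original answer. The filter is also relaxed to "at least one match" (the question's where clause requires both terms to match), so that partially matching rows remain visible in the ranking.

```sql
-- Hypothetical sample data mirroring the table in the question.
declare @people table (firstname varchar(50), lastname varchar(50));

insert into @people values
    ('Chris', 'Smith'),
    ('John',  'Vann'),
    ('Chris', 'Vann');

select
    p.firstname,
    p.lastname,
    m.priority
from @people as p
cross apply (
    -- One point per (search term, column) hit; the weights are arbitrary.
    select
        case when p.firstname like 'chris%' then 1 else 0 end +
        case when p.lastname  like 'chris%' then 1 else 0 end +
        case when p.firstname like 'vann%'  then 1 else 0 end +
        case when p.lastname  like 'vann%'  then 1 else 0 end as priority
) as m
where m.priority > 0          -- assumption: keep any row with at least one hit
order by m.priority desc;
```

Under a case-insensitive collation, Chris Vann scores 2 and the other two rows score 1, so the full match sorts first. Computing the score once in cross apply lets both the filter and the order by reuse it without repeating the case expressions.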

Answer 1: (score: 0)

Here is a T-SQL ranking I put together that seems to work well.

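  • It uses the DIFFERENCE function to score each search-term/column pair (first-first, first-last, last-first, last-last), then adds weight when the search term occurs as a substring of the first or last name; first-to-first and last-to-last matches are weighted more heavily.
  • It sorts by rows that have an exact substring match, then by the earliest match position, then by the smallest difference between the length of the search string and the length of the first/last name.
  • The weighting factors in TotalRank (* 2, * 4) are arbitrary and only reflect my wish to weight the stronger matches (first-to-first, last-to-last) more heavily.
  • The SQL below has a lot of extra columns that illustrate the components of the TotalRank column. You can obviously remove them.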
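
To make the building blocks in the list above concrete, here is a small sketch of what each built-in function contributes for a single search term. The inline list of names and the column aliases are hypothetical and chosen only for illustration.

```sql
declare @term varchar(50) = 'chris';

select
    n.person,
    soundex(n.person)                                   as PersonSoundex, -- 4-character phonetic code
    difference(n.person, @term)                         as Diff,          -- 0 (weak) to 4 (strong) phonetic similarity
    patindex('%' + @term + '%', n.person)               as MatchIndex,    -- 1-based position of the term, 0 if absent
    convert(bit, patindex('%' + @term + '%', n.person)) as HasMatch       -- any nonzero position becomes 1
from (values ('Chris'), ('Christopher'), ('Kris'), ('Vann')) as n(person);
```

DIFFERENCE compares the SOUNDEX codes of its two arguments, so it rewards names that sound alike even when no substring matches, while PATINDEX catches literal occurrences of the term.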

DECLARE @searchFirst varchar(max) = 'chris';
DECLARE @searchLast varchar(max) = 'vann';

SELECT firstname, lastname,
    -- SOUNDEX codes, DIFFERENCE scores (0-4, 4 = closest) and length deltas
    SOUNDEX(@searchFirst) as FSearchSoundEx,
    SOUNDEX(firstname) as FSoundEx,
    DIFFERENCE(firstname, @searchFirst) as FDiff,
    LEN(firstName) - LEN(@searchFirst) as FFDelta,

    SOUNDEX(lastname) as LSoundEx,
    SOUNDEX(@searchLast) as LSearchSoundEx,
    DIFFERENCE(lastName, @searchLast) as LDiff,
    LEN(lastName) - LEN(@searchLast) as LLDelta,

    -- position of each search term in each column (0 = not found)
    PATINDEX('%' + @searchFirst + '%', firstname) as FFIndex,
    PATINDEX('%' + @searchFirst + '%', lastname) as FLIndex,
    PATINDEX('%' + @searchLast + '%', firstname) as LFIndex,
    PATINDEX('%' + @searchLast + '%', lastname) as LLIndex,

    -- substring-match flags (1 = the term occurs in the column)
    CONVERT(BIT, PATINDEX('%' + @searchFirst + '%', firstname)) as HasFF,
    CONVERT(BIT, PATINDEX('%' + @searchFirst + '%', lastname)) as HasFL,
    CONVERT(BIT, PATINDEX('%' + @searchLast + '%', firstname)) as HasLF,
    CONVERT(BIT, PATINDEX('%' + @searchLast + '%', lastname)) as HasLL,

    -- squared DIFFERENCE for each (search term, column) pair
    DIFFERENCE(firstname, @searchFirst) * DIFFERENCE(firstname, @searchFirst) as FFDiffSq,
    DIFFERENCE(lastname, @searchFirst) * DIFFERENCE(lastname, @searchFirst) as FLDiffSq,
    DIFFERENCE(firstname, @searchLast) * DIFFERENCE(firstname, @searchLast) as LFDiffSq,
    DIFFERENCE(lastname, @searchLast) * DIFFERENCE(lastname, @searchLast) as LLDiffSq,

    DIFFERENCE(firstname, @searchFirst) * DIFFERENCE(firstname, @searchFirst)
    + DIFFERENCE(lastname, @searchFirst) * DIFFERENCE(lastname, @searchFirst)
    + DIFFERENCE(firstname, @searchLast) * DIFFERENCE(firstname, @searchLast)
    + DIFFERENCE(lastname, @searchLast) * DIFFERENCE(lastname, @searchLast) as SumDiffSquares,

    -- weighted total: first-to-first and last-to-last pairs count more
    DIFFERENCE(firstname, @searchFirst) * DIFFERENCE(firstname, @searchFirst) * 2
    + DIFFERENCE(lastname, @searchFirst) * DIFFERENCE(lastname, @searchFirst)
    + DIFFERENCE(firstname, @searchLast) * DIFFERENCE(firstname, @searchLast)
    + DIFFERENCE(lastname, @searchLast) * DIFFERENCE(lastname, @searchLast) * 2
    + CONVERT(BIT, PATINDEX('%' + @searchFirst + '%', firstname)) * 4
    + CONVERT(BIT, PATINDEX('%' + @searchFirst + '%', lastname))
    + CONVERT(BIT, PATINDEX('%' + @searchLast + '%', firstname))
    + CONVERT(BIT, PATINDEX('%' + @searchLast + '%', lastname)) * 4 as TotalRank

FROM Contacts

ORDER BY TotalRank Desc, HasLL Desc, HasFF Desc, HasFL Desc, HasLF Desc, LLIndex, FFIndex, FLIndex, LFIndex, LLDelta, FFDelta
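
If only the final score is needed, the query above could be trimmed along the following lines. This is a sketch rather than the answerer's own code: it keeps the same weights for TotalRank, assumes the same Contacts table, introduces the hypothetical aliases d and r, and drops the secondary tie-breakers from the ORDER BY for brevity.

```sql
DECLARE @searchFirst varchar(max) = 'chris';
DECLARE @searchLast  varchar(max) = 'vann';

SELECT c.firstname, c.lastname, r.TotalRank
FROM Contacts AS c
CROSS APPLY (
    -- Compute each phonetic score and substring flag once per row.
    SELECT
        DIFFERENCE(c.firstname, @searchFirst) AS FF,
        DIFFERENCE(c.lastname,  @searchFirst) AS FL,
        DIFFERENCE(c.firstname, @searchLast)  AS LF,
        DIFFERENCE(c.lastname,  @searchLast)  AS LL,
        CASE WHEN PATINDEX('%' + @searchFirst + '%', c.firstname) > 0 THEN 1 ELSE 0 END AS HasFF,
        CASE WHEN PATINDEX('%' + @searchFirst + '%', c.lastname)  > 0 THEN 1 ELSE 0 END AS HasFL,
        CASE WHEN PATINDEX('%' + @searchLast  + '%', c.firstname) > 0 THEN 1 ELSE 0 END AS HasLF,
        CASE WHEN PATINDEX('%' + @searchLast  + '%', c.lastname)  > 0 THEN 1 ELSE 0 END AS HasLL
) AS d
CROSS APPLY (
    -- Same arbitrary weights as the full query: * 2 on the squared
    -- first-first and last-last scores, * 4 on the matching flags.
    SELECT d.FF * d.FF * 2 + d.FL * d.FL + d.LF * d.LF + d.LL * d.LL * 2
         + d.HasFF * 4 + d.HasFL + d.HasLF + d.HasLL * 4 AS TotalRank
) AS r
ORDER BY r.TotalRank DESC;
```

Using CASE instead of CONVERT(BIT, ...) yields plain integers, which keeps the arithmetic independent of how bit values are implicitly converted.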