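I'm using full-text search on a SQL Server database to return results from multiple tables. The simplest case is searching a person's name fields and a description field. The code I use for this looks like: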
select t.ProjectID as ProjectID, sum(t.rnk) as weightRank
from
(
select KEY_TBL.RANK * 1.0 as rnk, FT_TBL.ProjectID as ProjectID
FROM Projects as FT_TBL
INNER JOIN FREETEXTTABLE(Projects, Description, @SearchText) AS KEY_TBL
ON FT_TBL.ProjectID=KEY_TBL.[KEY]
union all
select KEY_TBL.RANK * 50 as rnk, FT_TBL.ProjectID as ProjectID
FROM Projects as FT_TBL
... <-- complex unimportant join
INNER JOIN People as p on pp.PersonID = p.PersonID
INNER JOIN FREETEXTTABLE(People, (FirstName, LastName), @SearchText) AS KEY_TBL
ON p.PersonID=KEY_TBL.[KEY]
) as t
group by t.ProjectID
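As is (hopefully) clear from the above, I'm trying to weight a match on a person's name much more heavily than a match in the project description field. If I search for something like 'john', all projects with a person named john on them are weighted heavily (as expected). The problem I'm having is when someone searches with a full name like 'john smith'. In that case the match on the name is much weaker, (I guess) because only half of the search terms match in each of the firstname / lastname columns. In many cases this means that a person whose name exactly matches the input is not necessarily returned near the top of the search results.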
I've been able to correct this by searching each of the firstname / lastname fields separately and adding their scores together, so my new query looks like this:
select t.ProjectID as ProjectID, sum(t.rnk) as weightRank
from
(
select KEY_TBL.RANK * 1.0 as rnk, FT_TBL.ProjectID as ProjectID
FROM Projects as FT_TBL
INNER JOIN FREETEXTTABLE(Projects, Description, @SearchText) AS KEY_TBL
ON FT_TBL.ProjectID=KEY_TBL.[KEY]
union all
select KEY_TBL.RANK * 50 as rnk, FT_TBL.ProjectID as ProjectID
FROM Projects as FT_TBL
... <-- complex unimportant join
INNER JOIN People as p on pp.PersonID = p.PersonID
INNER JOIN FREETEXTTABLE(People, (FirstName), @SearchText) AS KEY_TBL
ON p.PersonID=KEY_TBL.[KEY]
union all
select KEY_TBL.RANK * 50 as rnk, FT_TBL.ProjectID as ProjectID
FROM Projects as FT_TBL
... <-- complex unimportant join
INNER JOIN People as p on pp.PersonID = p.PersonID
INNER JOIN FREETEXTTABLE(People, (LastName), @SearchText) AS KEY_TBL
ON p.PersonID=KEY_TBL.[KEY]
) as t
group by t.ProjectID
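My question:
Is this the approach I should be taking, or is there some way to get full-text search to operate on a list of columns as if it were one blob of text: i.e. treat the firstname and lastname columns as a single combined column, so that a string containing both the person's first name and last name scores higher?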
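Answer 0: (score: 2)
I ran into this recently and used a computed column to concatenate the required columns into one string, then put a full-text index on that column.
I achieved the weighting by repeating the weighted fields inside the computed column.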
I.e. in the definition below the surname appears three times and the forename twice.
ALTER TABLE dbo.person ADD
PrimarySearchColumn AS
COALESCE(NULLIF(forename,'') + ' ' + forename + ' ', '') +
COALESCE(NULLIF(surname,'') + ' ' + surname + ' ' + surname + ' ', '') PERSISTED
You must make sure to use the PERSISTED keyword so that the column is not recomputed on every read.
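The answer mentions putting a full-text index on the computed column but does not show that step. A minimal sketch of what it might look like follows. The catalog name PersonSearchCatalog and the key index name PK_person are assumptions for illustration (neither appears in the answer); the key index has to be a unique, single-column, non-nullable index on dbo.person.

```sql
-- Assumption: PersonSearchCatalog is a hypothetical catalog name.
CREATE FULLTEXT CATALOG PersonSearchCatalog;
GO

-- Assumption: PK_person is a hypothetical name for the unique,
-- single-column, non-nullable index on dbo.person.
CREATE FULLTEXT INDEX ON dbo.person (PrimarySearchColumn)
    KEY INDEX PK_person
    ON PersonSearchCatalog;
GO

-- Rank the whole name as one piece of text.
DECLARE @SearchText NVARCHAR(100) = N'john smith';

SELECT KEY_TBL.[KEY]  AS PersonKey,   -- value of the key index column
       KEY_TBL.[RANK] AS NameRank
FROM FREETEXTTABLE(dbo.person, PrimarySearchColumn, @SearchText) AS KEY_TBL
ORDER BY KEY_TBL.[RANK] DESC;
```

Because both names now live in one indexed column, a search for a full name is matched against a single piece of text instead of half-matching each of two columns.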
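Answer 1: (score: 0)
I know this is an old question, but I ran into the same problem and solved it a different way.
Instead of adding a computed column to the original table (which may not always be possible), I created an indexed view containing the combined fields. Using the original example: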
CREATE VIEW [dbo].[v_PeopleFullName]
WITH SCHEMABINDING
AS SELECT dbo.People.PersonID, ISNULL(dbo.People.FirstName + ' ', '') + dbo.People.LastName AS FullName
FROM dbo.People
GO
CREATE UNIQUE CLUSTERED INDEX UQ_v_PeopleFullName
ON dbo.[v_PeopleFullName] ([PersonID])
GO
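The view also needs its own full-text index before CONTAINSTABLE can query it; the answer does not show that statement, so the following is only a sketch. The catalog name PeopleSearchCatalog is an assumption and is expected to exist already; the key index is the unique clustered index created above.

```sql
-- Assumption: PeopleSearchCatalog is a hypothetical, already existing
-- full-text catalog. UQ_v_PeopleFullName is the unique clustered index
-- on the view, used here as the full-text key index.
CREATE FULLTEXT INDEX ON dbo.v_PeopleFullName (FullName)
    KEY INDEX UQ_v_PeopleFullName
    ON PeopleSearchCatalog;
GO
```

Since the key index is on PersonID, the [KEY] column returned by a full-text query against the view is the PersonID, which is what allows the join back to People.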
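I then joined that view into the query alongside the existing full-text predicate on the individual columns of the base table, so that both exact-phrase matches and partial matches on the individual columns are found, like so: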
DECLARE @SearchText NVARCHAR(100) = ' "' + @OriginalSearchText + '" ' --For matching exact phrase
DECLARE @SearchTextWords NVARCHAR(100) = ' "' + REPLACE(@OriginalSearchText, ' ', '" OR "') + '" ' --For matching on words in phrase
SELECT FT_TBL.ProjectID as ProjectID,
ISNULL(KEY_TBL.[Rank], 0) + ISNULL(KEY_VIEW.[Rank], 0) AS [Rank]
FROM Projects as FT_TBL
INNER JOIN People as p on FT_TBL.PersonID = p.PersonID
LEFT OUTER JOIN CONTAINSTABLE(People, (FirstName, LastName), @SearchTextWords) AS KEY_TBL ON p.PersonID = KEY_TBL.[KEY]
LEFT OUTER JOIN CONTAINSTABLE(v_PeopleFullName, FullName, @SearchText) AS KEY_VIEW ON p.PersonID = KEY_VIEW.[Key]
WHERE ISNULL(KEY_TBL.[Rank], 0) + ISNULL(KEY_VIEW.[Rank], 0) > 0
ORDER BY [Rank] DESC
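Some caveats:
- I'm using CONTAINSTABLE rather than FREETEXTTABLE because it seemed more appropriate for searching on names. When what I'm searching for is a name, I'm not interested in finding words with a similar meaning or inflectional forms of a word.
- Because I'm using CONTAINSTABLE, I have to do some preprocessing on the @SearchText variable to make it compatible, and to break it up into individual words joined with the OR operator for searching the full-text index on the base table.
- Rather than using a UNION query to combine separate queries, I join to both CONTAINSTABLE predicates within a single query. That means using outer joins rather than inner joins, so I use the WHERE clause to exclude any records from the base table that don't match on either full-text index. I'll admit I haven't checked how this performs compared with UNIONing separate queries, each with a single full-text predicate, into one result set.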
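To make the preprocessing step concrete, here is a small sketch of what the two search conditions evaluate to for the input 'john smith'. The quote-stripping line is an added precaution that is not part of the answer; it is there so an embedded double quote cannot break the quoted terms.

```sql
DECLARE @OriginalSearchText NVARCHAR(100) = N'john smith';

-- Added precaution (not in the answer): remove embedded double quotes.
SET @OriginalSearchText = REPLACE(@OriginalSearchText, '"', '');

-- Exact phrase, used against the view's FullName column.
DECLARE @SearchText NVARCHAR(100) = ' "' + @OriginalSearchText + '" ';

-- Individual words joined with OR, used against FirstName / LastName.
DECLARE @SearchTextWords NVARCHAR(100) =
    ' "' + REPLACE(@OriginalSearchText, ' ', '" OR "') + '" ';

SELECT @SearchText      AS PhraseCondition,  --  "john smith"
       @SearchTextWords AS WordsCondition;   --  "john" OR "smith"
```

Input containing consecutive spaces would produce an empty quoted term between two ORs, which CONTAINSTABLE would likely reject, so collapsing repeated whitespace before this step is probably worthwhile.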