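SQL Server 2k8 full-text search across "unrelated" tables: use a view, or?

Time: 2012-05-17 11:36:06

Tags: sql-server-2008 full-text-search full-text-catalog

I'm new to full-text search, and I'd really like to know the best way to run a "site search" style full-text search across multiple unrelated tables (I plan to do this across 4 tables). I was thinking of using a view like this: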

CREATE VIEW [dbo].[Search] WITH SCHEMABINDING
AS

    SELECT   p.ProductId AS ItemId
            ,'Product' AS ItemType
            ,p.Title AS ItemTitle
            ,p.LongDescription AS LongDescription
            ,p.Price AS Price
    FROM dbo.Product AS p
    WHERE p.IsActive = 1

    UNION

    SELECT   a.ArticleId AS ItemId
            ,'Article' AS ItemType
            ,a.ArticleTitle AS ItemTitle
            ,a.Contents AS LongDescription
            ,NULL AS Price
    FROM dbo.Article AS a
    WHERE a.IsActive = 1

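But while researching the correct syntax for the index, I realized that (a) I need a unique index and (b) a view that uses UNION apparently cannot be used to create a full-text index...

The other approach I've seen is to create a full-text index on each table and then, in a stored procedure, UNION the results into a temp table and select from that temp table with ORDER BY rank.

I'd really appreciate some guidance on this. Most of what I've found deals with multiple related tables, where a view with joins is enough to get around the problem.

Edit:

@Joe was kind enough to answer this question, which I had forgotten about and had actually already solved, although I was worried my solution was a bit long-winded. It seems to be the more logical of the two approaches he suggests. Here is what I'm using. I had completely forgotten that I'd have to page the results too... I don't think the client would be thrilled with an endless list of results...

A colleague of mine also suggested another technique that he has already implemented: pull metadata out of the tables, essentially cache it in another table, and run the full-text search against that. It's not a bad approach if you know what your metadata will be. You would also need to key it back to the original tables to fetch the actual result for display (for example, the full article, if needed).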

CREATE PROCEDURE [dbo].[up_Search]
     @Term VARCHAR(100)
    ,@Skip INT = 0
    ,@Take INT = 10
AS
DECLARE @Search TABLE
(
     ItemId INT
    ,ItemType VARCHAR(50)
    ,ItemTitle VARCHAR(100)
    ,LongDescription VARCHAR(MAX)
    ,Price DECIMAL(10,2)
    ,SearchRank INT
)

INSERT INTO @Search SELECT * FROM (

    SELECT   p.ProductId AS ItemId
            ,'Product' AS ItemType
            ,p.Title AS ItemTitle
            ,p.LongDescription AS LongDescription
            ,p.Price AS Price
            ,KEY_TBL.RANK AS SearchRank
    FROM dbo.Product AS p
    INNER JOIN CONTAINSTABLE(dbo.Product, Title, @Term) AS KEY_TBL ON p.ProductId = KEY_TBL.[KEY]
    WHERE p.IsActive = 1

    UNION

    SELECT   a.ArticleId AS ItemId
            ,'Article' AS ItemType
            ,a.ArticleTitle AS ItemTitle
            ,a.Contents AS LongDescription
            ,NULL AS Price
            ,KEY_TBL.RANK AS SearchRank
    FROM dbo.Article AS a
    INNER JOIN CONTAINSTABLE(dbo.Article, ArticleTitle, @Term) AS KEY_TBL ON a.ArticleId = KEY_TBL.[KEY]
    WHERE a.IsActive = 1

    UNION    

    SELECT   n.NewsId AS ItemId
            ,'News' AS ItemType
            ,n.NewsTitle AS ItemTitle
            ,n.Contents AS LongDescription
            ,NULL AS Price
            ,KEY_TBL.RANK AS SearchRank
    FROM dbo.News AS n
    INNER JOIN CONTAINSTABLE(dbo.News, NewsTitle, @Term) AS KEY_TBL ON n.NewsId = KEY_TBL.[KEY]
    WHERE n.IsActive = 1

    UNION

    SELECT   b.BusinessId AS ItemId
            ,bt.Title AS ItemType
            ,b.Title AS ItemTitle
            ,b.LongDescription AS LongDescription
            ,NULL AS Price
            ,KEY_TBL.RANK AS SearchRank
    FROM dbo.Business AS b
    INNER JOIN CONTAINSTABLE(dbo.Business, Title, @Term) AS KEY_TBL ON b.BusinessId = KEY_TBL.[KEY]
    INNER JOIN dbo.BusinessType AS bt ON b.BusinessTypeId = bt.BusinessTypeId
    WHERE b.IsActive = 1
) AS tmp;

WITH SearchCT AS
(
    SELECT   ItemId
            ,ItemType
            ,ItemTitle
            ,LongDescription
            ,Price
            ,SearchRank
            ,ROW_NUMBER() OVER (ORDER BY SearchRank DESC) AS RowNumber
            ,COUNT(*) OVER () AS RecordCount
    FROM @Search
)
SELECT ItemId, ItemType, ItemTitle, LongDescription, SearchRank, RowNumber, RecordCount
FROM SearchCT
WHERE RowNumber BETWEEN @Skip + 1 AND (@Skip + @Take)
ORDER BY RowNumber

RETURN 0

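1 answer:

Answer 0 (score: 2)

I think there are two basic approaches you can take:

1) Aggregate the four tables into a single table and search that table. This table needs a unique identifier as its primary key. The table structure would therefore resemble the indexed view you were considering and look something like this: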

CREATE TABLE AggregatedTable
(
    Id int IDENTITY(1,1) primary key,
    ItemId int,
    ItemType nvarchar(50),
    ItemTitle nvarchar(255),
    LongDescription nvarchar(max),
    IsActive int
)

You would then need to create a full-text index on the LongDescription column.
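
That step could be sketched roughly as follows, assuming the AggregatedTable definition above. The catalog name SearchCatalog and the index name UX_AggregatedTable_Id are hypothetical; the extra unique index exists only so that KEY INDEX has a known name to point at, since the inline primary key gets a system-generated name. Indexing ItemTitle as well is an assumption that goes slightly beyond the text:

```sql
-- Hypothetical names: SearchCatalog, UX_AggregatedTable_Id.
CREATE FULLTEXT CATALOG SearchCatalog;
GO

-- KEY INDEX requires a unique, single-column, non-nullable index.
-- Id is the primary key, so a named unique index on it qualifies.
CREATE UNIQUE INDEX UX_AggregatedTable_Id ON dbo.AggregatedTable (Id);
GO

-- 1033 is the LCID for US English.
CREATE FULLTEXT INDEX ON dbo.AggregatedTable
(
    ItemTitle       LANGUAGE 1033,
    LongDescription LANGUAGE 1033
)
KEY INDEX UX_AggregatedTable_Id
ON SearchCatalog
WITH CHANGE_TRACKING AUTO;
GO
```

With CHANGE_TRACKING AUTO the full-text index is kept up to date as rows change, which suits a table that is reloaded by a job.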

The advantage of this approach is that you can run the full-text search against a single table in a single query, for example:

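SELECT a.Id, a.ItemId, a.ItemType, ct.[RANK]
    FROM dbo.AggregatedTable AS a INNER JOIN
    -- LANGUAGE is required here: a bare integer would be read as top_n_by_rank
    CONTAINSTABLE (dbo.AggregatedTable, *, '(light NEAR aluminum)', LANGUAGE 1033) AS ct
        ON a.Id = ct.[KEY]   -- [KEY] is the full-text key, i.e. the primary key Id
WHERE a.IsActive = 1
ORDER BY ct.[RANK] DESC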

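The disadvantages of this approach are:
1. You have to run a job periodically to load the data from the 4 base tables into the aggregated table.
2. You will use twice the disk space.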
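
The periodic load job might look something like the sketch below. It assumes the AggregatedTable definition above and the column names used in the question's stored procedure; the procedure name up_ReloadAggregatedTable is hypothetical, and a full delete-and-reload is just the simplest possible strategy, not necessarily the most efficient one:

```sql
-- Hypothetical reload procedure; schedule it with SQL Server Agent or similar.
CREATE PROCEDURE dbo.up_ReloadAggregatedTable
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;  -- roll back the whole reload if any statement fails

    BEGIN TRANSACTION;

    -- Simplest strategy: empty the table and rebuild it from the base tables.
    DELETE FROM dbo.AggregatedTable;

    INSERT INTO dbo.AggregatedTable (ItemId, ItemType, ItemTitle, LongDescription, IsActive)
    SELECT p.ProductId, 'Product', p.Title, p.LongDescription, p.IsActive
    FROM dbo.Product AS p
    UNION ALL
    SELECT a.ArticleId, 'Article', a.ArticleTitle, a.Contents, a.IsActive
    FROM dbo.Article AS a
    UNION ALL
    SELECT n.NewsId, 'News', n.NewsTitle, n.Contents, n.IsActive
    FROM dbo.News AS n
    UNION ALL
    SELECT b.BusinessId, bt.Title, b.Title, b.LongDescription, b.IsActive
    FROM dbo.Business AS b
    INNER JOIN dbo.BusinessType AS bt ON b.BusinessTypeId = bt.BusinessTypeId;

    COMMIT TRANSACTION;
END
```

Because every reload assigns new identity values to Id, anything that needs a stable reference should use the ItemType and ItemId pair rather than Id.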

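The second approach is to keep the data in the four separate tables and write a full-text search query that UNIONs the results from all four. You should be able to rank the results by relevance and then take the top N most relevant. You would write the query along these lines: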

SELECT   p.ProductId AS ItemId, 'Product' AS ItemType, ct.[RANK] AS [Rank]
    FROM dbo.Product AS p INNER JOIN
    -- LANGUAGE is required here: a bare integer would be read as top_n_by_rank
    CONTAINSTABLE (dbo.Product, *, '(light NEAR aluminum)', LANGUAGE 1033) AS ct
        ON p.ProductId = ct.[KEY]
WHERE p.IsActive = 1
UNION
SELECT   a.ArticleId AS ItemId, 'Article' AS ItemType, ct.[RANK] AS [Rank]
    FROM dbo.Article AS a INNER JOIN
    CONTAINSTABLE (dbo.Article, *, '(light NEAR aluminum)', LANGUAGE 1033) AS ct
        ON a.ArticleId = ct.[KEY]
WHERE a.IsActive = 1
-- UNION ... other two tables, following the same pattern
ORDER BY [Rank] DESC   -- a single ORDER BY at the end sorts the whole UNION

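The advantage of this approach is that you don't need a job to aggregate the contents of the four tables into one.

The disadvantage is that your queries are more complex, because they have to UNION the results of four separate queries.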
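
One way to keep that complexity contained while still getting the top N results is to wrap the UNION in a derived table. This is only a sketch: the variables @Term and @TopN and the sample search term are assumptions, and only two of the four tables are shown. UNION ALL is used because the ItemType literal already keeps the branches distinct, so there is nothing to deduplicate:

```sql
DECLARE @Term nvarchar(100) = N'"aluminum"';  -- hypothetical search condition
DECLARE @TopN int = 10;                       -- hypothetical page size

SELECT TOP (@TopN) r.ItemId, r.ItemType, r.[Rank]
FROM (
    SELECT p.ProductId AS ItemId, 'Product' AS ItemType, ct.[RANK] AS [Rank]
    FROM dbo.Product AS p
    INNER JOIN CONTAINSTABLE(dbo.Product, *, @Term) AS ct
        ON p.ProductId = ct.[KEY]
    WHERE p.IsActive = 1

    UNION ALL

    SELECT a.ArticleId, 'Article', ct.[RANK]
    FROM dbo.Article AS a
    INNER JOIN CONTAINSTABLE(dbo.Article, *, @Term) AS ct
        ON a.ArticleId = ct.[KEY]
    WHERE a.IsActive = 1
    -- the other two tables would be added as further UNION ALL branches
) AS r
ORDER BY r.[Rank] DESC;
```

Keep in mind that RANK values are relative to the query that produced them, so ordering across four separate full-text indexes gives a rough relevance order rather than a strictly comparable score.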

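I lean toward the second approach. I think it is more straightforward and easier to maintain, and the UNION query is simple enough to build.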