CONTAINSTABLE performance with a "1-1" search

Date: 2012-11-13 19:31:17

Tags: sql sql-server full-text-search containstable

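I am using a query with the CONTAINSTABLE function and a search string such as "1-1" or something similar (for example "1 1" or "a a"). The problem is that the query takes too long and does not return many results. By contrast, the same query with a different search string, such as "a", returns more results and takes less time to complete. Here is the query: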

SELECT     COUNT(d.DocumentID) 
FROM       KnowledgeIndex_Data.dbo.Document d
INNER JOIN CONTAINSTABLE ( KnowledgeIndex_Data.dbo.Document , * , '"1-1"' ) ftt 
        ON ( d.DocumentID = ftt.[Key] )

Note: the stoplist of the full-text index does not contain 1.

Do you know what could be happening? Thanks!

This is the execution plan:

[Image: Plan]

This is the creation script for the Document table:

CREATE TABLE dbo.Document
(
      DocumentID int IDENTITY (1, 1) NOT NULL -- Local int for cross reference tables to save 12 bytes per record
    , DocumentGUID uniqueidentifier NOT NULL

--  , DocumentTypeID tinyint NOT NULL
    , DocumentSourceID smallint NOT NULL -- New Document Source identifier
    , SourceDocumentID nvarchar(80) NOT NULL --crb 2011/08/23 changed from nvarchar(40) to support PageCodes -- asw 2010/02/12 renamed to make purpose more clear

    , DocumentStructureID tinyint NOT NULL -- New Document Structure identifier

    , SortOrder nvarchar(450) NOT NULL -- 2010/06/18 bdw- Add the Sort Order column and index to the Document table

    , ResultDisplayContent xml (DOCUMENT DocumentResultDisplayContentSchemaCollection) NOT NULL  -- Required For All DocumentTypes -- jci 2011/02/22 DOCUMENT added -- jci 2010/07/02 xml schema added
    , DetailDisplayContent xml (DOCUMENT DocumentDetailDisplayContentSchemaCollection) NULL -- Only required for some DocumentTypes -- jci 2011/02/22 DOCUMENT added  -- jci 2011/0/31 xml schema added
    , TeaserDisplayContent xml (DOCUMENT DocumentResultDisplayContentSchemaCollection) NULL -- Teaser Result data. Optional, replaced with main ResultDisplayContent if null. -- jci 2011/02/22 DOCUMENT added -- jci 2010/07/02 xml schema added

, TitleQueryContent nvarchar(max) NOT NULL
, QueryContent nvarchar(max) NOT NULL

, CreatedAt datetimeoffset(2) NOT NULL

, CONSTRAINT pcDocument PRIMARY KEY CLUSTERED -- jci 2011/07/01 replaced -- CONSTRAINT pncDocument PRIMARY KEY NONCLUSTERED
    ( DocumentID ) WITH FILLFACTOR = 100
, CONSTRAINT fkDocumentDocumentSourceID FOREIGN KEY
    ( DocumentSourceID )
    REFERENCES dbo.DocumentSource ( DocumentSourceID )
    ON DELETE CASCADE
, CONSTRAINT fkDocumentDocumentStructureID FOREIGN KEY
    ( DocumentStructureID )
    REFERENCES dbo.DocumentStructure ( DocumentStructureID )
    ON DELETE CASCADE
)
GO

And the index:

-- Create Index On Table
CREATE FULLTEXT INDEX ON dbo.Document(QueryContent LANGUAGE N'English' , TitleQueryContent LANGUAGE N'English')
    KEY INDEX pcDocument -- 2011/07/01 replaced --pncDocument
    ON (FILEGROUP SECONDARY)
    WITH STOPLIST = SrsStopWordList -- Use SrsStopWordList
        , CHANGE_TRACKING = OFF , NO POPULATION; -- Update Manually For Performance

GO

2 answers:

Answer 0: (score: 0)

Running sys.dm_fts_parser on the search term produces the following result:

SELECT * FROM sys.dm_fts_parser('"1-1"', 1033, 0, 0)

display_term    expansion_type  source_term
1               0               1-1
nn1             0               1-1
1               0               1-1
nn1             0               1-1
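So the full-text engine ends up searching for 4 different search terms and then combines the results. Can you run sys.dm_fts_index_keywords against your table with display_term LIKE '1' or 'nn1' and share the results? That might help explain the long run time.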
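The keyword check could look something like the sketch below. It assumes the database and table names from the question, and it reads the document_count column that sys.dm_fts_index_keywords returns, so you can see how many indexed documents contain each term rather than just how many keyword rows match.

```sql
-- Sketch only: database and table names are taken from the question.
USE KnowledgeIndex_Data;
GO

-- One row per (keyword, column); document_count shows how many
-- documents the engine has to visit for that keyword.
SELECT display_term
     , column_id
     , document_count
FROM   sys.dm_fts_index_keywords(DB_ID('KnowledgeIndex_Data'), OBJECT_ID('dbo.Document'))
WHERE  display_term IN ('1', 'nn1')
ORDER BY document_count DESC;
GO
```

If document_count for '1' or 'nn1' is close to the total number of rows in the table, the engine is merging very large posting lists for each term, which would be consistent with the slow query.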

Answer 1: (score: 0)

I ran the query as follows:

   SELECT COUNT(*)
   FROM   sys.dm_fts_index_keywords(DB_ID('KnowledgeIndex_Data'), OBJECT_ID('dbo.Document'))
   WHERE  display_term LIKE '%1-1%'
   GO

It returns 685053. With '%nn1%' it returns 413578, and with '%engine%' it returns 2500.

Note that 1 is not a noise word in my full-text index. Do you think it could be related to that?
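One way to double-check the stoplist is to query the catalog views directly. This is a minimal sketch that assumes the stoplist name SrsStopWordList from the index script above and that it runs in the database that owns the stoplist; an empty result means neither term is treated as a noise word.

```sql
-- Sketch only: assumes the stoplist is named SrsStopWordList, as in the
-- CREATE FULLTEXT INDEX script, and lives in the current database.
SELECT sl.name        AS stoplist_name
     , sw.stopword
     , sw.language_id
FROM   sys.fulltext_stoplists AS sl
INNER JOIN sys.fulltext_stopwords AS sw
        ON sw.stoplist_id = sl.stoplist_id
WHERE  sl.name = N'SrsStopWordList'
  AND  sw.stopword IN (N'1', N'nn1');
```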

Is it possible to run CONTAINSTABLE against only part of the table instead of the whole table?

CONTAINSTABLE searches the whole dbo.Document table. In practice, after that query I apply a WHERE on one of the Document fields, which means CONTAINSTABLE does unnecessary work. Thanks!
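For reference, the shape of the query being described is roughly the sketch below. As far as I can tell, CONTAINSTABLE has no argument that restricts it to a subset of rows, so the row filter can only be applied after the join. The sketch narrows the search to a single indexed column instead of *, which is an assumption that matches in TitleQueryContent are not needed, and the DocumentSourceID value is purely hypothetical.

```sql
-- Sketch only: the column choice and the filter value are assumptions
-- made for illustration, not part of the original query.
SELECT     COUNT(d.DocumentID)
FROM       KnowledgeIndex_Data.dbo.Document AS d
INNER JOIN CONTAINSTABLE ( KnowledgeIndex_Data.dbo.Document , QueryContent , '"1-1"' ) AS ftt
        ON ( d.DocumentID = ftt.[Key] )
WHERE      d.DocumentSourceID = 1;  -- hypothetical filter on a Document field
```

The full-text part still evaluates the whole index for the chosen column; the WHERE clause only trims the rows that survive the join.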