Term matching with SQL

Time: 2018-06-25 05:31:24

Tags: sql sql-server text-processing

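I want to split text stored in the database and then check whether every term I search for appears in that text.

For example, "this is a cat" is the text in the database. If I search for "cat" or "is cat", it should return the data, but if I search for "ca" or "cat a", it should return nothing.

I can do this in application code, but I would like to know whether it can be done in a query.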

2 answers:

Answer 0: (score: 0)

Maybe something like this?

DECLARE @text NVARCHAR(400) = 'this is a cat'  
DECLARE @search NVARCHAR(400) = 'a cat'  

SELECT t.value  
FROM STRING_SPLIT(@text, ' ') t
join STRING_SPLIT(@search, ' ') s
on t.value = s.value
WHERE RTRIM(t.value) <> '' and RTRIM(s.value) <> '';

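You can then compare the row count of this result with the number of terms produced by splitting the search string; if the two are equal, every search term was found in the text.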
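The count comparison described above could be sketched roughly as follows. This is only one possible reading of the idea, not necessarily what the answer's author had in mind: the column alias `is_match` is made up for illustration, and `COUNT(DISTINCT ...)` is an added assumption so that repeated words in either string do not skew the counts.

```sql
DECLARE @text   NVARCHAR(400) = N'this is a cat';
DECLARE @search NVARCHAR(400) = N'a cat';

-- is_match (hypothetical alias) is 1 when every distinct, non-empty search
-- term also occurs as a whole word in @text, and 0 otherwise.
SELECT CASE WHEN
         (SELECT COUNT(DISTINCT s.value)           -- search terms found in the text
          FROM STRING_SPLIT(@search, N' ') AS s
          JOIN STRING_SPLIT(@text, N' ') AS t
            ON t.value = s.value
          WHERE s.value <> N'')
       = (SELECT COUNT(DISTINCT s.value)           -- all search terms
          FROM STRING_SPLIT(@search, N' ') AS s
          WHERE s.value <> N'')
       THEN 1 ELSE 0 END AS is_match;
```

Note that this check ignores word order, so a search for 'cat a' would match as well, and an empty search string counts as a match because both counts are 0. Whether the comparison is case-sensitive depends on the collation in effect.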

You need SQL Server 2016 or later (with database compatibility level 130 or higher) to use STRING_SPLIT.

On earlier versions of SQL Server you need to create a UDF to split the string, such as this one:

CREATE FUNCTION dbo.fnSplit(
    @sInputList VARCHAR(8000) -- List of delimited items
  , @sDelimiter VARCHAR(8000) = ',' -- delimiter that separates items
) RETURNS @List TABLE (item VARCHAR(8000))

BEGIN
DECLARE @sItem VARCHAR(8000)
WHILE CHARINDEX(@sDelimiter,@sInputList,0) <> 0
 BEGIN
 SELECT
  @sItem=RTRIM(LTRIM(SUBSTRING(@sInputList,1,CHARINDEX(@sDelimiter,@sInputList,0)-1))),
  @sInputList=RTRIM(LTRIM(SUBSTRING(@sInputList,CHARINDEX(@sDelimiter,@sInputList,0)+LEN(@sDelimiter),LEN(@sInputList))))

 IF LEN(@sItem) > 0
  INSERT INTO @List SELECT @sItem
 END

IF LEN(@sInputList) > 0
 INSERT INTO @List SELECT @sInputList -- Put the last item in
RETURN
END
GO

The query above then becomes:

DECLARE @text NVARCHAR(400) = 'this is a cat'  
DECLARE @search NVARCHAR(400) = 'a cat'  

SELECT t.Item  
FROM fnSplit(@text, ' ') t
join fnSplit(@search, ' ') s
on t.Item = s.Item
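Both queries above work on variables. To filter the rows of an actual table, one option is to split each row's text inside a correlated subquery and keep the row only when no search term is left over. The following is a minimal sketch under stated assumptions: the table variable `@docs` and its columns `id` and `body` are hypothetical, and `EXCEPT` is swapped in for the count comparison. It uses STRING_SPLIT; on older versions a splitter such as fnSplit could presumably take its place.

```sql
-- Hypothetical sample data standing in for the real table.
DECLARE @docs TABLE (id INT, body NVARCHAR(400));
INSERT INTO @docs (id, body)
VALUES (1, N'this is a cat'),
       (2, N'this is a dog');

DECLARE @search NVARCHAR(400) = N'is cat';

-- Keep a row only if the set "search terms EXCEPT words of the row" is empty,
-- i.e. every non-empty search term occurs as a whole word in body.
SELECT d.id, d.body
FROM @docs AS d
WHERE NOT EXISTS (
    SELECT s.value
    FROM STRING_SPLIT(@search, N' ') AS s
    WHERE s.value <> N''
    EXCEPT
    SELECT t.value
    FROM STRING_SPLIT(d.body, N' ') AS t
);
```

With the sample data only row 1 is returned, since 'cat' does not occur in row 2. As with the join-based version, word order is not checked, and an empty search string matches every row.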

Answer 1: (score: 0)

You can try the code below; I hope it helps.

Declare @InputString Varchar(50) = 'a cat'  --<-- Search string coming in

Declare @Table TABLE (Strings Varchar(50))  --<-- Text stored in the database
Insert Into @Table Values ('this is a cat')

-- Convert the input string to XML, one <r> element per word
declare @xml xml = N'<root><r>' + replace(@InputString, ' ','</r><r>') + '</r></root>';


WITH DBString AS (

-- Split string stored in the database

SELECT  RTRIM(LTRIM(Split.a.value('.', 'VARCHAR(100)'))) Strings 
FROM   
(SELECT Cast ('<X>' + Replace(Strings, ' ', '</X><X>') + '</X>' AS XML) AS Data
FROM    @Table
) AS t CROSS APPLY Data.nodes ('/X') AS Split(a) )
,InputStrings AS
(

-- Split String coming in the parameter

select RTRIM(LTRIM(r.value('.','varchar(max)'))) as InputString
from @xml.nodes('//root/r') as records(r)
 )

-- Finally, compare the split strings word by word; a row is returned only if every input word was found

select 1 from (
SELECT COUNT(1) t1, (SELECT count(1) 
          FROM InputStrings) t2
FROM InputStrings
WHERE EXISTS (SELECT 1 
          FROM DBString
          WHERE InputStrings.InputString = DBString.Strings)
         ) TBF where tbf.t1=TBF.t2