I have columns named "EntityName" and "entityid".
Entityid EntityName
1234 ABC inch EFG inch
3456 inch* aaa inch vvv
Can anyone give me a query to find rows containing this kind of repeated word?
Answer 0: (score: 2)
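If you are using SQL Server 2017, you can use STRING_SPLIT (the function itself is available from SQL Server 2016 onward).
Try the following query: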
-- Sample data
CREATE TABLE #TestData(Entityid int, Situation varchar(100))
INSERT #TestData(Entityid, Situation) VALUES
(1234,'ABC inch EFG inch'),
(3456,'inch aaa inch vvv'),
(7890,'BBBB aaa inch vvv')
-- Return rows whose text contains at least one word that occurs more than once
SELECT *
FROM #TestData d
WHERE EXISTS (
    SELECT value
    FROM STRING_SPLIT(d.Situation, ' ')
    WHERE value <> N''   -- skip empty tokens produced by consecutive spaces
    GROUP BY value
    HAVING COUNT(*) > 1
)
DROP TABLE #TestData
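To see what the EXISTS subquery evaluates for each row, it can help to list the repeated words directly. The following is only a sketch: it uses a table variable named @TestData (a stand-in for the temp table above, not part of the original answer) and CROSS APPLY to split each row into words.

```sql
-- Hypothetical sample data mirroring the answer's #TestData
DECLARE @TestData TABLE (Entityid int, Situation varchar(100));
INSERT @TestData (Entityid, Situation) VALUES
(1234, 'ABC inch EFG inch'),
(7890, 'BBBB aaa inch vvv');

-- One group per (Entityid, word); HAVING keeps only words seen more than once
SELECT d.Entityid, s.value AS Word, COUNT(*) AS Occurrences
FROM @TestData d
CROSS APPLY STRING_SPLIT(d.Situation, ' ') s
WHERE s.value <> N''
GROUP BY d.Entityid, s.value
HAVING COUNT(*) > 1;
```

With this sample data only row 1234 has a repeated word ('inch', twice), so it is the only Entityid in the output.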
You can also show the counts:
CREATE TABLE #TestData(Entityid int,Situation varchar(100))
INSERT #TestData(Entityid,Situation)VALUES
(1234,'ABC inch EFG inch'),
(3456,'inch aaa inch vvv aaa aaa'),
(7890,'BBBB aaa inch vvv')
-- For each matching row, list every repeated word together with its count
SELECT
    *,
    (
        SELECT STRING_AGG(CONCAT(value, '*', cnt), ', ')
        FROM
        (
            SELECT value, COUNT(*) cnt
            FROM STRING_SPLIT(d.Situation, ' ')
            WHERE value <> N''
            GROUP BY value
            HAVING COUNT(*) > 1
        ) q
    ) DuplicatedWords
FROM #TestData d
WHERE EXISTS(SELECT value FROM STRING_SPLIT(d.Situation,' ') WHERE value<>N'' GROUP BY value HAVING COUNT(*)>1)
DROP TABLE #TestData
Result:
Entityid  Situation                  DuplicatedWords
1234      ABC inch EFG inch          inch*2
3456      inch aaa inch vvv aaa aaa  aaa*3, inch*2
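STRING_AGG does not guarantee the order in which it concatenates values unless you ask for one. If the order of the words in DuplicatedWords matters, a WITHIN GROUP clause can pin it down. This is a minimal sketch on a single string; the variable @Situation is only an illustration.

```sql
-- Hypothetical input: the text of row 3456 from the sample data
DECLARE @Situation varchar(100) = 'inch aaa inch vvv aaa aaa';

-- Same aggregation as above, with an explicit alphabetical ordering of the words
SELECT STRING_AGG(CONCAT(value, '*', cnt), ', ')
           WITHIN GROUP (ORDER BY value) AS DuplicatedWords
FROM (
    SELECT value, COUNT(*) AS cnt
    FROM STRING_SPLIT(@Situation, ' ')
    WHERE value <> N''
    GROUP BY value
    HAVING COUNT(*) > 1
) q;
-- Returns: aaa*3, inch*2
```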
Answer 1: (score: 2)
You can try the following:
DECLARE @DataSource TABLE
(
[EntityID] INT
,[Situation] VARCHAR(MAX)
);
INSERT INTO @DataSource ([EntityID], [Situation])
VALUES (1234, 'ABC inch EFG inch')
,(3456, 'inch aaa inch vvv')
,(1, 'only one inch');
DECLARE @Search VARCHAR(12) = 'inch';
SELECT *
FROM @DataSource
WHERE CHARINDEX(@Search, [Situation]) > 0 -- the word occurs at least once
    AND CHARINDEX(@Search, STUFF([Situation], CHARINDEX(@Search, [Situation]), LEN(@Search), '')) > 0; -- and still occurs once the first occurrence is removed
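The idea is to check whether your word occurs in the text, then remove that first occurrence (with STUFF) and check whether it still occurs.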
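The two checks can be traced on a single value. This sketch reuses @Search from the query above and adds a variable @Situation purely for illustration.

```sql
DECLARE @Search VARCHAR(12) = 'inch';
DECLARE @Situation VARCHAR(MAX) = 'ABC inch EFG inch'; -- hypothetical single input value

SELECT
    -- Position of the first occurrence (0 if the word is absent)
    CHARINDEX(@Search, @Situation) AS FirstPos,
    -- The text with the first occurrence cut out
    STUFF(@Situation, CHARINDEX(@Search, @Situation), LEN(@Search), '') AS WithoutFirst,
    -- Position of the word in the shortened text; > 0 means it occurred at least twice
    CHARINDEX(@Search, STUFF(@Situation, CHARINDEX(@Search, @Situation), LEN(@Search), '')) AS SecondPos;
-- For this input: FirstPos = 5, SecondPos = 10
```

When the word is absent, CHARINDEX returns 0, STUFF then returns NULL, and the second comparison evaluates to unknown, so the row is filtered out either way.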
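Of course, this is very simple matching: it is a substring search, so it also matches the search term inside a longer word. If you implement SQL CLR functions to get regular expression support in the T-SQL context, you can add more complex conditions.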
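If CLR is not an option, one possible middle ground is to combine the two answers and count whole-word matches with STRING_SPLIT. This is only a sketch, assuming words are separated by single spaces and SQL Server 2016 or later; the row with Entityid 2 is made-up data to show the difference from substring matching.

```sql
DECLARE @DataSource TABLE ([EntityID] INT, [Situation] VARCHAR(MAX));
INSERT INTO @DataSource ([EntityID], [Situation])
VALUES (1234, 'ABC inch EFG inch')
      ,(2, 'a pinch of inch');   -- hypothetical row: 'inch' also appears inside 'pinch'

DECLARE @Search VARCHAR(12) = 'inch';

-- Keep rows where the search term appears as a whole word more than once
SELECT *
FROM @DataSource d
WHERE (
    SELECT COUNT(*)
    FROM STRING_SPLIT(d.[Situation], ' ')
    WHERE value = @Search
) > 1;
```

Here only row 1234 is returned. The CHARINDEX/STUFF query would also return row 2, because it finds 'inch' inside 'pinch'.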