How can I compare strings in T-SQL to check whether they contain the same symbols?
For example:
'aaabbcd' vs 'ddbca' (TRUE): both strings contain the same symbols
'abcddd' vs 'cda' (FALSE): the strings do not contain the same symbols
Answer 0 (score: 3)
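If performance matters, I would suggest a purely set-based solution using Ngrams8k. This is a user-defined function (not built into SQL Server) that splits a string into tokens of a given length; with a token length of 1 it returns one row per character.
This returns the correct answer:
DECLARE @string1 varchar(100) = 'aaabbcd', @string2 varchar(100) = 'ddbca'; -- sample values from the question
SELECT AllSame = COALESCE(MAX(0),1)
FROM dbo.ngrams8k(@string1, 1) ng1
FULL JOIN dbo.ngrams8k(@string2, 1) ng2 ON ng1.token = ng2.token
WHERE ng1.token IS NULL OR ng2.token IS NULL;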
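To make the FULL JOIN logic easier to follow, here is a minimal sketch that runs without Ngrams8k installed. It assumes a small inline numbers CTE (nums, c1 and c2 are names made up for this illustration) as a stand-in for dbo.ngrams8k(@s, 1), so it is not the function's actual implementation. Rows that survive the WHERE clause are characters that occur in only one of the two strings; if there are none, MAX(0) is NULL and COALESCE turns that into 1.

```sql
DECLARE @string1 varchar(100) = 'abcddd',
        @string2 varchar(100) = 'cda';
DECLARE @unmatched TABLE (only_in_string1 char(1) NULL, only_in_string2 char(1) NULL);

WITH nums AS
(
    -- Positions 1..100, enough for the varchar(100) inputs (assumed stand-in for ngrams8k)
    SELECT TOP (100) n = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns
),
c1 AS (SELECT DISTINCT token = SUBSTRING(@string1, n, 1) FROM nums WHERE n <= LEN(@string1)),
c2 AS (SELECT DISTINCT token = SUBSTRING(@string2, n, 1) FROM nums WHERE n <= LEN(@string2))
INSERT @unmatched (only_in_string1, only_in_string2)
SELECT c1.token, c2.token
FROM c1
FULL JOIN c2 ON c1.token = c2.token
WHERE c1.token IS NULL OR c2.token IS NULL;   -- keep only characters without a partner

-- Characters found in only one string: here a single row ('b', NULL)
SELECT only_in_string1, only_in_string2 FROM @unmatched;

-- No unmatched rows -> MAX(0) is NULL -> 1; otherwise 0 (0 for these inputs)
SELECT AllSame = COALESCE(MAX(0), 1) FROM @unmatched;
```

Swapping in 'aaabbcd' and 'ddbca' leaves @unmatched empty, so AllSame comes back as 1.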
To apply this logic to a table, you can use CROSS APPLY like this:
-- Sample data
DECLARE @table TABLE (string1 varchar(100), string2 varchar(100));
INSERT @table VALUES ('aaabbcd','ddbca'),('abcddd','cda');
-- Solution using CROSS APPLY
SELECT *
FROM @table t
CROSS APPLY
(
SELECT AllSame = COALESCE(MAX(0),1)
FROM dbo.ngrams8k(t.string1, 1) ng1
FULL JOIN dbo.ngrams8k(t.string2, 1) ng2 ON ng1.token = ng2.token
WHERE ng1.token IS NULL OR ng2.token IS NULL
) x;
Results:
string1 string2 AllSame
--------- --------- --------
aaabbcd ddbca 1
abcddd cda 0
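Not only is this the fastest solution posted so far, it also gets the job done with very little code.
Update: added a performance comparison with Martin Smith's solution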
-- sample data
IF OBJECT_ID('tempdb..#sample') IS NOT NULL DROP TABLE #sample;
SELECT TOP (10000)
string1 = replicate('a',abs(checksum(newid())%5))+replicate('b',abs(checksum(newid())%4))+
replicate('c',abs(checksum(newid())%5))+replicate('d',abs(checksum(newid())%4))+
replicate('e',abs(checksum(newid())%5))+replicate('f',abs(checksum(newid())%4)),
string2 = replicate('a',abs(checksum(newid())%5))+replicate('b',abs(checksum(newid())%4))+
replicate('c',abs(checksum(newid())%5))+replicate('d',abs(checksum(newid())%4))+
replicate('e',abs(checksum(newid())%5))+replicate('f',abs(checksum(newid())%4))
INTO #sample
FROM sys.all_columns a, sys.all_columns b;
SET NOCOUNT ON;
SET STATISTICS TIME ON;
PRINT 'ajb serial'+char(10)+replicate('-',50);
SELECT flag
FROM #sample t
CROSS APPLY
(
SELECT Flag = COALESCE(MAX(0),1)
FROM dbo.ngrams8k(t.string1, 1) ng1
FULL JOIN dbo.ngrams8k(t.string2, 1) ng2 ON ng1.token = ng2.token
WHERE ng1.token IS NULL OR ng2.token IS NULL
) x
OPTION (MAXDOP 1);
PRINT 'ajb parallel'+char(10)+replicate('-',50);
SELECT flag
FROM #sample t
CROSS APPLY
(
SELECT Flag = COALESCE(MAX(0),1)
FROM dbo.ngrams8k(t.string1, 1) ng1
FULL JOIN dbo.ngrams8k(t.string2, 1) ng2 ON ng1.token = ng2.token
WHERE ng1.token IS NULL OR ng2.token IS NULL
) x
OPTION (querytraceon 8649);
PRINT 'M Smith - serial'+char(10)+replicate('-',50);
WITH Nums AS
(
SELECT TOP (100) ROW_NUMBER() OVER ( ORDER BY (SELECT NULL)) number
FROM sys.all_columns
)
SELECT flag
FROM #sample T
CROSS APPLY (SELECT CASE WHEN Min(Cnt) = 2 THEN 1 ELSE 0 END AS Flag
FROM (SELECT Count(*) AS Cnt
FROM (SELECT 1 AS s,
Substring(t.string1, N1.number, 1) AS c
FROM Nums N1
WHERE N1.number <= Len(t.string1)
UNION
SELECT 2 AS s,
Substring(t.string2, N2.number, 1) AS c
FROM Nums N2
WHERE N2.number <= Len(t.string2)) D1
GROUP BY c) D2
) Ca
OPTION (MAXDOP 1);
SET STATISTICS TIME OFF;
Results:
ajb serial
--------------------------------------------------
SQL Server Execution Times:
CPU time = 656 ms, **elapsed time = 660 ms**.
ajb parallel
--------------------------------------------------
SQL Server Execution Times:
CPU time = 1281 ms, **elapsed time = 204 ms**.
M Smith serial
--------------------------------------------------
SQL Server Execution Times:
CPU time = 1390 ms, **elapsed time = 1393 ms**.
Note that I did not test Martin's solution with a parallel plan, because that query cannot run in parallel.
Answer 1 (score: 2)
An inline approach.
This uses a numbers table:
CREATE TABLE dbo.Numbers (number INT PRIMARY KEY);
INSERT INTO dbo.Numbers
SELECT TOP 8000 ROW_NUMBER() OVER (ORDER BY @@SPID)
FROM sys.all_columns c1,
sys.all_columns c2
If you do not want to use a numbers table, a version without one, which performs less well, is available in the edit history.
WITH T(S1, S2)
AS (SELECT 'aaabbcd',
'ddbca'
UNION ALL
SELECT 'abcddd',
'cda')
SELECT *
FROM T
CROSS APPLY (SELECT CASE WHEN Min(Cnt) = 2 THEN 1 ELSE 0 END AS Flag
FROM (SELECT Count(*) AS Cnt
FROM (SELECT 1 AS s,
Substring(S1, N1.number, 1) AS c
FROM dbo.Numbers N1
WHERE N1.number <= Len(S1)
UNION
SELECT 2 AS s,
Substring(S2, N2.number, 1) AS c
FROM dbo.Numbers N2
WHERE N2.number <= Len(S2)) D1
GROUP BY c) D2
) Ca
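To see why Min(Cnt) = 2 is the right test, it can help to look at the intermediate counts for a single pair. The sketch below assumes an inline Nums CTE instead of dbo.Numbers and uses a table variable named @counts, both introduced here only for illustration. UNION removes duplicate (s, c) pairs, so each character ends up with a count of 1 (present in one string only) or 2 (present in both).

```sql
DECLARE @S1 varchar(100) = 'abcddd',
        @S2 varchar(100) = 'cda';
DECLARE @counts TABLE (c char(1), Cnt int);

WITH Nums AS
(
    -- Inline replacement for dbo.Numbers, limited to 100 positions (assumption)
    SELECT TOP (100) number = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns
)
INSERT @counts (c, Cnt)
SELECT c, COUNT(*)
FROM (SELECT 1 AS s, SUBSTRING(@S1, number, 1) AS c
      FROM Nums
      WHERE number <= LEN(@S1)
      UNION            -- de-duplicates (s, c), so repeated letters count once per string
      SELECT 2 AS s, SUBSTRING(@S2, number, 1) AS c
      FROM Nums
      WHERE number <= LEN(@S2)) D1
GROUP BY c;

-- Per-character counts; for these inputs: a = 2, b = 1, c = 2, d = 2
SELECT c, Cnt FROM @counts ORDER BY c;

-- Every count must be 2 for the strings to share the same symbols (0 here, because of 'b')
SELECT Flag = CASE WHEN MIN(Cnt) = 2 THEN 1 ELSE 0 END FROM @counts;
```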
Answer 2 (score: 1)
You can use a LIKE pattern of the form '%your-search-string%' to find strings that contain a given substring.
SELECT * FROM TableName
WHERE Name LIKE '%searchText%'
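Keep in mind that LIKE tests for a contiguous substring, which is not the same as sharing the same set of symbols. A small sketch with the strings from the question (the variable names are made up for this illustration) shows the difference:

```sql
DECLARE @string1 varchar(100) = 'aaabbcd',
        @string2 varchar(100) = 'ddbca';
DECLARE @sameSetPair bit, @substringPair bit;

-- Same symbols (a, b, c, d) in both strings, but 'ddbca' is not a substring of 'aaabbcd'
SET @sameSetPair = CASE WHEN @string1 LIKE '%' + @string2 + '%' THEN 1 ELSE 0 END;

-- 'cd' is a substring of 'abcddd', although the two strings do not share the same symbols
SET @substringPair = CASE WHEN 'abcddd' LIKE '%cd%' THEN 1 ELSE 0 END;

SELECT SameSetPair = @sameSetPair, SubstringPair = @substringPair;
```

So LIKE answers the question "does one string appear inside the other", not the question asked here.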
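You can use a stored procedure to check the characters of a string. It sets @IsMatching to 1 when every character of @originalString occurs somewhere in @stringToBeChecked; the check runs in that direction only.
CREATE PROCEDURE IsStringMatching
(
    @originalString NVARCHAR(32),
    @stringToBeChecked NVARCHAR(32),
    @IsMatching BIT OUTPUT
)
AS
BEGIN
    DECLARE @inputStringCount INT = LEN(@originalString);
    DECLARE @loopCount INT = 0, @temp INT;
    DECLARE @char NCHAR(1); -- NCHAR(1) matches the NVARCHAR inputs
    SET @IsMatching = 1;
    -- Walk through @originalString one character at a time
    WHILE @loopCount < @inputStringCount
    BEGIN
        SET @char = SUBSTRING(@originalString, @loopCount + 1, 1);
        SET @temp = CHARINDEX(@char, @stringToBeChecked, 1);
        -- Stop at the first character that is missing from @stringToBeChecked
        IF (@temp = 0)
        BEGIN
            SET @IsMatching = 0;
            BREAK;
        END
        SET @loopCount = @loopCount + 1;
    END;
END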
You can verify it like this:
DECLARE @IsMatching BIT;
EXECUTE IsStringMatching N'aaabbcd', N'ABC', @IsMatching OUTPUT;
SELECT @IsMatching;
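Because the procedure only checks that the characters of the first argument occur in the second, one way to answer the original question is to call it in both directions and combine the results. This is a sketch that assumes the IsStringMatching procedure above has already been created; the variable names are made up for this illustration.

```sql
-- Assumes the IsStringMatching procedure from this answer already exists
DECLARE @s1 NVARCHAR(32) = N'abcddd',
        @s2 NVARCHAR(32) = N'cda',
        @forward BIT,
        @backward BIT;

-- Are all characters of @s1 present in @s2? (0 here: 'b' is missing from 'cda')
EXECUTE IsStringMatching @s1, @s2, @forward OUTPUT;

-- Are all characters of @s2 present in @s1? (1 here)
EXECUTE IsStringMatching @s2, @s1, @backward OUTPUT;

-- The strings share the same symbols only if both directions succeed
SELECT SameSymbols = CASE WHEN @forward = 1 AND @backward = 1 THEN 1 ELSE 0 END;
```

Note that a loop-based procedure called twice per pair is likely to be slower on large tables than the set-based answers above.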