Question

I have a query similar to this:

SELECT  CustomerId
FROM    Customer cust
WHERE   'SoseJost75G' LIKE cust.ClientCustomerId + '%' -- ClientCustomerId is SoseJost

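The gist of this is that I get a ClientCustomerId value from my client, but with an unknown number of extra characters appended to it.

So in my example, the client gives me SoseJost75G, but my database only has SoseJost (without the 75G on the end).

My query works, but it takes over a minute to run. That is because it cannot use the index on ClientCustomerId.

Does anyone know a way to improve the performance of this kind of query?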

Answer 1

You could try something like this:

DECLARE @var VARCHAR(100)='SoseJost75G';

WITH pre_selected AS
(SELECT * FROM Customer WHERE ClientCustomerId  LIKE LEFT(@var,6) + '%')
SELECT * 
FROM pre_selected WHERE @var LIKE ClientCustomerId +'%';

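Using LIKE with a fixed-start search will use the existing index on ClientCustomerId. Note that LEFT(@var,6) assumes every ClientCustomerId is at least six characters long; shorter IDs would be filtered out by the first step.

With a CTE you can never know exactly which order of execution will be chosen, but in some quick tests the optimizer first reduced the set to a small remainder and then performed the heavy search as a second step.

If the execution order is not what you expect, you could insert the result of the first CTE query into a declared table variable (only the column with the ID) and then continue with this small table...

Something like this: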

DECLARE @var VARCHAR(100)='SoseJost75G';

DECLARE @CustIDs TABLE(ClientCustomerID VARCHAR(100));
INSERT INTO @CustIDs(ClientCustomerID)
SELECT ClientCustomerID FROM Customer WHERE ClientCustomerId LIKE LEFT(@var,6) + '%';

--Use this with an IN-clause then
SELECT ClientCustomerId 
FROM @CustIDs WHERE @var LIKE ClientCustomerID +'%'
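To check whether the prefilter actually helps, one option is to compare the logical reads of both variants with SET STATISTICS IO. The following is only a sketch under stated assumptions: the temp table #Customer, its index name, and the sample rows are made up for illustration, and on a table this small the optimizer may well choose a scan for both queries. Against the real Customer table the difference should be visible in the Messages tab and in the execution plan.

```sql
-- Hypothetical demo table; column names mirror the question, the data is invented.
CREATE TABLE #Customer
(
    CustomerId       int IDENTITY(1,1) PRIMARY KEY,
    ClientCustomerId varchar(100) NOT NULL
);
CREATE INDEX IX_Customer_ClientCustomerId ON #Customer (ClientCustomerId);

INSERT INTO #Customer (ClientCustomerId)
VALUES ('SoseJost'), ('SoseJoxx'), ('OtherCust');

DECLARE @var varchar(100) = 'SoseJost75G';

SET STATISTICS IO ON;

-- Original predicate: the column sits inside the LIKE pattern,
-- so every row has to be evaluated.
SELECT CustomerId
FROM   #Customer
WHERE  @var LIKE ClientCustomerId + '%';

-- Prefiltered variant: the first predicate can be answered from the index,
-- the second one only has to be checked on the rows that survive.
SELECT CustomerId
FROM   #Customer
WHERE  ClientCustomerId LIKE LEFT(@var, 6) + '%'
  AND  @var LIKE ClientCustomerId + '%';

SET STATISTICS IO OFF;

DROP TABLE #Customer;
```

Both statements should return the row for SoseJost; SoseJoxx passes the prefilter but fails the second predicate.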

Answer 2

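So, a query that checks for an exact value is super fast (index seek). Because of that, I am going to try running a bunch of separate SELECT statements until I find a match.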

DECLARE @customerIdSubstring varchar(255) = 'SoseJost75G'
DECLARE @customerIdSubstringLength INT
DECLARE @results TABLE 
(
    CustomerId varchar(255)
)


DECLARE @FoundResults BIT = 0;

WHILE (@FoundResults = 0)
BEGIN 

    INSERT INTO @results (CustomerId)
    SELECT  CustomerId
    FROM    Customer cust
    WHERE   CustomerId = @customerIdSubstring 


    SELECT @FoundResults = CASE 
                               WHEN EXISTS(SELECT * FROM @results) THEN CAST(1 AS BIT)
                               ELSE CAST(0 AS BIT)
                           END

    SET @customerIdSubstringLength = LEN(@customerIdSubstring)

    -- We don't want to match on fewer than 3 chars.  (May not be correct at that point.)
    IF (@customerIdSubstringLength < 3)
        BREAK;

    SET @customerIdSubstring = LEFT(@customerIdSubstring, @customerIdSubstringLength - 1)
END 

SELECT CustomerId
FROM @results

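I may end up running the query several times, though. In practice it will be 3-6 times per value. I figure 3-6 index seeks beat 1 seek and 1 scan.

This also has the added benefit of returning only the most "LIKE" row. (Meaning that if there is a row for SanJost, a row for SanJos will not be returned.)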

Answer 3

If you can specify a minimum ClientCustomerId length, e.g. it is never shorter than four characters, you can limit the results:

WHERE ClientCustomerId like left('SoseJost75G', 4) + '%'

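Here the index can be used to get the matching records. Your criteria

AND ClientCustomerId <= 'SoseJost75G' AND 'SoseJost75G' LIKE ClientCustomerId + '%'

then only has to be checked against the records already found.

The complete query: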

SELECT CustomerId
FROM Customer cust
WHERE ClientCustomerId like left('SoseJost75G', 4) + '%'
AND ClientCustomerId <= 'SoseJost75G' AND 'SoseJost75G' LIKE ClientCustomerId + '%';

BTW: your criteria could also be written as

ClientCustomerId = LEFT('SoseJost75G', LEN(ClientCustomerId))

but I don't think this is any faster than your version.

Answer 4

I like your approach, Vaccano. I just simplified it a bit, in case you are interested:

DECLARE @customerIdSubstring varchar(255) = 'SoseJost75G'
DECLARE @results TABLE 
(
    CustomerId varchar(255)
)

DECLARE @FoundResults BIT = 0
DECLARE @customerIdSubstringLength INT = LEN(@customerIdSubstring)

WHILE (@FoundResults = 0 AND @customerIdSubstringLength >= 3)
BEGIN 
    INSERT INTO @results
    SELECT  CustomerId
    FROM    Customer
    WHERE   CustomerId = @customerIdSubstring

    -- Make @FoundResults = 1 if there's at least one record
    SELECT TOP 1 @FoundResults = 1 FROM @results

    SET @customerIdSubstringLength = @customerIdSubstringLength - 1
    SET @customerIdSubstring = LEFT(@customerIdSubstring, @customerIdSubstringLength)
END

SELECT CustomerId
FROM @results

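If you are completely sure that only one ID will match, you can simplify this further by removing the results table, since the result will only have one row. I also removed the assignment of @customerIdSubstring inside the loop: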

DECLARE @customerIdSubstring varchar(255) = 'SoseJost75G'
DECLARE @customerIdFound varchar(255)

DECLARE @customerIdSubstringLength INT = LEN(@customerIdSubstring)

WHILE (@customerIdFound IS NULL AND @customerIdSubstringLength >= 3)
BEGIN 
    SELECT  @customerIdFound = CustomerId
    FROM    Customer
    WHERE   CustomerId = LEFT(@customerIdSubstring, @customerIdSubstringLength)

    SET @customerIdSubstringLength = @customerIdSubstringLength - 1
END

SELECT @customerIdFound
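The same longest-prefix idea can also be sketched as a single set-based statement instead of a loop. This is not part of the original approach, just an illustration: it generates every prefix length of the input with a recursive CTE and joins on equality, so each candidate prefix can be resolved with a seek. The names @input, @minLen and PrefixLengths are hypothetical, and the join uses CustomerId only to mirror the loops above.

```sql
DECLARE @input  varchar(255) = 'SoseJost75G';
DECLARE @minLen int = 3;  -- assumption: same lower bound as the loops above

WITH PrefixLengths AS
(
    -- Anchor: the full input length (skipped if the input is too short).
    SELECT LEN(@input) AS PrefixLen
    WHERE  LEN(@input) >= @minLen

    UNION ALL

    -- Recursive step: one character shorter each time, down to @minLen.
    SELECT PrefixLen - 1
    FROM   PrefixLengths
    WHERE  PrefixLen > @minLen
)
-- Keep only the rows that match the longest prefix found.
SELECT TOP (1) WITH TIES c.CustomerId
FROM   PrefixLengths AS p
JOIN   Customer AS c
       ON c.CustomerId = LEFT(@input, p.PrefixLen)
ORDER  BY p.PrefixLen DESC
OPTION (MAXRECURSION 255);  -- the default limit of 100 is too low for a 255-character input
```

Whether this beats the loop depends on the plan the optimizer picks, so it is worth comparing both against the real data.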

Answer 5

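Basically, there is nothing wrong with your statement, provided you can write it as a sargable query.

SARG = Search ARGument

A sargable query allows the optimizer to use an index, whereas for a non-sargable query the optimizer has to scan all rows in the table, even if an index is available.

LIKE with % at the end is sargable. LIKE with % at the beginning is not sargable. Applying a function to the column in the WHERE clause, such as LEFT([Column], 4), makes the query non-sargable. At least that is what the documentation on SARGs says.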

[COLUMN] LIKE 'abc%' -> sargable
[COLUMN] LIKE '%abc' -> not sargable
LEFT([COLUMN], 4) = 'ABCD' -> not sargable
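To make the distinction concrete, here is a small sketch, assuming the Customer table from the question with an index on ClientCustomerId. The literal 'Sose' is only an example value. Both predicates should return the same rows, but only the second leaves the column bare so that the index can be used for a seek.

```sql
-- Not sargable: the function wraps the column, so its value has to be
-- computed for every row before it can be compared.
SELECT CustomerId
FROM   Customer
WHERE  LEFT(ClientCustomerId, 4) = 'Sose';

-- Sargable: the column stands alone and the pattern has a fixed start,
-- so the predicate can be turned into a range seek on the index.
SELECT CustomerId
FROM   Customer
WHERE  ClientCustomerId LIKE 'Sose%';
```

Comparing the execution plans of the two statements is the easiest way to confirm which one seeks and which one scans.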

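I think you should redesign the process before running any queries. Set up a proper ETL process to separate the ID and the suffix. Store that data in separate columns and configure indexes as needed. Then run your queries against the transformed data.

This is the preferred process, because you do not know what data you are going to get.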

Improving the performance of a "reverse" LIKE query
