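SQL query - long running / taking up CPU resources

Time: 2018-10-17 08:57:25

Tags: sql sql-server database database-performance sqlperformance

Hello, I have the SQL query below. It takes 40 minutes on average to run, and one of the tables it references has over 7 million records.

I have run it through the Database Engine Tuning Advisor and applied all of the recommendations. I have also assessed it in SQL Server's Activity Monitor, and no further indexes etc. were recommended.

Any suggestions would be great, thanks in advance.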

WITH CTE AS 
(
    SELECT r.Id AS ResultId,
    r.JobId,
    r.CandidateId,
    r.Email,
    CAST(0 AS BIT) AS EmailSent,
    NULL AS EmailSentDate,
    'PICKUP' AS EmailStatus,
    GETDATE() AS CreateDate,
    C.Id AS UserId,
    C.Email AS UserEmail,
    NULL AS Subject
    FROM Result R
    INNER JOIN Job J ON R.JobId = J.Id
    INNER JOIN [User] C ON J.UserId = C.Id -- USER is a reserved word in T-SQL, so the table name is bracketed
    WHERE 
    ISNULL(J.Approved, CAST(0 AS BIT)) = CAST(1 AS BIT)
    AND ISNULL(J.Closed, CAST(0 AS BIT)) = CAST(0 AS BIT)
    AND ISNULL(R.Email,'') <> '' -- has an email address
    AND ISNULL(R.EmailSent, CAST(0 AS BIT)) = CAST(0 AS BIT) -- email has not been sent
    AND R.EmailSentDate IS NULL -- email has not been sent
    AND ISNULL(R.EmailStatus,'') = '' -- email has not been sent
    AND ISNULL(R.IsEmailSubscribe, 'True') <> 'False' -- not unsubscribed
    -- not already been emailed for this job
    AND NOT EXISTS (
        SELECT SMTP.Email
        FROM SMTP_Production SMTP
        WHERE SMTP.JobId = R.JobId AND SMTP.CandidateId = R.CandidateId
    )
    -- not unsubscribed
    AND NOT EXISTS (

        SELECT u.Id FROM Unsubscribe u
        WHERE  ISNULL(u.EmailAddress, '') = ISNULL(R.Email, '')

    )
    AND NOT EXISTS (
        SELECT SMTP.Id FROM SMTP_Production SMTP
        WHERE SMTP.EmailStatus = 'PICKUP' AND SMTP.CandidateId = R.CandidateId
    )   
    AND C.Id NOT IN (
        -- list of ids
    )
    AND J.Id NOT IN (
        -- list of ids
    )
    AND J.ClientId NOT IN 
    (
        -- list of ids
    )
)
INSERT INTO smtp_production (ResultId, JobId, CandidateId, Email, EmailSent, EmailSentDate, EmailStatus, CreateDate, ConsultantId, ConsultantEmail, Subject)
OUTPUT INSERTED.ResultId,GETDATE() INTO ResultstoUpdate
SELECT 
    CTE.ResultId,
    CTE.JobId,
    CTE.CandidateId,
    CTE.Email,
    CTE.EmailSent,
    CTE.EmailSentDate,
    CTE.EmailStatus,
    CTE.CreateDate,
    CTE.UserId,
    CTE.UserEmail,
    NULL
FROM CTE
  INNER JOIN 
    (
        SELECT *, row_number() over(partition by CTE.Email, CTE.CandidateId order by CTE.EmailSentDate desc) as rn
        FROM CTE

    ) DCTE ON CTE.ResultId = DCTE.ResultId AND DCTE.rn = 1

Please see my updated query below:

WITH CTE AS 
(
    SELECT R.Id AS ResultId,
    r.JobId,
    r.CandidateId,
    R.Email,
    CAST(0 AS BIT) AS EmailSent,
    NULL AS EmailSentDate,
    'PICKUP' AS EmailStatus,
    GETDATE() AS CreateDate,
    C.Id AS UserId,
    C.Email AS UserEmail,
    NULL AS Subject
    FROM RESULTS R
    INNER JOIN JOB J ON R.JobId = J.Id
    INNER JOIN Consultant C ON J.UserId = C.Id
    WHERE 
    J.DCApproved = 1
    AND (J.Closed = 0 OR J.Closed IS NULL)
    AND (R.Email <> '' AND R.Email IS NOT NULL) -- has an email address; AND (not OR) matches the original ISNULL(R.Email,'') <> ''
    AND (R.EmailSent = 0 OR R.EmailSent IS NULL)
    AND R.EmailSentDate IS NULL -- email has not been sent
    AND (R.EmailStatus = '' OR R.EmailStatus IS NULL)
    AND (R.IsEmailSubscribe = 'True' OR R.IsEmailSubscribe IS NULL)
    -- not already been emailed for this job
    AND NOT EXISTS (
        SELECT SMTP.Email
        FROM SMTP_Production SMTP
        WHERE SMTP.JobId = R.JobId AND SMTP.CandidateId = R.CandidateId
    )
    -- not unsubscribed
    AND NOT EXISTS (

        SELECT u.Id FROM Unsubscribe u
        WHERE (u.EmailAddress = R.Email OR (u.EmailAddress IS NULL AND R.Email IS NULL))

    )
    AND NOT EXISTS (
        SELECT SMTP.Id FROM SMTP_Production SMTP
        WHERE SMTP.EmailStatus = 'PICKUP' AND SMTP.CandidateId = R.CandidateId
    )   
    AND C.Id NOT IN (
        -- LIST OF IDS
    )
    AND J.Id NOT IN (
        -- LIST OF IDS
    )
    AND J.ClientId NOT IN 
    (
        -- LIST OF IDS
    )
)

INSERT INTO smtp_production (ResultId, JobId, CandidateId, Email, EmailSent, EmailSentDate, EmailStatus, CreateDate, UserId, UserEmail, Subject)
OUTPUT INSERTED.ResultId,GETDATE() INTO ResultstoUpdate
SELECT 
    CTE.ResultId,
    CTE.JobId,
    CTE.CandidateId,
    CTE.Email,
    CTE.EmailSent,
    CTE.EmailSentDate,
    CTE.EmailStatus,
    CTE.CreateDate,
    CTE.UserId,
    CTE.UserEmail,
    NULL
FROM CTE
  INNER JOIN 
    (
        SELECT *, row_number() over(partition by CTE.Email, CTE.CandidateId order by CTE.EmailSentDate desc) as rn
        FROM CTE

    ) DCTE ON CTE.ResultId = DCTE.ResultId AND DCTE.rn = 1


GO

2 answers:

Answer 0 (score: 4)

The use of ISNULL in your WHERE and JOIN clauses is probably the main cause here. Applying a function to a column in a query makes the predicate non-SARGable, meaning SQL Server cannot seek on an index for that column and will typically scan the whole index or table instead. Note: applying functions to variables in the WHERE is normally fine, for example WHERE SomeColumn = DATEADD(DAY, @n, @SomeDate). Things like WHERE SomeColumn = ISNULL(@Variable,0) have the smell of a "catch-all query", so they can also affect performance, depending on your setup. That is not the discussion at hand, though.

Clauses like ISNULL(J.Closed, CAST(0 AS BIT)) = CAST(0 AS BIT) are a real headache for the query optimizer, and your query is riddled with them. You need to replace them with clauses like:

WHERE (J.Closed = 0 OR J.Closed IS NULL)

Although it makes no difference, there is no need to CAST the 0 there either. SQL Server can see that you are comparing against a bit, and so will interpret the 0 as one as well.

You also have an EXISTS with the WHERE clause ISNULL(u.EmailAddress, '') = ISNULL(R.Email, ''). This will need to become:

WHERE (u.EmailAddress = R.Email OR (u.EmailAddress IS NULL AND R.Email IS NULL))

You need to change all of the ISNULL usage in your WHERE clauses (the CTE and the subqueries), and you should see a decent performance increase.
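
The rewrite of the J.Closed predicate can be checked on a small scale. The following is a minimal sketch, assuming a throwaway temp table #Job with a nullable bit column Closed; the table, the index name, and the sample rows are made up for illustration and are not the asker's schema. Both predicates select the same rows (Ids 1 and 3 here). The difference is that the second one leaves the column bare, so an index on Closed is eligible for a seek.

```sql
-- Hypothetical demo table; not the asker's schema.
CREATE TABLE #Job (Id int PRIMARY KEY, Closed bit NULL);
CREATE INDEX IX_Job_Closed ON #Job (Closed);  -- illustrative index name

INSERT INTO #Job (Id, Closed)
VALUES (1, 0), (2, 1), (3, NULL);

-- Non-SARGable: the column is wrapped in a function call.
SELECT Id
FROM #Job
WHERE ISNULL(Closed, CAST(0 AS bit)) = CAST(0 AS bit);

-- SARGable equivalent: the column is compared directly.
SELECT Id
FROM #Job
WHERE (Closed = 0 OR Closed IS NULL);

DROP TABLE #Job;
```

On a table this small the optimizer may well scan in both cases, so the actual execution plans are only worth comparing on data of a realistic size.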

Answer 1 (score: 1)

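Generally, 7 million records are a joke for a modern database. If you are going to talk about problems, they should be problems with billions of rows, not 7 million.

That points to a problem with the query. High CPU is generally a sign of mismatched fields (comparing a string in one table to a number in another) or of too many function calls. A long running time is normally a sign of missing indexes or of non-SARGability, which you really are imposing here.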
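
To illustrate the mismatched-field case, here is a minimal sketch. It assumes a hypothetical temp table #Candidate whose CandidateRef column stores numbers as text; nothing in the question shows that the asker's tables actually have this problem. Because int has a higher data type precedence than varchar in SQL Server, comparing the varchar column to an int converts the column value on every row, which costs CPU and gets in the way of an index seek.

```sql
-- Hypothetical demo table; CandidateRef holds numeric values as text.
CREATE TABLE #Candidate (Id int PRIMARY KEY, CandidateRef varchar(20) NOT NULL);
CREATE INDEX IX_Candidate_Ref ON #Candidate (CandidateRef);  -- illustrative index name

INSERT INTO #Candidate (Id, CandidateRef)
VALUES (1, '100'), (2, '200');

-- Mismatched types: the varchar column is implicitly converted to int for each row.
SELECT Id
FROM #Candidate
WHERE CandidateRef = 100;

-- Matching types: the literal is a string, so the column is compared as stored.
SELECT Id
FROM #Candidate
WHERE CandidateRef = '100';

DROP TABLE #Candidate;
```

Both queries return the row with Id 1. In the first one, any stored value that cannot be converted to int would also raise a conversion error at run time.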

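Non-SARGability means that an index cannot be used. Your conditions are examples of this, for instance:

ISNULL(J.Approved, CAST(0 AS BIT)) = CAST(1 AS BIT)

ISNULL(field, value) means that the index on field is not usable for a seek: basically "goodbye index, hello table scan". On the other hand: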

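J.Approved = 1

has the same meaning, because a NULL that ISNULL turns into 0 can never equal 1, so no IS NULL branch is needed here; and it is SARGable. Pretty much all of your conditions are written in a non-SARGable way: welcome to DB hell. Start rewriting.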

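You may want to read more about SARGability at https://www.techopedia.com/definition/28838/sargeable

Also make sure you have indexes on all relevant foreign keys (and the referenced primary keys); otherwise you are, again, welcoming table scans.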
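
As a rough sketch of what such indexes might look like for the query in the question: the statements below assume the table and column names from the first version of the query (Result, Job, SMTP_Production, Unsubscribe) and must be run against that schema. The index names are invented, and the column choices are assumptions read off the joins and NOT EXISTS subqueries, not a tested recommendation. Some of these indexes may already exist, for example from the Tuning Advisor run.

```sql
-- Foreign key columns used in the joins (index names are illustrative).
CREATE INDEX IX_Result_JobId ON Result (JobId);
CREATE INDEX IX_Job_UserId ON Job (UserId);

-- Supports: NOT EXISTS (... WHERE SMTP.JobId = R.JobId AND SMTP.CandidateId = R.CandidateId)
CREATE INDEX IX_SMTP_Production_JobId_CandidateId
    ON SMTP_Production (JobId, CandidateId);

-- Supports: NOT EXISTS (... WHERE SMTP.EmailStatus = 'PICKUP' AND SMTP.CandidateId = R.CandidateId)
CREATE INDEX IX_SMTP_Production_CandidateId_EmailStatus
    ON SMTP_Production (CandidateId, EmailStatus);

-- Supports the Unsubscribe lookup once it compares u.EmailAddress directly.
CREATE INDEX IX_Unsubscribe_EmailAddress ON Unsubscribe (EmailAddress);
```

These only help once the predicates are SARGable. While the columns are still wrapped in ISNULL, the optimizer cannot seek on them.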