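Hello, I have the SQL query below, which takes 40 minutes to run on average. One of the tables it references has over 7 million records.
I have run it through the Database Engine Tuning Advisor and applied all of its recommendations. I have also assessed it in SQL Server's Activity Monitor, and no further indexes etc. were recommended.
Any suggestions would be great. Thanks in advance.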
WITH CTE AS
(
    SELECT r.Id AS ResultId,
           r.JobId,
           r.CandidateId,
           r.Email,
           CAST(0 AS BIT) AS EmailSent,
           NULL AS EmailSentDate,
           'PICKUP' AS EmailStatus,
           GETDATE() AS CreateDate,
           C.Id AS UserId,
           C.Email AS UserEmail,
           NULL AS Subject
    FROM Result R
    INNER JOIN Job J ON R.JobId = J.Id
    INNER JOIN User C ON J.UserId = C.Id
    WHERE
        ISNULL(J.Approved, CAST(0 AS BIT)) = CAST(1 AS BIT)
        AND ISNULL(J.Closed, CAST(0 AS BIT)) = CAST(0 AS BIT)
        AND ISNULL(R.Email,'') <> '' -- has an email address
        AND ISNULL(R.EmailSent, CAST(0 AS BIT)) = CAST(0 AS BIT) -- email has not been sent
        AND R.EmailSentDate IS NULL -- email has not been sent
        AND ISNULL(R.EmailStatus,'') = '' -- email has not been sent
        AND ISNULL(R.IsEmailSubscribe, 'True') <> 'False' -- not unsubscribed
        -- not already been emailed for this job
        AND NOT EXISTS (
            SELECT SMTP.Email
            FROM SMTP_Production SMTP
            WHERE SMTP.JobId = R.JobId AND SMTP.CandidateId = R.CandidateId
        )
        -- not unsubscribed
        AND NOT EXISTS (
            SELECT u.Id FROM Unsubscribe u
            WHERE ISNULL(u.EmailAddress, '') = ISNULL(R.Email, '')
        )
        AND NOT EXISTS (
            SELECT SMTP.Id FROM SMTP_Production SMTP
            WHERE SMTP.EmailStatus = 'PICKUP' AND SMTP.CandidateId = R.CandidateId
        )
        AND C.Id NOT IN (
            -- list of ids
        )
        AND J.Id NOT IN (
            -- list of ids
        )
        AND J.ClientId NOT IN
        (
            -- list of ids
        )
)
INSERT INTO smtp_production (ResultId, JobId, CandidateId, Email, EmailSent, EmailSentDate, EmailStatus, CreateDate, ConsultantId, ConsultantEmail, Subject)
OUTPUT INSERTED.ResultId,GETDATE() INTO ResultstoUpdate
SELECT
    CTE.ResultId,
    CTE.JobId,
    CTE.CandidateId,
    CTE.Email,
    CTE.EmailSent,
    CTE.EmailSentDate,
    CTE.EmailStatus,
    CTE.CreateDate,
    CTE.UserId,
    CTE.UserEmail,
    NULL
FROM CTE
INNER JOIN
(
    SELECT *, row_number() over(partition by CTE.Email, CTE.CandidateId order by CTE.EmailSentDate desc) as rn
    FROM CTE
) DCTE ON CTE.ResultId = DCTE.ResultId AND DCTE.rn = 1
Please see my updated query below:
WITH CTE AS
(
    SELECT R.Id AS ResultId,
           r.JobId,
           r.CandidateId,
           R.Email,
           CAST(0 AS BIT) AS EmailSent,
           NULL AS EmailSentDate,
           'PICKUP' AS EmailStatus,
           GETDATE() AS CreateDate,
           C.Id AS UserId,
           C.Email AS UserEmail,
           NULL AS Subject
    FROM RESULTS R
    INNER JOIN JOB J ON R.JobId = J.Id
    INNER JOIN Consultant C ON J.UserId = C.Id
    WHERE
        J.DCApproved = 1
        AND (J.Closed = 0 OR J.Closed IS NULL)
        AND R.Email <> '' -- has an email address; a NULL fails this comparison too, so no IS NOT NULL branch is needed
        AND (R.EmailSent = 0 OR R.EmailSent IS NULL)
        AND R.EmailSentDate IS NULL -- email has not been sent
        AND (R.EmailStatus = '' OR R.EmailStatus IS NULL)
        AND (R.IsEmailSubscribe = 'True' OR R.IsEmailSubscribe IS NULL)
        -- not already been emailed for this job
        AND NOT EXISTS (
            SELECT SMTP.Email
            FROM SMTP_Production SMTP
            WHERE SMTP.JobId = R.JobId AND SMTP.CandidateId = R.CandidateId
        )
        -- not unsubscribed
        AND NOT EXISTS (
            SELECT u.Id FROM Unsubscribe u
            WHERE (u.EmailAddress = R.Email OR (u.EmailAddress IS NULL AND R.Email IS NULL))
        )
        AND NOT EXISTS (
            SELECT SMTP.Id FROM SMTP_Production SMTP
            WHERE SMTP.EmailStatus = 'PICKUP' AND SMTP.CandidateId = R.CandidateId
        )
        AND C.Id NOT IN (
            -- LIST OF IDS
        )
        AND J.Id NOT IN (
            -- LIST OF IDS
        )
        AND J.ClientId NOT IN
        (
            -- LIST OF IDS
        )
)
INSERT INTO smtp_production (ResultId, JobId, CandidateId, Email, EmailSent, EmailSentDate, EmailStatus, CreateDate, UserId, UserEmail, Subject)
OUTPUT INSERTED.ResultId,GETDATE() INTO ResultstoUpdate
SELECT
    CTE.ResultId,
    CTE.JobId,
    CTE.CandidateId,
    CTE.Email,
    CTE.EmailSent,
    CTE.EmailSentDate,
    CTE.EmailStatus,
    CTE.CreateDate,
    CTE.UserId,
    CTE.UserEmail,
    NULL
FROM CTE
INNER JOIN
(
    SELECT *, row_number() over(partition by CTE.Email, CTE.CandidateId order by CTE.EmailSentDate desc) as rn
    FROM CTE
) DCTE ON CTE.ResultId = DCTE.ResultId AND DCTE.rn = 1
GO
Answer 0 (score: 4)
Using ISNULL in your WHERE and JOIN clauses is likely the main cause here. Applying a function to a column in a query makes the query non-SARGable (meaning it cannot seek on any index on your tables, so it scans the whole thing). Note: using functions against variables in the WHERE clause is normally fine, for example:
WHERE SomeColumn = DATEADD(DAY, @n, @SomeDate)
Things like
WHERE SomeColumn = ISNULL(@Variable,0)
have the smell of a "catch-all query", so they can affect performance, depending on your setup. That is not the discussion at hand, though.
Clauses like
ISNULL(J.Closed, CAST(0 AS BIT)) = CAST(0 AS BIT)
are a real headache for the query optimizer, and your query is riddled with them. You need to replace them with clauses like:
WHERE (J.Closed = 0 OR J.Closed IS NULL)
Although it makes no difference, there is also no need to CAST the 0 there. SQL Server can see that you are comparing against a bit, so it interprets the 0 as one as well.
You also have an EXISTS with the WHERE clause
ISNULL(u.EmailAddress, '') = ISNULL(R.Email, '')
This will need to become:
WHERE (u.EmailAddress = R.Email
OR (u.EmailAddress IS NULL AND R.Email IS NULL))
You need to change all of the ISNULL usage in your WHERE clauses (both the CTE and the subqueries), and you should then see a decent performance boost.
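One detail when rewriting: the OR ... IS NULL branch is only needed when the ISNULL default value itself satisfies the comparison. When it does not, the plain comparison already rejects NULLs. As a minimal sketch, assuming a hypothetical temp table #Result that mimics two columns from the question (the sample rows are made up), the script below checks that the rewritten predicates return the same rows as the ISNULL versions:

```sql
-- Hypothetical demo table; only the column names are borrowed from the question.
CREATE TABLE #Result
(
    Id               INT PRIMARY KEY,
    Email            VARCHAR(100) NULL,
    IsEmailSubscribe VARCHAR(5)   NULL
);

INSERT INTO #Result (Id, Email, IsEmailSubscribe)
VALUES (1, 'a@example.com', 'True'),
       (2, '',              NULL),
       (3, NULL,            'False'),
       (4, 'b@example.com', NULL),
       (5, 'c@example.com', 'False');

-- Original, non-SARGable form: ISNULL wraps the columns.
SELECT Id
FROM #Result
WHERE ISNULL(Email, '') <> ''
  AND ISNULL(IsEmailSubscribe, 'True') <> 'False';

-- Rewritten form.
-- Email: the default '' fails "<> ''", so NULLs must be excluded;
--        "Email <> ''" alone does that, because NULL <> '' is UNKNOWN.
-- IsEmailSubscribe: the default 'True' passes "<> 'False'", so NULLs must be
--        included, hence the extra IS NULL branch.
SELECT Id
FROM #Result
WHERE Email <> ''
  AND (IsEmailSubscribe <> 'False' OR IsEmailSubscribe IS NULL);

DROP TABLE #Result;
```

Both SELECT statements return Ids 1 and 4.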
Answer 1 (score: 1)
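Generally, 7 million records is a joke for a modern database. If you have problems, they should be problems with billions of rows, not 7 million.
That indicates a problem with the query. High CPU is generally a sign of mismatched fields (comparing a string in one table to a number in another) or of too many function calls. A long run time is normally a sign of missing indexes or of non-SARGability, which you are definitely imposing.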
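To illustrate the mismatched-fields case, here is a minimal sketch. The temp table #Candidate, its ExternalRef column and the index name are hypothetical and do not come from the question. Because INT has a higher data type precedence than VARCHAR, comparing the VARCHAR column to an INT value makes SQL Server convert the column on every row, which typically prevents an index seek:

```sql
-- Hypothetical demo table: a numeric-looking reference stored as text.
CREATE TABLE #Candidate
(
    Id          INT IDENTITY(1,1) PRIMARY KEY,
    ExternalRef VARCHAR(20) NOT NULL
);

CREATE INDEX IX_Candidate_ExternalRef ON #Candidate (ExternalRef);

INSERT INTO #Candidate (ExternalRef)
VALUES ('1001'), ('1002'), ('1003');

-- Mismatched types: the column is implicitly converted to INT for the
-- comparison, so the index on ExternalRef cannot be used for a seek.
SELECT Id FROM #Candidate WHERE ExternalRef = 1002;

-- Matching types: the literal is a string, so the column is left untouched
-- and the predicate is SARGable.
SELECT Id FROM #Candidate WHERE ExternalRef = '1002';

DROP TABLE #Candidate;
```

Comparing the actual execution plans of the two SELECT statements should show a CONVERT_IMPLICIT on the column in the first one. The same applies to join conditions where the two sides have different types.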
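Non-SARGability means that an index cannot be used for a seek. This is one example, and your query has many like it:
ISNULL(J.Approved,CAST(0 AS BIT))= CAST(1 AS BIT)
ISNULL(field, value) means an index on the field is not usable: basically "goodbye index, hello table scan".
J.Approved = 1
has the same meaning (a NULL is replaced by 0, which never equals 1) but is SARGable. Where the comparison value is the same as the default, as in ISNULL(J.Closed, 0) = 0, the equivalent is (J.Closed = 0 OR J.Closed IS NULL). Nearly all of your conditions are written in a non-SARGable way. Welcome to DB hell. Start rewriting.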
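A rough way to see the difference for yourself is to compare the I/O of both forms on a throwaway table. The sketch below assumes a hypothetical temp table #Result with made-up data (about 100,000 rows, of which roughly 1 in 500 has no status); the index name is also an assumption:

```sql
-- Hypothetical demo table with an indexed, nullable status column.
CREATE TABLE #Result
(
    Id          INT IDENTITY(1,1) PRIMARY KEY,
    EmailStatus VARCHAR(20) NULL
);

-- Generate roughly 100,000 rows; most have a status, a few are NULL or ''.
INSERT INTO #Result (EmailStatus)
SELECT TOP (100000)
       CASE WHEN n % 1000 = 0 THEN NULL
            WHEN n % 1000 = 1 THEN ''
            ELSE 'SENT'
       END
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects a
    CROSS JOIN sys.all_objects b
) AS numbers;

CREATE INDEX IX_Result_EmailStatus ON #Result (EmailStatus);

SET STATISTICS IO ON;

-- Non-SARGable: the function has to be evaluated for every row.
SELECT Id FROM #Result WHERE ISNULL(EmailStatus, '') = '';

-- SARGable: both branches can be answered from the index.
SELECT Id FROM #Result WHERE EmailStatus = '' OR EmailStatus IS NULL;

SET STATISTICS IO OFF;

DROP TABLE #Result;
```

The Messages tab reports the logical reads of each statement, and the actual execution plan shows whether each one used a scan or a seek. You would normally expect fewer reads for the second form, although the optimizer's choice depends on statistics, data distribution and SQL Server version.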
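You may want to read more about SARGability at https://www.techopedia.com/definition/28838/sargeable
Also make sure you have indexes on all relevant foreign keys (and on the referenced primary keys). Otherwise, again, welcome table scans.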
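As a sketch of what such indexes could look like for the tables in the question: the index names below are hypothetical, the column types are unknown, and some of these indexes may already exist after the tuning advisor run, so check the existing indexes and the execution plan before creating any of them. The script is not self-contained, since it targets the question's own tables.

```sql
-- Supports: NOT EXISTS (... WHERE SMTP.JobId = R.JobId AND SMTP.CandidateId = R.CandidateId)
CREATE NONCLUSTERED INDEX IX_SMTP_Production_JobId_CandidateId
    ON SMTP_Production (JobId, CandidateId);

-- Supports: NOT EXISTS (... WHERE SMTP.EmailStatus = 'PICKUP' AND SMTP.CandidateId = R.CandidateId)
CREATE NONCLUSTERED INDEX IX_SMTP_Production_CandidateId_EmailStatus
    ON SMTP_Production (CandidateId, EmailStatus);

-- Supports the unsubscribe lookup by email address
-- (assumes EmailAddress is narrow enough to be an index key).
CREATE NONCLUSTERED INDEX IX_Unsubscribe_EmailAddress
    ON Unsubscribe (EmailAddress);

-- Foreign key columns used by the joins Result -> Job -> User.
CREATE NONCLUSTERED INDEX IX_Result_JobId
    ON Result (JobId);

CREATE NONCLUSTERED INDEX IX_Job_UserId
    ON Job (UserId);
```

Keep in mind that every extra index on SMTP_Production also has to be maintained by the INSERT at the end of the query, so only keep the ones the execution plan actually uses.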