Need help improving my SQL query to pull a recent document count

Asked: 2017-11-10 21:08:06

Tags: sql query-optimization azure-sql-database

This is only part of the query, but it appears to be the bottleneck:

SELECT CAST (CASE WHEN EXISTS
             (SELECT 1
              FROM dbo.CBDocument
              WHERE (FirmId = R.FirmId) AND
                    (ContributionDate > DATEADD(m, -3, GETDATE())) AND
                    ((EntityTypeId = 2600 AND EntityId = P.IProductId) OR
                    (EntityTypeId = 2500 AND EntityId = M.IManagerId)))
            THEN 1 ELSE 0 END AS BIT) AS HasRecentDocuments

FROM  dbo.CBIProduct P
  JOIN dbo.CBIManager M ON P.IManagerId = M.IManagerId
  JOIN dbo.CBIProductRating R ON P.IProductId = R.IProductId
  JOIN dbo.CBIProductFirmDetail D ON (D.IProductId = P.IProductId) AND
                                         (R.FirmId = D.FirmId)

CROSS APPLY (SELECT TOP 1 RatingDate, IProductRatingId, FirmId
             FROM  dbo.CBIProductRating
     WHERE (IProductId = P.IProductId) AND (FirmId = R.FirmId)
     ORDER BY RatingDate DESC) AS RD

WHERE (R.IProductRatingId = RD.IProductRatingId) AND (R.FirmId = RD.FirmId)

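I will normally be returning a lot of other columns that need the CROSS APPLY and the other joins. The part I need to optimize is the subquery in the CASE expression. This subquery takes 3 minutes to return 119k records. I know enough SQL to get by, but there must be a way to make this more efficient.

The gist of the query is simply to return a flag if the associated product has had any documents added to the system in the past 3 months.

Edit: My database is hosted in Azure, and the Database Engine Tuning Advisor cannot connect to it. Azure has its own tuning advisor component, but it has no recommendations. There must be a better way to write this query.

Edit: To simplify further and pin down the culprit, I reduced it to this query (instead of determining whether a recent document exists, it just counts the recent documents):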

SELECT D.FirmId, P.IProductId,
       (SELECT COUNT(DocumentId) FROM dbo.CBDocument WHERE
        (FirmId = D.FirmId) AND
        (ContributionDate > DATEADD(m, -3, GETDATE())) AND
        ((EntityTypeId = 2600 AND EntityId = P.IProductId) OR
        (EntityTypeId = 2500 AND EntityId = M.IManagerId))) AS RecentDocCount

FROM dbo.CBIProduct P
FULL JOIN dbo.CBIProductFirmDetail D ON D.IProductId = P.IProductId
JOIN dbo.CBIManager M ON M.IManagerId = P.IManagerId

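It completes in 3 minutes 53 seconds.

If I declare a variable to hold the date (DECLARE @Today DATE = GETDATE()) and use that variable in place of GETDATE() in the query (DATEADD(m, -3, @Today)), it runs in 12 seconds.

Is there a known performance problem with GETDATE()? As far as I know, I cannot use a variable in a view definition.

Does this shed light on anything that might point to a solution? I suppose I could turn the whole thing into a stored procedure, but then I would also have to change the application code.

Thanks.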
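For reference, the variable-based variant described above might look like the following sketch. The @Cutoff variable is an illustrative name that is not part of the original query, and because @Today is a DATE, the cutoff falls at midnight rather than at the current time of day three months ago.

```sql
-- Sketch of the variable-based variant; @Cutoff is an illustrative name.
-- DATE drops the time of day, so the cutoff is midnight three months ago.
DECLARE @Today  DATE = GETDATE();
DECLARE @Cutoff DATE = DATEADD(m, -3, @Today);

SELECT D.FirmId, P.IProductId,
       (SELECT COUNT(DocumentId)
        FROM dbo.CBDocument
        WHERE FirmId = D.FirmId
          AND ContributionDate > @Cutoff
          AND ((EntityTypeId = 2600 AND EntityId = P.IProductId) OR
               (EntityTypeId = 2500 AND EntityId = M.IManagerId))) AS RecentDocCount
FROM dbo.CBIProduct P
FULL JOIN dbo.CBIProductFirmDetail D ON D.IProductId = P.IProductId
JOIN dbo.CBIManager M ON M.IManagerId = P.IManagerId;
```

Comparing the execution plans of the two versions should show where the GETDATE() form spends its time.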

3 Answers:

Answer 0: (score: 1)

Depending on the data, using a LEFT JOIN might be faster:

SELECT CAST(CASE when x.FirmId is not null THEN 1 ELSE 0 END AS BIT) AS HasRecentDocuments

FROM  dbo.CBIProduct P
  JOIN dbo.CBIManager M ON P.IManagerId = M.IManagerId
  JOIN dbo.CBIProductRating R ON P.IProductId = R.IProductId
  JOIN dbo.CBIProductFirmDetail D ON (D.IProductId = P.IProductId) AND (R.FirmId = D.FirmId)


LEFT JOIN dbo.CBDocument x ON x.FirmId = R.FirmId 
                          AND x.ContributionDate > DATEADD(m, -3, GETDATE())
                          AND (   (x.EntityTypeId = 2600 AND x.EntityId = P.IProductId) 
                               OR (x.EntityTypeId = 2500 AND x.EntityId = M.IManagerId))

CROSS APPLY (SELECT TOP 1 RatingDate, IProductRatingId, FirmId
             FROM  dbo.CBIProductRating
     WHERE (IProductId = P.IProductId) AND (FirmId = R.FirmId)
     ORDER BY RatingDate DESC) AS RD

WHERE (R.IProductRatingId = RD.IProductRatingId) AND (R.FirmId = RD.FirmId)
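
It certainly looks simpler. Note that, unlike EXISTS, the LEFT JOIN returns one row per matching document, so a product with several recent documents will appear more than once unless the result is de-duplicated.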

Answer 1: (score: 1)

This is the query that you claim needs to be optimized:

SELECT CAST(CASE WHEN EXISTS (SELECT 1
                              FROM dbo.CBDocument d
                              WHERE (d.FirmId = R.FirmId) AND
                                    (d.ContributionDate > DATEADD(m, -3, GETDATE())) AND
                                    ((d.EntityTypeId = 2600 AND d.EntityId = P.IProductId) OR
                                     (d.EntityTypeId = 2500 AND d.EntityId = M.IManagerId)
                                    )
                            )
    . . . 

I'll take your word for it. I think phrasing the query like this gives you more avenues for optimization:

SELECT CAST(CASE WHEN EXISTS (SELECT 1
                              FROM dbo.CBDocument d
                              WHERE d.FirmId = R.FirmId AND
                                    d.ContributionDate > DATEADD(m, -3, GETDATE()) AND
                                    d.EntityTypeId = 2600 AND d.EntityId = P.IProductId 
                            ) OR
                      EXISTS (SELECT 1
                              FROM dbo.CBDocument d
                              WHERE d.FirmId = R.FirmId AND
                                    d.ContributionDate > DATEADD(m, -3, GETDATE()) AND
                                    d.EntityTypeId = 2500 AND d.EntityId = M.IManagerId
                            ) 
    . . . 

Then you want an index on CBDocument(FirmId, EntityTypeId, EntityId, ContributionDate).
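
As a sketch, that index could be created as shown below. The index name is made up for illustration, and whether it should be clustered or include further columns depends on the rest of the workload.

```sql
-- Hypothetical index name; the key column order follows the answer above:
-- three equality columns first, then the range column ContributionDate.
CREATE NONCLUSTERED INDEX IX_CBDocument_Firm_EntityType_Entity_Date
    ON dbo.CBDocument (FirmId, EntityTypeId, EntityId, ContributionDate);
```

With the equality columns leading, each EXISTS branch can seek directly to one (FirmId, EntityTypeId, EntityId) group and then range-scan on ContributionDate.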

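Answer 2: (score: 1)

Operations such as correlated subqueries and full outer joins are quite expensive, and I'd suggest looking for alternatives. While I'm not familiar with your data model or your data, I suggest changing the "from table" to CBIProductFirmDetail, and I further assume an inner join to the product table followed by an inner join to the manager table. If that join sequence is correct, it removes the overhead of some outer joins.

Instead of a correlated subquery to determine the count, I suggest treating it as a left-joined subquery.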

SELECT
      d.FirmId
    , p.IProductId
    , COALESCE(Docs.RecentDocCount,0) RecentDocCount
FROM dbo.CBIProductFirmDetail d
JOIN dbo.CBIProduct p ON d.IProductId = p.IProductId
JOIN dbo.CBIManager m ON p.IManagerId = m.IManagerId
LEFT JOIN (
      SELECT
            FirmId
          , EntityId
          , EntityTypeId
          , COUNT(DocumentId) recentdoccount
      FROM dbo.CBDocument
      WHERE ContributionDate > DATEADD(m, -3, GETDATE())
      AND EntityTypeId IN (2500,2600)
      GROUP BY
            FirmId
          , EntityId
          , EntityTypeId
) AS docs ON d.FirmId = docs.FirmId
         AND (
              (docs.EntityTypeId = 2600 AND docs.EntityId = p.IProductId)
           OR (docs.EntityTypeId = 2500 AND docs.EntityId = m.IManagerId)
             )
;

It may pay to split that subquery in two to avoid the awkward OR in the join, so:

SELECT
      d.FirmId
    , p.IProductId
    , COALESCE(d2500.DocCount,0) + COALESCE(d2600.DocCount,0) RecentDocCount
FROM dbo.CBIProductFirmDetail d
JOIN dbo.CBIProduct p ON d.IProductId = p.IProductId
JOIN dbo.CBIManager m ON p.IManagerId = m.IManagerId
LEFT JOIN (
      SELECT
            FirmId
          , EntityId
          , COUNT(DocumentId) doccount
      FROM dbo.CBDocument
      WHERE ContributionDate > DATEADD(m, -3, GETDATE())
      AND EntityTypeId = 2500
      GROUP BY
            FirmId
          , EntityId
) AS d2500 ON d.FirmId = d2500.FirmId
         AND m.IManagerId = d2500.EntityId
LEFT JOIN (
      SELECT
            FirmId
          , EntityId
          , COUNT(DocumentId) doccount
      FROM dbo.CBDocument
      WHERE ContributionDate > DATEADD(m, -3, GETDATE())
      AND EntityTypeId = 2600
      GROUP BY
            FirmId
          , EntityId
) AS d2600 ON d.FirmId = d2600.FirmId
           AND p.IProductId = d2600.EntityId
;