Counting related records: query takes 2 minutes to run

Time: 2014-12-12 19:55:49

Tags: sql sql-server tsql

ER Sample

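Given the diagram above, I am trying to select bulletins along with their related information.

  1. A bulletin can have only one associated user (the creator)
  2. A bulletin can have only one state (the creator's home state)
  3. A bulletin can have only one bulletin type (e.g. announcement, for sale, etc.)
  4. A bulletin can have 0 or 1 events tied to it
  5. A bulletin can have many likes
  6. A bulletin can have many comments
  7. As for states, a region can have many states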

The query below runs for 2 minutes before I hit the cancel button. I have not tried letting it run any longer.

    SELECT TOP 10 Bulletins.Id, LEFT(Bulletins.Body, 350) AS BodySnippet, Bulletins.CreationDateTime
    , Bulletins.UserId AS PosterId, Bulletins.StateId, Bulletins.EventId,
    Bulletins.BulletinTypeId, Bulletins.[Views], Users.UserName,
    Users.Zipcode as ZipCode, Users.StateId as StateId, Users.City,
    States.Name, States.UnitedStatesRegionId, RegionsOfTheUnitedStates.Name,
    COUNT(BulletinLikes.Id) AS Likes, COUNT(BulletinComments.Id) AS Comments
    FROM Bulletins
    INNER JOIN Users ON Bulletins.UserId = Users.Id
    INNER JOIN States ON Bulletins.StateId = States.Id
    INNER JOIN RegionsOfTheUnitedStates ON States.UnitedStatesRegionId = RegionsOfTheUnitedStates.Id
    INNER JOIN BulletinTypes ON Bulletins.BulletinTypeId = BulletinTypes.Id
    LEFT JOIN [Events] ON Bulletins.EventId = [Events].Id
    LEFT JOIN BulletinLikes ON Bulletins.Id = BulletinLikes.BulletinId
    LEFT JOIN BulletinComments ON Bulletins.Id = BulletinComments.BulletinId
    GROUP BY Bulletins.Id, Bulletins.Body, Bulletins.CreationDateTime
    , Bulletins.UserId, Bulletins.StateId, Bulletins.EventId,
    Bulletins.BulletinTypeId, Bulletins.[Views], Users.UserName,
    Users.Zipcode, Users.StateId, Users.City,
    States.Name, States.UnitedStatesRegionId, RegionsOfTheUnitedStates.Name
    

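Removing the lines that compute the Likes and Comments counts makes the query return immediately. My tables contain a lot of dummy data, and some of the bulletins have hundreds or thousands of likes or comments. That still does not seem like enough to make the query run for 2+ minutes. I am no expert in T-SQL, so I know it boils down to how I am counting or how I am grouping.

What is the proper way to return counts of related records in this situation?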

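**Edit 1**
My ER diagram is completely off in one part. I closed the website I used to create it and lost it. Here are some corrections:

• Bulletins is tied to BulletinTypes by the BulletinType FK in the Bulletins table (the reason being that we use BulletinTypes as a drop-down list)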

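**Edit 2**

I just discovered that you can do some profiling on SQL Azure, and it came up with these two pieces of information; however, I am not 100% sure what to take from them.

[image]
[image]

It looks like the first sort operation takes 54.2% of the resources. The first index seek also looks high at 32.2%.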

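4 answers:

Answer 0: (score: 2)

I would start by checking the performance of a simpler query that touches only the tables with the biggest impact (you mentioned that BulletinLikes and BulletinComments are the biggest performance offenders):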

SELECT TOP 10 b.id, COUNT(bl.Id) AS likes, COUNT(bc.Id) AS Comments 
FROM Bulletins b 
LEFT JOIN BulletinLikes bl ON b.Id = bl.BulletinId
LEFT JOIN BulletinComments bc ON b.Id = bc.BulletinId
GROUP BY b.id 

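If that performs decently, I would make it a subquery or a CTE, whichever syntax you prefer, and join the rest of the tables to the result of the subquery.

The general idea is to get rid of the huge GROUP BY...

Side note: TOP without ORDER BY is not guaranteed to give consistent results.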
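The subquery/CTE idea above could be sketched roughly as follows. This is only an illustration under assumptions: the CTE names `LikeCounts` and `CommentCounts`, the table aliases, the shortened column list, and the `ORDER BY b.CreationDateTime DESC` ordering are not from the question. Each child table is aggregated on its own before the join, so neither count is computed over the combined likes-and-comments rows.

```sql
-- Sketch only: aggregate each child table once, then join the small results.
WITH LikeCounts AS (
    SELECT BulletinId, COUNT(*) AS Likes
    FROM BulletinLikes
    GROUP BY BulletinId
),
CommentCounts AS (
    SELECT BulletinId, COUNT(*) AS Comments
    FROM BulletinComments
    GROUP BY BulletinId
)
SELECT TOP 10
       b.Id,
       LEFT(b.Body, 350) AS BodySnippet,
       b.CreationDateTime,
       u.UserName,
       s.Name AS StateName,
       r.Name AS RegionName,
       COALESCE(lc.Likes, 0) AS Likes,        -- bulletins with no likes show 0
       COALESCE(cc.Comments, 0) AS Comments   -- bulletins with no comments show 0
FROM Bulletins b
INNER JOIN Users u ON b.UserId = u.Id
INNER JOIN States s ON b.StateId = s.Id
INNER JOIN RegionsOfTheUnitedStates r ON s.UnitedStatesRegionId = r.Id
LEFT JOIN LikeCounts lc ON lc.BulletinId = b.Id
LEFT JOIN CommentCounts cc ON cc.BulletinId = b.Id
ORDER BY b.CreationDateTime DESC;  -- assumed ordering; the question does not state one
```

No GROUP BY is needed on the outer query, because each CTE returns at most one row per bulletin.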

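Answer 1: (score: 1)

There is nothing wrong with the form of your query (although you may want to consider whether you need to select so many columns, but that is beside the point).

You may want to look at the indexes on all the columns in the join conditions. Most of the time we join columns that have a foreign key relationship to a primary key, so there is probably a (default) clustered index on that column, but you need to check to be sure: each of those columns should be the first column in some index on its table (at least for tables with more than a handful of rows).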
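As a hedged illustration of the indexing advice: SQL Server does not create an index on a foreign key column automatically, so the `BulletinId` columns in the two child tables may have none. The sketch below assumes no such indexes exist yet, and the index names are hypothetical.

```sql
-- Hypothetical index names; check for existing indexes on these columns first.
CREATE NONCLUSTERED INDEX IX_BulletinLikes_BulletinId
    ON BulletinLikes (BulletinId);

CREATE NONCLUSTERED INDEX IX_BulletinComments_BulletinId
    ON BulletinComments (BulletinId);
```

With `BulletinId` as the leading key column, counting the likes or comments for one bulletin can be answered by an index seek rather than a scan of the whole table.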

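Answer 2: (score: 1)

Without the counts, those left joins do not even need to be performed, and the query optimizer probably figures that out.

You are not even using Events in the counts. Drop it.

Make sure all of those join columns (BulletinId) are indexed and that the indexes are not fragmented.

When these two queries run fast, your query will run fast: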

select count(distinct(BulletinId)) from BulletinLikes  
select count(distinct(BulletinId)) from BulletinComments

(And you may also need an index on regionId.)

SELECT TOP 10 Bulletins.Id, LEFT(Bulletins.Body, 350) AS BodySnippet
            , Bulletins.CreationDateTime
            , Bulletins.UserId AS PosterId, Bulletins.StateId, Bulletins.EventId
            , Bulletins.BulletinTypeId, Bulletins.[Views]
            , Users.UserName, Users.Zipcode as ZipCode, Users.StateId as StateId, Users.City
            , States.Name, States.UnitedStatesRegionId
            , RegionsOfTheUnitedStates.Name
            , COUNT(BulletinLikes.Id) AS Likes
            , COUNT(BulletinComments.Id) AS Comments
FROM Bulletins
INNER JOIN Users 
   ON Bulletins.UserId = Users.Id
INNER JOIN States 
   ON Bulletins.StateId = States.Id
INNER JOIN RegionsOfTheUnitedStates 
   ON States.UnitedStatesRegionId = RegionsOfTheUnitedStates.Id
INNER JOIN BulletinTypes 
   ON Bulletins.BulletinTypeId = BulletinTypes.Id
LEFT JOIN [Events] 
  ON Bulletins.EventId = [Events].Id
LEFT JOIN BulletinLikes 
  ON Bulletins.Id = BulletinLikes.BulletinId
LEFT JOIN BulletinComments 
  ON Bulletins.Id = BulletinComments.BulletinId
GROUP BY Bulletins.Id, Bulletins.Body, Bulletins.CreationDateTime 
       , Bulletins.UserId, Bulletins.StateId, Bulletins.EventId
       , Bulletins.BulletinTypeId, Bulletins.[Views]
       , Users.UserName, Users.Zipcode, Users.StateId, Users.City
       , States.Name, States.UnitedStatesRegionId
       , RegionsOfTheUnitedStates.Name
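The fragmentation check mentioned in this answer could be done along these lines. This is a sketch that assumes the two child tables live in the default schema of the current database; it uses the `sys.dm_db_index_physical_stats` dynamic management function in `'LIMITED'` mode, which is the cheapest scan mode.

```sql
-- Sketch: report fragmentation for every index on the two child tables.
SELECT OBJECT_NAME(ips.object_id)        AS TableName,
       i.name                            AS IndexName,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
INNER JOIN sys.indexes AS i
    ON i.object_id = ips.object_id
   AND i.index_id  = ips.index_id
WHERE ips.object_id IN (OBJECT_ID('BulletinLikes'), OBJECT_ID('BulletinComments'))
ORDER BY ips.avg_fragmentation_in_percent DESC;
```

Fragmentation percentages on indexes with very few pages are usually not meaningful, which is why `page_count` is included alongside them.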

Answer 3: (score: 0)

I would try pulling the COUNT fields out into subqueries and avoiding the GROUP BY clause altogether:

SELECT TOP 10 Bulletins.Id, LEFT(Bulletins.Body, 350) AS BodySnippet, Bulletins.CreationDateTime, Bulletins.UserId AS PosterId, Bulletins.StateId, Bulletins.EventId, Bulletins.BulletinTypeId, Bulletins.[Views], Users.UserName, Users.Zipcode as ZipCode, Users.StateId as StateId, Users.City, States.Name, States.UnitedStatesRegionId, RegionsOfTheUnitedStates.Name,
(SELECT COUNT(*) FROM BulletinLikes bl WHERE bl.BulletinId = Bulletins.Id) AS Likes,
(SELECT COUNT(*) FROM BulletinComments bc WHERE bc.BulletinId = Bulletins.Id) AS Comments
FROM Bulletins
INNER JOIN Users ON Bulletins.UserId = Users.Id
INNER JOIN States ON Bulletins.StateId = States.Id
INNER JOIN RegionsOfTheUnitedStates ON States.UnitedStatesRegionId = RegionsOfTheUnitedStates.Id
INNER JOIN BulletinTypes ON Bulletins.BulletinTypeId = BulletinTypes.Id
LEFT JOIN [Events] ON Bulletins.EventId = [Events].Id
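A small, self-contained illustration of why counting per subquery can behave differently from joining both child tables at once. The table variables and their data below are made up for the demonstration and are not the question's schema. With 3 likes and 2 comments on one bulletin, the double join produces 3 x 2 = 6 intermediate rows, so both counts come out as 6; the correlated subqueries return 3 and 2.

```sql
-- Made-up demo data: one bulletin with 3 likes and 2 comments.
DECLARE @Bulletins TABLE (Id INT PRIMARY KEY);
DECLARE @Likes     TABLE (Id INT PRIMARY KEY, BulletinId INT);
DECLARE @Comments  TABLE (Id INT PRIMARY KEY, BulletinId INT);

INSERT INTO @Bulletins VALUES (1);
INSERT INTO @Likes     VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO @Comments  VALUES (1, 1), (2, 1);

-- Joining both child tables pairs every like with every comment:
-- 6 intermediate rows, so Likes = 6 and Comments = 6.
SELECT b.Id, COUNT(l.Id) AS Likes, COUNT(c.Id) AS Comments
FROM @Bulletins b
LEFT JOIN @Likes    l ON l.BulletinId = b.Id
LEFT JOIN @Comments c ON c.BulletinId = b.Id
GROUP BY b.Id;

-- Counting each child table separately: Likes = 3 and Comments = 2.
SELECT b.Id,
       (SELECT COUNT(*) FROM @Likes    l WHERE l.BulletinId = b.Id) AS Likes,
       (SELECT COUNT(*) FROM @Comments c WHERE c.BulletinId = b.Id) AS Comments
FROM @Bulletins b;
```

On bulletins with thousands of likes and thousands of comments, the same multiplication yields millions of intermediate rows per bulletin, which may account for much of the sort cost seen in the question.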