Subquery vs. join when counting child rows from the same table and from another table

Time: 2016-02-16 12:16:50

Tags: sql sql-server performance

CREATE TABLE [dbo].[Comment] 
(
  [Id] [int] IDENTITY(1,1) NOT NULL,
  [UserId] [int] NOT NULL,
  [Comment] [nvarchar](1024) NOT NULL,
  [Created] [datetime] NOT NULL CONSTRAINT [DF_Comment_Created]  DEFAULT (getdate()),
  [ContentId] [int] NOT NULL,
  [ParentId] [int] NULL,
  CONSTRAINT [PK_Comment] PRIMARY KEY CLUSTERED 
  (
    [Id] ASC
  )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
  ) ON [PRIMARY]

CREATE TABLE [dbo].[CommentLike]
(
  [CommentId] [int] NOT NULL,
  [UserId] [int] NOT NULL,
  CONSTRAINT [PK_CommentLike] PRIMARY KEY CLUSTERED 
  (
    [CommentId] ASC,
    [UserId] ASC
  )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
  ) ON [PRIMARY]

CREATE TABLE [dbo].[User]
(
  [Id] [int] IDENTITY(1,1) NOT NULL,
  [Username] [nvarchar](64) NOT NULL,
  [FirstName] [nvarchar](50) NOT NULL,
  [LastName] [nvarchar](50) NOT NULL,
  [Email] [nvarchar](255) NOT NULL,
CONSTRAINT [PK_User] PRIMARY KEY CLUSTERED 
(
  [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY =  OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

There is a nonclustered index on Comment (ContentId, ParentId).
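The index definition itself is not shown. A minimal sketch of what it presumably looks like, assuming default options; the index name is hypothetical and not taken from the original schema:

```sql
-- Hypothetical name; the post only states that a nonclustered index
-- exists on Comment (ContentId, ParentId).
CREATE NONCLUSTERED INDEX [IX_Comment_ContentId_ParentId]
ON [dbo].[Comment] ([ContentId], [ParentId]);
```

An index like this supports the filter `ContentId = @contentId and ParentId is null`, but because ContentId is the leading column it cannot be used to seek on ParentId alone.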

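In short, the Comment table holds comments and sub-comments, linked through Id and ParentId. The CommentLike table holds the likes for comments and sub-comments (one per user).

The Comment table contains roughly 8,000 rows and the CommentLike table about 1,800.

I have written a query that lists the comments (top-level comments only) together with the sub-comment count, the like count, and a value indicating whether the supplied user has liked each comment. The whole query is filtered on ContentId in the where clause (it is just a unique integer value representing an id in another system).

I have one version that uses subqueries and another that uses joins (on subqueries).

Subquery version: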

select 
  c.Id,
  c.Comment,
  c.Created,
  c.ContentId,
  (select count(Id) from Comment where ParentId = c.Id) as SubComments,
  (select count(UserId) from CommentLike where CommentId = c.Id) as Likes,
  (select count(UserId) from CommentLike where CommentId = c.Id and UserId = @currentUserId) as CurrentUserIsLiking
from Comment c
where c.ContentId = @contentId and c.ParentId is null
group by
  c.Id, c.Comment, c.Created, c.ContentId

Join version:

select
  c.Id,
  c.Comment,
  c.Created,
  c.ContentId,
  isnull(c2.SubComments, 0) as SubComments,
  isnull(cl.Likes, 0) as Likes,
  isnull(cl.CurrentUserIsLiking, 0) as CurrentUserIsLiking
from Comment c
left join
(
  select
    ParentId,
    count(Id) as SubComments
  from Comment 
  group by ParentId
) as c2
on c.Id = c2.ParentId
left join
(
  select
    CommentId,
    count(UserId) as Likes,
    count(case when UserId = @currentUserId then 1 else null end) as CurrentUserIsLiking
  from CommentLike 
  group by CommentId
) as cl
on c.Id = cl.CommentId
where c.ContentId = @contentId and c.ParentId is null
group by
  c.Id, c.Comment, c.Created, c.ContentId, 
  c2.SubComments, cl.Likes, cl.CurrentUserIsLiking

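On average, both versions run in under 600 ms, but the subquery version consistently seems to run about 20% faster than the join version.

Question:

No matter how many rows the tables contain, the subquery version is always faster than the join version. I have always believed that, performance-wise, joins are better than subqueries. Is that not true in this case? Since performance matters here, I would like to know whether either version can be optimized so that it clearly outperforms the other.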

1 Answer:

Answer 0: (score: 0)

Why are you aggregating in the outer query of the join version?

select c.*, 
       coalesce(p.SubComments, 0) as SubComments,
       coalesce(cl.Likes, 0) as Likes,
       coalesce(cl.CurrentUserIsLiking, 0) as CurrentUserIsLiking
from Comment c left join
     (select ParentId, count(*) as SubComments
      from Comment
      group by ParentId
     ) p
     on p.ParentId = c.Id left join
     (select CommentId, count(UserId) as Likes,
             sum(case when UserId = @currentUserId then 1 else 0
                 end) as CurrentUserIsLiking
     from CommentLike 
     group by CommentId
    ) cl
    on c.Id = cl.CommentId
where c.ContentId = @contentId and c.ParentId is null;

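Because the outer query restricts the comments to a single ContentId, it is reasonable that the version using correlated subqueries is faster: the correlated subqueries only need to be evaluated for the top-level comments that pass the filter, whereas the derived tables in the join version may aggregate all of Comment and CommentLike before the join discards most of the groups.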
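As a rough sketch of how one might tune and measure the correlated-subquery version: the existing index leads with ContentId, so the lookup `ParentId = c.Id` cannot seek on it, and an index on ParentId alone may help. The index name and the sample parameter values below are assumptions, not part of the original post. CommentLike needs no extra index for this query, because its primary key already leads with CommentId.

```sql
-- Hypothetical index (name is an assumption); supports the
-- "where ParentId = c.Id" lookup used for the SubComments count.
CREATE NONCLUSTERED INDEX [IX_Comment_ParentId]
ON [dbo].[Comment] ([ParentId]);

-- Report logical reads and CPU/elapsed time per statement,
-- so that both versions can be compared on the same data.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

DECLARE @contentId int = 1;      -- assumed sample value
DECLARE @currentUserId int = 1;  -- assumed sample value

-- Correlated-subquery version without the outer GROUP BY:
-- each top-level comment already yields exactly one row.
select
  c.Id,
  c.Comment,
  c.Created,
  c.ContentId,
  (select count(*) from Comment sc where sc.ParentId = c.Id) as SubComments,
  (select count(*) from CommentLike cl where cl.CommentId = c.Id) as Likes,
  (select count(*) from CommentLike cl
   where cl.CommentId = c.Id and cl.UserId = @currentUserId) as CurrentUserIsLiking
from Comment c
where c.ContentId = @contentId and c.ParentId is null;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Running the join version between the same SET STATISTICS statements gives comparable numbers. Whether the extra index actually changes the plan depends on the data and on the optimizer's choices, so it is worth checking the execution plan before keeping it.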