How can I shorten this StackExchange SQL query?

Time: 2018-04-13 07:49:01

Tags: sql join data.stackexchange.com

I am trying to track my performance on SO using StackExchange - Query Stack Overflow, and came up with this question:

First, I have a query like this that lists the questions I have answered:

Select Distinct a.Id, a.CreationDate, u.DisplayName, p.Title, p.Tags, 
   CONCAT('http://stackoverflow.com/questions/', p.Id, '#answer-', a.Id) as Link
From Posts a 
   Inner Join Posts p On a.ParentId = p.Id 
   Inner Join Users u On a.OwnerUserId = u.Id
   Inner Join PostTags pt on p.Id = Pt.PostId
   Inner Join Tags t on pt.TagId = t.Id
Where a.OwnerUserId in (
  9461114   --Me
  )
  And a.PostTypeId = 2  -- Answer
  And p.PostTypeId = 1  -- Question
  And t.TagName in ('jquery')

Then I wrote another query to get the raw score (Upvote, Downvote, Accepted) of all my answers:

SELECT *
FROM (
  SELECT v.PostID as Id, vt.Name, COUNT(*) AS CNT
     FROM 
     Posts p
       INNER JOIN Votes v
         ON v.PostId = p.Id
       INNER JOIN VoteTypes vt
         ON v.VoteTypeId = vt.Id
     WHERE
     p.OwnerUserId in
     (
      9461114   --Me
      ) 
  GROUP BY 
  vt.Name, v.PostID) s_tab pivot (min(CNT)for [Name] in ([UpMod], [DownMod],[AcceptedByOriginator]))AS PVT

Then, for convenience, I used a LEFT JOIN to combine the two queries, and this is what I came up with:

Select a.Id, a.CreationDate, a.DisplayName, a.Title, a.Tags, a.Link, b.UpMod as Upvote, b.DownMod as DownVote, b.AcceptedByOriginator as Accepted
From 
(Select Distinct a.Id, a.CreationDate, u.DisplayName, p.Title, p.Tags, 
   CONCAT('http://stackoverflow.com/questions/', p.Id, '#answer-', a.Id) as Link
From Posts a 
   Inner Join Posts p On a.ParentId = p.Id 
   Inner Join Users u On a.OwnerUserId = u.Id
   Inner Join PostTags pt on p.Id = Pt.PostId
   Inner Join Tags t on pt.TagId = t.Id
Where a.OwnerUserId in (
  9461114   --Me
  )
  And a.PostTypeId = 2  -- Answer
  And p.PostTypeId = 1  -- Question
  And t.TagName in ('jquery')
  ) a 
Left Join 
  (SELECT *
FROM (
  SELECT v.PostID as Id, vt.Name, COUNT(*) AS CNT
     FROM 
     Posts p
       INNER JOIN Votes v
         ON v.PostId = p.Id
       INNER JOIN VoteTypes vt
         ON v.VoteTypeId = vt.Id
     WHERE
     p.OwnerUserId in
     (
      9461114   --Me
      ) 
  GROUP BY 
  vt.Name, v.PostID) s_tab pivot (min(CNT)for [Name] in ([UpMod], [DownMod],[AcceptedByOriginator]))AS PVT
  ) b on a.Id = b.Id

This is too long, and I feel there must be another way to do this without the query looking like this.

I have tried and tested my query, and I am sure it works the way I want it to. I just want it to be shorter. Thanks in advance!

2 answers:

Answer 0: (score: 0)

The following query combines the two inner queries (without the pivot).

    SELECT
      a.Id,
      a.CreationDate,
      u.DisplayName,
      p.Title,
      p.Tags,
      CONCAT('http://stackoverflow.com/questions/', p.Id, '#answer-', a.Id) AS Link,
      vt.Name AS VoteType,
      COUNT(v.Id) AS VoteCount
    FROM 
      Users u 
      INNER JOIN Posts a ON u.Id = a.OwnerUserId AND a.PostTypeId = 2  -- Answer
      INNER JOIN Posts p ON a.ParentId = p.Id  AND p.PostTypeId = 1  -- Question
      INNER JOIN PostTags pt ON p.Id = pt.PostId
      INNER JOIN Tags t ON pt.TagId = t.Id AND t.TagName in ('jquery')
      LEFT JOIN Votes v ON a.Id = v.PostId 
      LEFT JOIN VoteTypes vt ON v.VoteTypeId = vt.Id
    WHERE u.Id = 9461114   --Me
    GROUP BY
      a.Id,
      a.CreationDate,
      u.DisplayName,
      p.Title,
      p.Tags,
      p.Id,
      vt.Name

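Everything is built around the Users table, and the corresponding filters have been moved from the WHERE clause directly into the JOINs. The votes are attached directly to the answers with a LEFT JOIN. The final pivoted query looks like this: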

SELECT *
FROM (
  SELECT
    a.Id,
    a.CreationDate,
    u.DisplayName,
    p.Title,
    p.Tags,
    CONCAT('http://stackoverflow.com/questions/', p.Id, '#answer-', a.Id) AS Link,
    vt.Name AS VoteType,
    COUNT(v.Id) AS VoteCount
  FROM 
    Users u 
    INNER JOIN Posts a ON u.Id = a.OwnerUserId AND a.PostTypeId = 2  -- Answer
    INNER JOIN Posts p ON a.ParentId = p.Id  AND p.PostTypeId = 1  -- Question
    INNER JOIN PostTags pt ON p.Id = pt.PostId
    INNER JOIN Tags t ON pt.TagId = t.Id AND t.TagName in ('jquery')
    LEFT JOIN Votes v ON a.Id = v.PostId 
    LEFT JOIN VoteTypes vt ON v.VoteTypeId = vt.Id
  WHERE u.Id = 9461114   --Me
  GROUP BY
    a.Id,
    a.CreationDate,
    u.DisplayName,
    p.Title,
    p.Tags,
    p.Id,
    vt.Name
) s_tab pivot (min(VoteCount) for [VoteType] in ([UpMod], [DownMod], [AcceptedByOriginator])) AS PVT
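
One thing to keep in mind with the pivoted result: a vote type that an answer never received comes back as NULL rather than 0, because PIVOT has no row to aggregate for it. A minimal sketch, assuming zeros are preferred, wraps the pivoted columns in COALESCE. The output aliases Upvote, DownVote and Accepted are borrowed from the question, the inner query is the same as above, and the whole thing has not been run against the live data.

```sql
SELECT
  PVT.Id, PVT.CreationDate, PVT.DisplayName, PVT.Title, PVT.Tags, PVT.Link,
  -- Replace the NULLs that PIVOT produces for missing vote types with 0
  COALESCE(PVT.[UpMod], 0) AS Upvote,
  COALESCE(PVT.[DownMod], 0) AS DownVote,
  COALESCE(PVT.[AcceptedByOriginator], 0) AS Accepted
FROM (
  SELECT
    a.Id,
    a.CreationDate,
    u.DisplayName,
    p.Title,
    p.Tags,
    CONCAT('http://stackoverflow.com/questions/', p.Id, '#answer-', a.Id) AS Link,
    vt.Name AS VoteType,
    COUNT(v.Id) AS VoteCount
  FROM
    Users u
    INNER JOIN Posts a ON u.Id = a.OwnerUserId AND a.PostTypeId = 2  -- Answer
    INNER JOIN Posts p ON a.ParentId = p.Id AND p.PostTypeId = 1     -- Question
    INNER JOIN PostTags pt ON p.Id = pt.PostId
    INNER JOIN Tags t ON pt.TagId = t.Id AND t.TagName IN ('jquery')
    LEFT JOIN Votes v ON a.Id = v.PostId
    LEFT JOIN VoteTypes vt ON v.VoteTypeId = vt.Id
  WHERE u.Id = 9461114   -- Me
  GROUP BY
    a.Id, a.CreationDate, u.DisplayName, p.Title, p.Tags, p.Id, vt.Name
) s_tab
PIVOT (MIN(VoteCount) FOR [VoteType] IN ([UpMod], [DownMod], [AcceptedByOriginator])) AS PVT
```

Listing the columns explicitly instead of using SELECT * also fixes the column order of the result.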

Answer 1: (score: 0)

You want to look at your answers to jquery questions.

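In your query you join the tags, so although each post already carries a redundant tags string, several rows may have to be considered per answer. You do this to limit the results to jquery questions, but by joining, and thereby possibly multiplying rows, you later need DISTINCT to remove the duplicates you produced. This is not good style. What you want is to look up the tag. We look things up in the WHERE clause, with EXISTS or IN.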

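As for getting and counting the votes, the most elegant way should be an OUTER APPLY on your answers, or even a CROSS APPLY, since we are talking about an aggregation.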

select
  a.id, a.creationdate,
  u.displayname,
  q.title, q.tags,
  v.up, v.down, v.accepted,
  concat('http://stackoverflow.com/questions/', q.id, '#answer-', a.id) as link
from posts a
join posts q on a.parentid = q.id
join users u on a.owneruserid = u.id
cross apply
(
  select
    sum(case when vt.name = 'UpMod' then 1 else 0 end) as up,
    sum(case when vt.name = 'DownMod' then 1 else 0 end) as down,
    sum(case when vt.name = 'AcceptedByOriginator' then 1 else 0 end) as accepted
  from votes v
  inner join votetypes vt on v.votetypeid = vt.id
  where v.postid = a.id
) v
where a.owneruserid = 9461114
and a.posttypeid = 2  -- answer
and q.posttypeid = 1  -- question
and exists
(
  select *
  from posttags pt
  join tags t on t.id = pt.tagid
  where pt.postid = q.id
  and t.tagname in ('jquery')
);
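
The reason CROSS APPLY is enough here: an aggregate query without GROUP BY always returns exactly one row, even when no input rows match, so an answer without votes is not dropped. The catch is that SUM over zero rows yields NULL, not 0. A small illustration, using an always-false predicate to stand in for an answer that has no votes (the column aliases are made up for this sketch):

```sql
-- "where 1 = 0" matches nothing, like an answer that has no votes.
select
  count(*)             as vote_rows,   -- 0
  sum(1)               as votes,       -- NULL
  coalesce(sum(1), 0)  as votes_or_0   -- 0
from votes v
where 1 = 0;
```

If zeros are preferred over NULLs in the main query, the three SUM expressions inside the CROSS APPLY could be wrapped in COALESCE in the same way.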

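This is not the shortest query we could write for the task, but it may be the most efficient way to do it. I also consider this query very readable, and hence maintainable, because you can easily see, for example, why we look at PostTags and Tags.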
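
The tag lookup can be written with IN just as well as with EXISTS. A sketch of that variant, trimmed to a few columns to keep it short and not run against the live data:

```sql
select
  a.id, a.creationdate,
  q.title, q.tags
from posts a
join posts q on a.parentid = q.id
where a.owneruserid = 9461114
and a.posttypeid = 2  -- answer
and q.posttypeid = 1  -- question
and q.id in
(
  -- all questions carrying the tag; no rows are multiplied in the outer query
  select pt.postid
  from posttags pt
  join tags t on t.id = pt.tagid
  where t.tagname in ('jquery')
);
```

Both forms keep one row per answer regardless of how many tag names are listed, so no DISTINCT is needed.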