In the reddit comments dataset - which posts rose to 1,000 comments the fastest?

Time: 2015-08-28 14:50:42

Tags: sql google-bigquery reddit

From https://www.reddit.com/r/datasets/comments/3icas8/reddit_july_comments_are_now_available/cuiahh9?context=3

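Which submissions have the fastest velocity? (Which submissions reached their 1,000th comment in N seconds, sorted by N ascending.) Basically, rank submissions by how quickly their comment volume accelerated. (I assume breaking news stories would be the quickest.)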

1 answer:

Answer 0 (score: 2)

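Let's rank each post's comments by the time they were posted, look at the 1st and the 1,000th, compute the time difference between the two, and order by that difference: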

SELECT link_id, FIRST(IF(rank=1000,created_utc,null)) - FIRST(IF(rank=1,created_utc,null)) thousand
FROM (
  SELECT link_id, created_utc, RANK() OVER(PARTITION BY link_id ORDER BY created_utc) rank
  FROM [fh-bigquery:reddit_comments.2015_07] 
)
WHERE rank=1000 OR rank=1
GROUP BY link_id
HAVING NOT thousand IS null
ORDER BY thousand

Fastest:

https://www.reddit.com/r/leagueoflegends/comments/3epmvx/spoiler_team_liquid_vs_team_impulse_na_lcs_2015/

Slowest:

https://www.reddit.com/r/Lollapalooza/comments/3054px/official_2015_rlollapalooza_ticket_resale_thread/

Third place:

https://www.reddit.com/r/announcements/comments/3djjxw/lets_talk_content_ama/
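
If you want to check the logic on a small sample outside BigQuery, here is a minimal Python sketch of the same idea: group comments by link_id, sort each group by created_utc, and subtract the first timestamp from the n-th. The function name time_to_nth_comment, the parameter n, and the toy data are hypothetical, made up for illustration only. One difference from the query is worth noting: the sketch picks the n-th comment by position (similar to ROW_NUMBER()), whereas RANK() leaves gaps after ties, so a post whose comments share a created_utc value right around position 1,000 may have no row with rank exactly 1000 and would then be dropped by the HAVING clause.

```python
from collections import defaultdict


def time_to_nth_comment(comments, n=1000):
    """Return [(link_id, seconds)] sorted by the time between the 1st and n-th comment.

    `comments` is an iterable of (link_id, created_utc) pairs, where
    created_utc is a Unix timestamp in seconds.
    """
    # Collect the comment timestamps of each post.
    by_post = defaultdict(list)
    for link_id, created_utc in comments:
        by_post[link_id].append(created_utc)

    results = []
    for link_id, times in by_post.items():
        if len(times) < n:
            continue  # this post never reached n comments
        times.sort()
        # Positional pick of the n-th comment, so ties cannot skip it.
        results.append((link_id, times[n - 1] - times[0]))

    # Smallest time difference first, like ORDER BY thousand.
    return sorted(results, key=lambda row: row[1])


# Hypothetical toy data: (link_id, created_utc)
sample = [
    ("t3_a", 100), ("t3_a", 105), ("t3_a", 130),
    ("t3_b", 200), ("t3_b", 201), ("t3_b", 204),
    ("t3_c", 300),
]

# Use n=3 instead of 1000 so the toy data is enough.
ranking = time_to_nth_comment(sample, n=3)
print(ranking)
```

With n=3, the post t3_c is skipped because it has only one comment, and the remaining posts are listed from fastest to slowest.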