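I'm trying to get the top 5 comments by score for each Reddit post. More generally, I want to retrieve only the top N comments, ranked by score, for each post title.
Example: in the data below, I only want comments 1 and 2 for each post.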
Post 1 | Comment 1 | Comment Score 10
Post 1 | Comment 2 | Comment Score 9
Post 1 | Comment 3 | Comment Score 8
Post 2 | Comment 1 | Comment Score 10
Post 2 | Comment 2 | Comment Score 9
Post 2 | Comment 3 | Comment Score 8
Standard SQL:
SELECT
  posts.title,
  posts.url,
  posts.score AS postsscore,
  DATE_TRUNC(DATE(TIMESTAMP_SECONDS(posts.created_utc)), MONTH),
  SUBSTR(comments.body, 0, 80),
  comments.score AS commentsscore,
  comments.id
FROM
  `fh-bigquery.reddit_posts.2015*` AS posts
JOIN `fh-bigquery.reddit_comments.2015*` AS comments
  ON posts.id = SUBSTR(comments.link_id, 4)
WHERE
  posts.subreddit = 'Showerthoughts'
  AND posts.score > 100
  AND comments.score > 100
ORDER BY
  posts.score DESC,
  posts.title DESC,
  comments.score DESC
Answer 0 (score: 3)
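The following is for BigQuery Standard SQL. It numbers each post's comments by descending score with ROW_NUMBER() and keeps only the rows where pos < 3, i.e. the top 2 comments per post; for the top N, use pos <= N instead.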
#standardSQL
SELECT * EXCEPT(pos) FROM (
  SELECT
    posts.title,
    posts.url,
    posts.score AS postsscore,
    DATE_TRUNC(DATE(TIMESTAMP_SECONDS(posts.created_utc)), MONTH),
    SUBSTR(comments.body, 0, 80),
    comments.score AS commentsscore,
    comments.id,
    ROW_NUMBER() OVER(PARTITION BY posts.url ORDER BY comments.score DESC) pos
  FROM `fh-bigquery.reddit_posts.2015*` AS posts
  JOIN `fh-bigquery.reddit_comments.2015*` AS comments
    ON posts.id = SUBSTR(comments.link_id, 4)
  WHERE posts.subreddit = 'Showerthoughts'
    AND posts.score > 100
    AND comments.score > 100
)
WHERE pos < 3
ORDER BY postsscore DESC, title DESC, commentsscore DESC
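
If you want to check the pattern without a BigQuery project, here is a minimal sketch that applies the same ROW_NUMBER() OVER (PARTITION BY ...) idea to the six sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for BigQuery, which is an assumption on my part (window functions need SQLite 3.25 or newer). The table name comments, its column names, and the helper top_n_comments are hypothetical, chosen only for illustration.

```python
import sqlite3

# Sample rows from the question: (post, comment, comment_score).
ROWS = [
    ("Post 1", "Comment 1", 10),
    ("Post 1", "Comment 2", 9),
    ("Post 1", "Comment 3", 8),
    ("Post 2", "Comment 1", 10),
    ("Post 2", "Comment 2", 9),
    ("Post 2", "Comment 3", 8),
]


def top_n_comments(rows, n):
    """Return the n highest-scoring comments for each post.

    Hypothetical helper: loads the rows into an in-memory SQLite table
    (a local stand-in for the BigQuery tables) and filters on ROW_NUMBER().
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE comments (post TEXT, comment TEXT, score INTEGER)"
        )
        conn.executemany("INSERT INTO comments VALUES (?, ?, ?)", rows)
        query = """
            SELECT post, comment, score
            FROM (
                SELECT post, comment, score,
                       -- Number the comments within each post, best score first.
                       ROW_NUMBER() OVER (
                           PARTITION BY post ORDER BY score DESC
                       ) AS pos
                FROM comments
            )
            WHERE pos <= ?
            ORDER BY post, score DESC
        """
        return conn.execute(query, (n,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in top_n_comments(ROWS, 2):
        print(row)
```

Note that ROW_NUMBER() breaks ties in an unspecified order when two comments on the same post have equal scores. If tied comments should all be kept, RANK() is the usual alternative, at the cost of sometimes returning more than N rows per post.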