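I am trying to join all of the comment tables (the monthly comment shards) to the posts table. Is there a way to perform the union before the inner join? For details on the UNION ALL operator, see the documentation here. My query against a single comment table is as follows: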
SELECT c.score, c.body, c.link_id, c.parent_id, p.created_utc, c.created_utc
FROM [fh-bigquery:reddit_comments.2016_01] AS c
INNER JOIN [fh-bigquery:reddit_posts.full_corpus_201512] AS p
ON c.parent_id = p.name
WHERE SUBSTR(c.parent_id, 1, 2) = 't3'
ORDER BY c.score DESC
LIMIT 10
Answer 0 (score: 1)
Replace
FROM [fh-bigquery:reddit_comments.2016_01] AS c
with
FROM (
SELECT score, body, link_id, parent_id, created_utc
FROM (TABLE_QUERY([fh-bigquery:reddit_comments],
'REGEXP_MATCH(table_id, r"\d{4}_\d{2}")'))
) AS c
Hopefully this gives you the idea. For details, see Table wildcard functions and Regular expression functions.
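To see which tables a filter like the one above would select, the same pattern can be tried locally. This is a minimal sketch in Python, assuming that REGEXP_MATCH accepts a match anywhere in table_id (as Python's re.search does); the helper name is_monthly and the sample table names are hypothetical and are not a listing of the real dataset.

```python
import re

# Same pattern as in the TABLE_QUERY filter above.
MONTHLY = re.compile(r"\d{4}_\d{2}")


def is_monthly(table_id):
    """Return True if table_id contains a YYYY_MM style fragment."""
    # search() accepts a match anywhere in the name (assumed to mirror REGEXP_MATCH).
    return MONTHLY.search(table_id) is not None


# Illustrative names only; the real dataset may contain different tables.
names = ["2015_11", "2016_01", "2007", "all"]
selected = [name for name in names if is_monthly(name)]
print(selected)
```

Because the pattern is not anchored, any table whose name merely contains four digits, an underscore, and two digits would also be selected. Anchoring it with ^ and $ would be stricter, if that matters for the dataset at hand.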
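Answer 1 (score: 1)
As Mikhail Berlyant pointed out in his answer, modifying the query did what I needed. In legacy SQL, a comma-separated list of tables in the FROM clause acts as a UNION ALL, so the monthly tables are combined before the join: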
SELECT c.score, c.body, c.link_id, c.parent_id, p.created_utc, c.created_utc, (c.created_utc - p.created_utc) AS time_diff
FROM (
SELECT *
FROM
[fh-bigquery:reddit_comments.2015_11],
[fh-bigquery:reddit_comments.2015_12],
[fh-bigquery:reddit_comments.2016_01]
) AS c
INNER JOIN [fh-bigquery:reddit_posts.full_corpus_201512] AS p
ON c.parent_id = p.name
WHERE SUBSTR(c.parent_id, 1, 2) = 't3'
ORDER BY c.score DESC
LIMIT 100
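Listing the monthly tables by hand gets tedious for longer ranges. Here is a minimal sketch in Python that generates the comma-separated table list for the FROM clause, assuming the tables follow the YYYY_MM naming seen above. The function names monthly_tables and union_from_clause are hypothetical, and the code only builds text; it does not check that the tables exist.

```python
def monthly_tables(start, end, dataset="fh-bigquery:reddit_comments"):
    """Return legacy SQL table references for each month from start to end, inclusive.

    start and end are (year, month) tuples, e.g. (2015, 11).
    """
    if start > end:
        raise ValueError("start must not be later than end")
    year, month = start
    tables = []
    while (year, month) <= end:
        tables.append("[{}.{:04d}_{:02d}]".format(dataset, year, month))
        month += 1
        if month > 12:
            # Roll over into January of the next year.
            year, month = year + 1, 1
    return tables


def union_from_clause(tables):
    """Join table references with commas, which legacy SQL treats as UNION ALL."""
    return "FROM\n" + ",\n".join(tables)


print(union_from_clause(monthly_tables((2015, 11), (2016, 1))))
```

The printed text can be pasted into the subquery in place of the hand-written table list.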