BigQuery: IF over a repeated record

Time: 2015-11-15 04:19:35

Tags: google-bigquery google-cloud-platform

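This is a follow-up to the solution given in the question "BigQuery SQL IF over repeated record". I created a test table and tried the query given there, but it does not actually select the people who have lived in both New York and Chicago. The test data is as follows: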

{"fullname": "John Smith", "citiesLived": [{"place": "newyork"}, {"place": "chicago"}, {"place": "seattle"}]}
{"fullname": "Adam Smith", "citiesLived": [{"place": "newyork"}, {"place": "chicago"}, {"place": "phil"}]}
{"fullname": "Adam Jefferson", "citiesLived": [{"place": "boston"}, {"place": "chicago"}, {"place": "seattle"}]}

and the query is as follows:

SELECT
  *
FROM (
  SELECT
    fullname,
    IF (citiesLived.place == 'newyork', 1, 0) AS ny,
    IF (citiesLived.place == 'chicago', 1, 0) AS chi
  FROM (FLATTEN(tester.citiesLived, citiesLived))
  OMIT
    RECORD IF citiesLived.place = 'seattle')
WHERE
  ny == 1
  AND chi == 1

2 answers:

Answer 0 (score: 2)

You don't need FLATTEN here (in general, FLATTEN is rarely needed in BigQuery queries); OMIT IF alone is enough:

SELECT fullname FROM tester.citiesLived
OMIT RECORD IF NOT (
  SOME(citiesLived.place = "newyork") AND
  SOME(citiesLived.place = "chicago"))

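The condition in OMIT IF reads as follows: if SOME of the citiesLived values is newyork and SOME is chicago, the record satisfies your criteria. Records for which these are not both true should be omitted (hence the NOT predicate).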
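To make the difference concrete, here is a small Python sketch that models both queries over the test data from the question. It is only an approximation of the legacy BigQuery SQL semantics, not BigQuery itself, and the function names `flattened_matches` and `some_matches` are hypothetical. It suggests why the FLATTEN version returns nothing: after flattening, each row carries a single city, so `ny` and `chi` can never both be 1 on the same row.

```python
# Test rows copied from the question.
rows = [
    {"fullname": "John Smith",
     "citiesLived": [{"place": "newyork"}, {"place": "chicago"}, {"place": "seattle"}]},
    {"fullname": "Adam Smith",
     "citiesLived": [{"place": "newyork"}, {"place": "chicago"}, {"place": "phil"}]},
    {"fullname": "Adam Jefferson",
     "citiesLived": [{"place": "boston"}, {"place": "chicago"}, {"place": "seattle"}]},
]


def flattened_matches(rows):
    """Rough model of the FLATTEN query: one row per (person, city)."""
    matches = []
    for row in rows:
        for city in row["citiesLived"]:
            place = city["place"]
            if place == "seattle":  # mirrors OMIT RECORD IF ... = 'seattle'
                continue
            ny = 1 if place == "newyork" else 0
            chi = 1 if place == "chicago" else 0
            # A single city cannot equal both values, so this never holds.
            if ny == 1 and chi == 1:
                matches.append(row["fullname"])
    return matches


def some_matches(rows):
    """Rough model of OMIT RECORD IF NOT (SOME(...) AND SOME(...))."""
    kept = []
    for row in rows:
        places = [city["place"] for city in row["citiesLived"]]
        has_ny = any(p == "newyork" for p in places)   # SOME(place = "newyork")
        has_chi = any(p == "chicago" for p in places)  # SOME(place = "chicago")
        if not (has_ny and has_chi):
            continue  # record omitted
        kept.append(row["fullname"])
    return kept


print(flattened_matches(rows))
print(some_matches(rows))
```

Under this model the flattened version yields no names, while the SOME-based version keeps John Smith and Adam Smith.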

Answer 1 (score: 0)

I believe this would be a more complete rewrite of the originally intended query:

SELECT
  *
FROM (
  SELECT
    fullname,
    SOME(citiesLived.place == 'newyork') WITHIN RECORD AS ny,
    SOME(citiesLived.place == 'chicago') WITHIN RECORD AS chi
  FROM tester.citiesLived
  OMIT
    RECORD IF SOME(citiesLived.place = 'seattle'))
WHERE
  ny == true
  AND chi == true
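As a rough check of what this rewrite selects, the following Python sketch models it over the test data from the question. It is an approximation of the legacy SQL semantics as I read them, and `rewrite_matches` is a hypothetical name. Note that here the seattle condition drops the whole record for anyone who has lived in Seattle, which may or may not match what the original query was meant to do.

```python
# Test rows copied from the question.
rows = [
    {"fullname": "John Smith",
     "citiesLived": [{"place": "newyork"}, {"place": "chicago"}, {"place": "seattle"}]},
    {"fullname": "Adam Smith",
     "citiesLived": [{"place": "newyork"}, {"place": "chicago"}, {"place": "phil"}]},
    {"fullname": "Adam Jefferson",
     "citiesLived": [{"place": "boston"}, {"place": "chicago"}, {"place": "seattle"}]},
]


def rewrite_matches(rows):
    """Rough model of the WITHIN RECORD rewrite (an assumption, not BigQuery)."""
    result = []
    for row in rows:
        places = [city["place"] for city in row["citiesLived"]]
        # Inner query: OMIT RECORD IF SOME(place = 'seattle')
        if any(p == "seattle" for p in places):
            continue
        # Inner query: SOME(...) WITHIN RECORD gives one boolean per record.
        ny = any(p == "newyork" for p in places)
        chi = any(p == "chicago" for p in places)
        # Outer query: WHERE ny == true AND chi == true
        if ny and chi:
            result.append({"fullname": row["fullname"], "ny": ny, "chi": chi})
    return result


print(rewrite_matches(rows))
```

With the three test rows, only Adam Smith remains under this model, because the other two records each contain seattle.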