How to find elements in an array in BigQuery

Time: 2017-03-24 01:34:33

Tags: sql google-bigquery

I am trying to find rows whose array contains certain key-value pairs. A row in my BigQuery table looks like this:

{
  "ip": "192.168.1.1",
  "cookie": [
    {
      "key": "apple",
      "value": "red"
    },
    {
      "key": "orange",
      "value": "orange"
    },
    {
      "key": "grape",
      "value": "purple"
    }
  ]
}

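I considered using an implicit UNNEST or a CROSS JOIN as shown below, but it does not work: unnesting just produces a separate row for each array element, so no single row can match both pairs.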

SELECT ip
FROM table t, t.cookie c
WHERE (c.key = "grape" AND c.value = "purple") AND (c.key = "orange" AND c.value = "orange")
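
To make the problem concrete, here is a minimal sketch of what the implicit UNNEST produces. The table name yourTable and the single dummy row are assumptions for illustration; the key/value field names follow the row shown above.

#standardSQL
-- Dummy row assumed for illustration, with the key/value fields from the question.
WITH yourTable AS (
  SELECT
    '192.168.1.1' AS ip,
    ARRAY<STRUCT<key STRING, value STRING>>[
      ('apple', 'red'), ('orange', 'orange'), ('grape', 'purple')
    ] AS cookie
)
SELECT t.ip, c.key, c.value
FROM yourTable t, t.cookie c  -- implicit UNNEST: one output row per array element

Each output row carries exactly one key, so a filter that requires c.key = "grape" and c.key = "orange" on the same row can never be true.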

This link is very close to what I want to do, except that it uses legacy SQL rather than standard SQL.

2 Answers:

Answer 0: (score: 5)

#standardSQL
SELECT ip
FROM yourTable 
WHERE (
  SELECT COUNT(1) 
  FROM UNNEST(cookie) AS pair 
  WHERE pair IN (('grape', 'purple'),  ('orange', 'orange'))
) >= 2

You can test it with the dummy data below:

#standardSQL
WITH yourTable AS (
  SELECT '192.168.1.1' AS ip, [('apple', 'red'), ('orange', 'orange'), ('grape', 'purple')] AS cookie UNION ALL
  SELECT '192.168.1.2', [('abc', 'xyz')]
)
SELECT ip
FROM yourTable 
WHERE (
  SELECT COUNT(1) 
  FROM UNNEST(cookie) AS pair 
  WHERE pair IN (('grape', 'purple'),  ('orange', 'orange'))
) >= 2

If you need to output the ip when at least one of the pairs is present in the array, change >= 2 to >= 1 in the WHERE clause.
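
As a sketch of that "at least one pair" variant, the query below uses a threshold of 1. The third dummy row (192.168.1.3) is an assumption added for illustration; it holds only one of the two pairs.

#standardSQL
-- The 192.168.1.3 row is assumed for illustration; it contains a single matching pair.
WITH yourTable AS (
  SELECT '192.168.1.1' AS ip, [('apple', 'red'), ('orange', 'orange'), ('grape', 'purple')] AS cookie UNION ALL
  SELECT '192.168.1.2', [('abc', 'xyz')] UNION ALL
  SELECT '192.168.1.3', [('grape', 'purple')]
)
SELECT ip
FROM yourTable
WHERE (
  SELECT COUNT(1)
  FROM UNNEST(cookie) AS pair
  WHERE pair IN (('grape', 'purple'), ('orange', 'orange'))
) >= 1  -- at least one of the listed pairs is present

With this data it should return 192.168.1.1 and 192.168.1.3, while 192.168.1.2 is filtered out because none of its pairs match.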

Answer 1: (score: 3)

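If the cookie array is guaranteed to contain no duplicate pairs, Mikhail's solution works well. If duplicates are possible, however, a repeated pair is counted twice and can satisfy >= 2 on its own, so here is another solution: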

#standardSQL
WITH yourTable AS (
  SELECT 
    '192.168.1.1' AS ip,
    [('apple', 'red'), ('orange', 'orange'), ('grape', 'purple')] AS cookie UNION ALL
  SELECT
    '192.168.1.2',
    [('abc', 'xyz'), ('orange', 'orange'), ('orange', 'orange')]
)
SELECT ip
FROM yourTable t
WHERE (
  ('grape', 'purple')  IN UNNEST(t.cookie) AND
  ('orange', 'orange') IN UNNEST(t.cookie) )

This returns only:

ip
-----------
192.168.1.1
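
If the array elements have named fields, as in the row shown in the question, the same check can also be written field by field. The following is only a sketch under that assumption: the typed ARRAY<STRUCT<key STRING, value STRING>> constructor and the dummy rows are illustrative, not taken from the real table.

#standardSQL
-- Dummy data assumed for illustration, using named key/value fields.
WITH yourTable AS (
  SELECT
    '192.168.1.1' AS ip,
    ARRAY<STRUCT<key STRING, value STRING>>[
      ('apple', 'red'), ('orange', 'orange'), ('grape', 'purple')
    ] AS cookie UNION ALL
  SELECT
    '192.168.1.2',
    ARRAY<STRUCT<key STRING, value STRING>>[
      ('abc', 'xyz'), ('orange', 'orange'), ('orange', 'orange')
    ]
)
SELECT ip
FROM yourTable t
WHERE
  -- one EXISTS per required pair, so duplicates in the array do not matter
  EXISTS (SELECT 1 FROM UNNEST(t.cookie) AS c WHERE c.key = 'grape'  AND c.value = 'purple')
  AND EXISTS (SELECT 1 FROM UNNEST(t.cookie) AS c WHERE c.key = 'orange' AND c.value = 'orange')

Like the IN UNNEST version, this should return only 192.168.1.1. Writing the conditions per field also makes it straightforward to match on key alone or to use other comparisons on value.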