Selecting from BigQuery based on multiple nested columns

Time: 2018-11-09 13:11:36

Tags: sql google-bigquery

I need to filter on more than one nested key in BigQuery, but my query can only filter on one.

Basically I need this:

SELECT item_id FROM table WHERE item_id IS NOT NULL AND page_id = '23784'

Is this possible?

I have data in BigQuery like the following; page_id does not have to be present:

| row | date | event      | params.key    | params.value |
-------------------------------------------------------
| 1   | 2018 | screenShow | item_id       | 1            |
                          | page_id       | 23784        |
                          | irrelevant_id | 5            |
| 2   | 2018 | screenShow | item_id       | 2            |
                          | irrelevant_id | 7            |

My current query obviously works for only one key, and I don't know how to add the page_id part. Thanks.

3 answers:

Answer 0: (score: 1)

  

"all item_ids where item_id is not null and page_id is 23784"

Below is for BigQuery Standard SQL

#standardSQL
SELECT 
  (SELECT value FROM UNNEST(params) param WHERE key = 'item_id') item_id
FROM `project.dataset.table`
WHERE (
  SELECT COUNT(1) 
  FROM UNNEST(params) param 
  WHERE param = ('page_id', 23784)
  OR key = 'item_id'
  ) = 2  

You can test and play with the above using dummy data, as below

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 2018 dt, 'screenShow' event, 
    [STRUCT<key STRING, value INT64>('item_id', 1), ('page_id', 23784), ('irrelevant_id', 5)] params UNION ALL
  SELECT 2018 dt, 'screenShow' event, 
    [STRUCT<key STRING, value INT64>('item_id', 2), ('irrelevant_id', 7)] params UNION ALL
  SELECT 2018 dt, 'screenShow' event, 
    [STRUCT<key STRING, value INT64>('item_id2', 1), ('page_id', 23784), ('irrelevant_id', 5)] params 
)
SELECT 
  (SELECT value FROM UNNEST(params) param WHERE key = 'item_id') item_id
FROM `project.dataset.table`
WHERE (
  SELECT COUNT(1) 
  FROM UNNEST(params) param 
  WHERE param = ('page_id', 23784)
  OR key = 'item_id'
  ) = 2

with the result

Row item_id  
1   1       

Obviously, if you need the whole row instead of just the list of item_id values, you can simply use SELECT *, as below

#standardSQL
SELECT *
FROM `project.dataset.table`
WHERE (
  SELECT COUNT(1) 
  FROM UNNEST(params) param 
  WHERE param = ('page_id', 23784)
  OR key = 'item_id'
  ) = 2  

In this case you will get

| row | date | event      | params.key    | params.value |
-------------------------------------------------------
| 1   | 2018 | screenShow | item_id       | 1            |
                          | page_id       | 23784        |
                          | irrelevant_id | 5            |

Answer 1: (score: 0)

Try the following:

SELECT
  (SELECT x.value FROM UNNEST(params) AS x WHERE x.key = 'item_id') AS item_id
FROM
  `your_dataset.your_table`
WHERE
  EXISTS (
  SELECT
    *
  FROM
    UNNEST(params) AS x
  JOIN
    UNNEST (params) AS y
  WHERE
    x.key = 'item_id'
    AND x.value IS NOT NULL
    AND y.key = 'page_id'
    AND y.value=23784)

Answer 2: (score: 0)

Well, you can do:

select t.*
from t
where exists (select 1 from unnest(params) p where p.key = 'item_id' and p.value is not null) and
      exists (select 1 from unnest(params) p where p.key = 'page_id' and p.value = 23784);
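
To try this without a real table, here is a sketch that runs the same two EXISTS conditions over inline dummy data modelled on the question's sample rows. The CTE name t and the column name dt are placeholders, and params is assumed to be an ARRAY<STRUCT<key STRING, value INT64>>:

```sql
#standardSQL
WITH t AS (
  -- Row 1: has both item_id and page_id = 23784
  SELECT 2018 AS dt, 'screenShow' AS event,
    [STRUCT<key STRING, value INT64>('item_id', 1), ('page_id', 23784), ('irrelevant_id', 5)] AS params
  UNION ALL
  -- Row 2: has item_id but no page_id
  SELECT 2018, 'screenShow',
    [STRUCT<key STRING, value INT64>('item_id', 2), ('irrelevant_id', 7)]
)
SELECT t.*
FROM t
WHERE EXISTS (SELECT 1 FROM UNNEST(params) p WHERE p.key = 'item_id' AND p.value IS NOT NULL)
  AND EXISTS (SELECT 1 FROM UNNEST(params) p WHERE p.key = 'page_id' AND p.value = 23784);
```

Only the first row should come back, since the second has no page_id entry. Because each condition gets its own EXISTS, this form does not depend on keys being unique within params.
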