BigQuery: how to get the sum of the values in a JSON structure?

Time: 2017-11-30 16:36:56

Tags: json google-bigquery standard-sql

I have the following query:

SELECT 
    JSON_EXTRACT(json, '$.Weights') as weight 
from 
(select '{"Weights":{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}}' as json)

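which returns:

{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}

I want to see if there is a way to sum up all the values of the colors, meaning it would return:

3.0

I have been trying to use the SPLIT and UNNEST functions without any success. Any suggestions? Thanks.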

2 answers:

Answer 0 (score: 1)

Here is an example using REGEXP_EXTRACT_ALL:

WITH T AS (
  SELECT '{"Weights":{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}}' AS json
)
SELECT
  (
    SELECT SUM(CAST(val AS FLOAT64))
    FROM UNNEST(
      REGEXP_EXTRACT_ALL(
        JSON_EXTRACT(json, '$.Weights'),
        r':([^,}]+)')
    ) AS val
  )
FROM T;
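To see what the pattern r':([^,}]+)' actually captures, the same extraction can be reproduced outside BigQuery. The following is a minimal Python sketch, not part of the original answer. It assumes that Python's re module handles this simple pattern the same way BigQuery's regex engine does, and the function name sum_weights_regex is hypothetical.

```python
import re

def sum_weights_regex(weights_json: str) -> float:
    """Sum every value that follows a ':' in a flat JSON object string."""
    # Capture the text after each ':' up to the next ',' or '}'.
    values = re.findall(r':([^,}]+)', weights_json)
    return sum(float(v) for v in values)

# The string that JSON_EXTRACT(json, '$.Weights') yields for the sample row.
weights = '{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}'

print(re.findall(r':([^,}]+)', weights))  # ['1.0', '0.0', '1.0', '1.0']
print(sum_weights_regex(weights))         # 3.0
```

Like the SQL version, this only works when the object is flat, every value is numeric, and no key or value contains a ':' of its own.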

Answer 1 (score: 1)

To explore another option:

The following is for BigQuery Standard SQL.

The first example extracts the key:value pairs for each row:

#standardSQL
WITH `project.dataset.yourTable` AS (
  SELECT 1 AS id, '{"Weights":{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}}' AS json
  UNION ALL SELECT 2, '{"Weights":{"blue":1.0,"red":2.0,"yellow":1.0,"orange":3.0}}'
)
SELECT id,
  -- key with the surrounding double quotes removed
  REPLACE(SPLIT(pair, ':')[OFFSET(0)], '"', '') AS color,
  -- value as a number; SAFE_CAST returns NULL instead of raising an error
  SAFE_CAST(SPLIT(pair, ':')[OFFSET(1)] AS FLOAT64) AS value
FROM `project.dataset.yourTable`,
-- strip the braces, then split on the default ',' delimiter into "key":value pairs
UNNEST(SPLIT(REGEXP_REPLACE(JSON_EXTRACT(json, '$.Weights'), r'{|}', ''))) AS pair

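The query above returns the following result:

id  color   value
1   blue    1.0
1   purple  0.0
1   yellow  1.0
1   green   1.0
2   blue    1.0
2   red     2.0
2   yellow  1.0
2   orange  3.0

From here it is easy to extend the query to the original question, "if there is a way to sum up all the values of the colors", and even to filter on specific colors. The example below excludes color = blue from the calculation: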

#standardSQL
WITH `project.dataset.yourTable` AS (
  SELECT 1 AS id, '{"Weights":{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}}' AS json
  UNION ALL SELECT 2, '{"Weights":{"blue":1.0,"red":2.0,"yellow":1.0,"orange":3.0}}'
)
SELECT id,
  -- sum the numeric part of each "key":value pair
  SUM(SAFE_CAST(SPLIT(pair, ':')[OFFSET(1)] AS FLOAT64)) AS total
FROM `project.dataset.yourTable`,
UNNEST(SPLIT(REGEXP_REPLACE(JSON_EXTRACT(json, '$.Weights'), r'{|}', ''))) AS pair
-- skip the pairs whose key is blue
WHERE REPLACE(SPLIT(pair, ':')[OFFSET(0)], '"', '') != 'blue'
GROUP BY id

For the two sample rows this returns:

id  total
1   2.0
2   6.0
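As a cross-check of the filtered totals, the same computation can be done with a real JSON parser outside BigQuery. This is a minimal Python sketch, not part of the original answer. It uses the two sample rows from the query above, and the names rows and total_excluding are hypothetical.

```python
import json

# The two sample rows from the query, keyed by id.
rows = {
    1: '{"Weights":{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}}',
    2: '{"Weights":{"blue":1.0,"red":2.0,"yellow":1.0,"orange":3.0}}',
}

def total_excluding(json_text: str, excluded: str) -> float:
    """Sum the values under "Weights", skipping one color."""
    weights = json.loads(json_text)["Weights"]
    return sum(value for color, value in weights.items() if color != excluded)

totals = {row_id: total_excluding(text, "blue") for row_id, text in rows.items()}
print(totals)  # {1: 2.0, 2: 6.0}
```

Parsing the JSON avoids the assumptions that the string-splitting approach makes about commas and colons inside keys or values.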