Updating a nested field in a BigQuery table

Time: 2017-07-06 15:36:38

Tags: google-bigquery

I'm trying to do something in BigQuery that should be trivial: update a nested field in a BigQuery table that is the result of a 360 export.

Here is my query:

#standardSQL
UPDATE `dataset_name`.`ga_sessions_20170705`
SET hits.eventInfo.eventLabel = 'some string'
WHERE TRUE

But I get this error message:

Error: Cannot access field eventInfo on a value with type ARRAY<STRUCT<item STRUCT<transactionId INT64, currencyCode STRING>, isEntrance BOOL, minute INT64, ...>> at [3:10]

How can I update this nested field?

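3 answers:

Answer 0: (score: 3)

hits is an array, so you cannot reach into its fields with dot notation; you need to assign the whole array using an array subquery. It would look something like this: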

#standardSQL
UPDATE `dataset_name`.`ga_sessions_20170705`
SET hits =
  ARRAY(
    SELECT AS STRUCT * REPLACE(
      (SELECT AS STRUCT eventInfo.* REPLACE('some string' AS eventLabel)) AS eventInfo)
    FROM UNNEST(hits)
  )
WHERE TRUE;
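
Before running the UPDATE against a real table, it can help to preview the same array rebuild as a plain SELECT. The following is a minimal sketch under stated assumptions: the `sessions` CTE, the `hitNumber` and `eventAction` fields, and the sample values are hypothetical stand-ins for a real session row, not part of the export schema as given in the question.

```sql
#standardSQL
-- Hypothetical inline data: one session row with two hits.
WITH sessions AS (
  SELECT [
    STRUCT(1 AS hitNumber,
           STRUCT('click' AS eventAction, 'old label' AS eventLabel) AS eventInfo),
    STRUCT(2 AS hitNumber,
           STRUCT('view' AS eventAction, 'other label' AS eventLabel) AS eventInfo)
  ] AS hits
)
SELECT
  -- Rebuild the hits array element by element.
  ARRAY(
    SELECT AS STRUCT * REPLACE(
      -- Rebuild eventInfo, swapping in the new eventLabel and keeping the other fields.
      (SELECT AS STRUCT eventInfo.* REPLACE('some string' AS eventLabel)) AS eventInfo)
    FROM UNNEST(hits)
  ) AS hits
FROM sessions;
```

The expression inside ARRAY(...) is the same one used on the right-hand side of SET hits = in the UPDATE, so the preview should show every hit with eventLabel set to 'some string' and all other fields unchanged.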

Answer 1: (score: 1)

If you need to modify a given custom dimension, you can use the following approach:

#standardSQL
UPDATE `tablename`
SET hits = 
  ARRAY(
    SELECT AS STRUCT * REPLACE(
      ARRAY(
        SELECT AS STRUCT cd.index,
          -- index_number is a placeholder for the index of the dimension to change
          CASE WHEN cd.index = index_number THEN 'new value'
          ELSE cd.value
          END AS value
        FROM UNNEST(customDimensions) AS cd
      ) AS customDimensions)
    FROM UNNEST(hits) hit
  )
WHERE TRUE

It takes a while to run, though.
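
Since the UPDATE can be slow, one option is to check the nested rebuild on a small inline sample first. This is only a sketch: the `sessions` CTE, the `hitNumber` field, the sample values, and the choice of index 2 are all hypothetical.

```sql
#standardSQL
-- Hypothetical inline data: one hit carrying two custom dimensions.
WITH sessions AS (
  SELECT [
    STRUCT(
      1 AS hitNumber,
      [STRUCT(1 AS index, 'a' AS value),
       STRUCT(2 AS index, 'b' AS value)] AS customDimensions)
  ] AS hits
)
SELECT
  ARRAY(
    SELECT AS STRUCT * REPLACE(
      -- Rebuild the inner customDimensions array for each hit.
      ARRAY(
        SELECT AS STRUCT
          cd.index,
          -- 2 is a hypothetical index; only that dimension gets the new value.
          CASE WHEN cd.index = 2 THEN 'new value' ELSE cd.value END AS value
        FROM UNNEST(customDimensions) AS cd
      ) AS customDimensions)
    FROM UNNEST(hits)
  ) AS hits
FROM sessions;
```

The CASE expression is aliased AS value in this sketch so that the rebuilt struct keeps the same field names as the original customDimensions entries.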

Answer 2: (score: 0)

Here is how to mask PII data in GA sessions whose page paths contain an email address:

UPDATE
     `<project-id>.<dataset-name>.<table-name>`
 
SET hits =
  ARRAY(SELECT AS STRUCT * REPLACE (
    -- correcting pages here
    IF(REGEXP_CONTAINS(page.pagePath, r"@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+")
    ,STRUCT(
        REGEXP_REPLACE(page.pagePath, r"@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+", "[EMAIL]")
        ,page.pagePathLevel1
        ,page.pagePathLevel2
        ,page.pagePathLevel3
        ,page.pagePathLevel4
        ,page.hostname
        ,page.pageTitle
        ,page.searchKeyword
        ,page.searchCategory
    ), page) AS page)
    
    FROM UNNEST(hits)
  ) 
WHERE ( -- only relevant sessions
  SELECT COUNT(1) > 0 
  FROM UNNEST(hits) AS hits
  WHERE totals.visits = 1
    AND hits.type = 'PAGE'
    AND REGEXP_CONTAINS(hits.page.pagePath, r"@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+") = true
)
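
To see what the pattern actually matches before touching any table, the two regex functions can be tried on a few literal strings. This is a minimal sketch; the `paths` CTE and the sample paths are hypothetical.

```sql
#standardSQL
-- Hypothetical sample page paths, one with an email address and one without.
WITH paths AS (
  SELECT path
  FROM UNNEST([
    '/account/john@example.com/settings',
    '/products/42'
  ]) AS path
)
SELECT
  path,
  -- Same pattern as in the UPDATE above.
  REGEXP_CONTAINS(path, r"@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+") AS has_email,
  REGEXP_REPLACE(path, r"@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+", "[EMAIL]") AS masked_path
FROM paths;
```

Because the pattern starts at the @ sign, the first sample path should come back as '/account/john[EMAIL]/settings': the domain is masked, but the part before the @ is left in place. If the local part also counts as PII in your data, the pattern would need to be extended accordingly.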