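I need to update a nested field in one table using values from another table. Using this solution I came up with something that works, but not quite the way I want. Here is my solution: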
#standardSQL
UPDATE
  `attribution.daily_sessions_20180301_copy1` AS target
SET
  hits = ARRAY(
    SELECT AS STRUCT * REPLACE(ARRAY(
      SELECT AS STRUCT *
      FROM (
        SELECT AS STRUCT * REPLACE(map.category AS productCategoryAttribute)
        FROM UNNEST(product))) AS product)
    FROM UNNEST(hits)
  )
FROM
  `attribution.attribute_category_map` AS map
WHERE
  (
    SELECT REPLACE(LOWER(prod.productCategory), 'amp;', '')
    FROM UNNEST(target.hits) AS h,
      UNNEST(h.product) AS prod
    LIMIT 1) = map.raw_name
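attribute_category_map is a table with two columns: I look up the matching value in column 1 and replace the data in the target table with the value from column 2. The best result I have achieved is that all nested fields in a row are updated with the same value, which is correct only for the first nested field, instead of each nested field being updated with its own specific value.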
Simplified schema of the main table:
[
  {
    "name":"sessionId",
    "type":"STRING",
    "mode":"NULLABLE"
  },
  {
    "name":"hits",
    "type":"RECORD",
    "mode":"REPEATED",
    "fields":[
      {
        "name":"product",
        "type":"RECORD",
        "mode":"REPEATED",
        "fields":[
          {
            "name":"productCategory",
            "type":"STRING",
            "mode":"NULLABLE"
          },
          {
            "name":"productCategoryAttribute",
            "type":"STRING",
            "mode":"NULLABLE"
          }
        ]
      }
    ]
  }
]
A session row usually contains several hits, and a hit contains several products. The values look like this (if you unnest them):
-----------------------------------------------------------------------------
sessionId | hits.product.productCategory| hits.product.productCategoryAttribute
-----------------------------------------------------------------------------
1 | automotive chemicals | null
1 | automotive tools | null
1 | null | null
2 | null | null
2 | automotive chemicals | null
2 | null | null
3 | null | null
3 | bed accessories | null
4 | null | null
4 | null | null
4 | automotive chemicals | null
4 | null | null
-----------------------------------------------------------------------------
Schema of the map table:
[
  {
    "name":"raw_name",
    "type":"STRING",
    "mode":"NULLABLE"
  },
  {
    "name":"category",
    "type":"STRING",
    "mode":"NULLABLE"
  }
]
with values like these:
---------------------------------------------------
raw_name |category |
---------------------------------------------------
automotive chemicals |d1y2 - automotive chemicals|
automotive paint |dijf1 - automotive paint |
automotive tools |efw1 - automotive tools |
baby & infant toys |wwfw - baby & infant toys |
batteries & power |fdsv- batteries & power |
bed accessories |0k77 - bed accessories |
bike racks |12df - bike racks |
--------------------------------------------------
The result I want is:
-----------------------------------------------------------------------------
sessionId | hits.product.productCategory| hits.product.productCategoryAttribute
-----------------------------------------------------------------------------
1 | automotive chemicals | d1y2 - automotive chemicals
1 | automotive tools | efw1 - automotive tools
1 | null | null
2 | null | null
2 | automotive chemicals | d1y2 - automotive chemicals
2 | null | null
3 | null | null
3 | bed accessories | 0k77 - bed accessories
4 | null | null
4 | null | null
4 | automotive chemicals | d1y2 - automotive chemicals
4 | null | null
-----------------------------------------------------------------------------
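I need to take the productCategory value from the main table, look it up in the raw_name column of the map table, take the value from the category column, and put it into the productCategoryAttribute column of the main table. The main problem is that the target field is doubly nested, and I can't figure out how to join on it directly.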
Answer 0 (score: 3)
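The below is tested!
It leaves the schema and data of the whole table as they are and updates only the value of productCategoryAttribute based on the respective mapping.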
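#standardSQL
UPDATE `project.dataset.your_table` t
SET hits =
  ARRAY(
    -- rebuild each hit, replacing only its product array
    SELECT AS STRUCT * REPLACE(
      ARRAY(
        -- rebuild each product, replacing only productCategoryAttribute
        SELECT AS STRUCT product.* REPLACE(
          CASE WHEN map.raw_name = product.productCategory THEN category
            ELSE productCategoryAttribute END AS productCategoryAttribute)
        FROM UNNEST(product) product
        -- LEFT JOIN keeps products that have no entry in the map
        LEFT JOIN UNNEST(agg_map.map) map
        ON map.raw_name = product.productCategory
      ) AS product)
    FROM UNNEST(hits) hit
  )
-- the whole map table aggregated into a single array of (raw_name, category) structs
FROM (SELECT ARRAY_AGG(row) map FROM `project.dataset.map` row) agg_map
WHERE TRUE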
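Note: the above solution assumes that the map table is not that big, because it relies on aggregating the whole map table into a single array.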
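Before running the UPDATE against a real table, it may help to preview the rewritten hits with a plain SELECT. The following is only a sketch under assumptions: the CTEs `your_table` and `category_map` are hypothetical inline stand-ins for the two real tables, holding one session and three map rows taken from the sample values in the question. The nested ARRAY / REPLACE / LEFT JOIN part mirrors the UPDATE statement.

```sql
#standardSQL
WITH your_table AS (
  -- hypothetical sample: one session, one hit, three products
  SELECT '1' AS sessionId,
    [STRUCT([
      STRUCT('automotive chemicals' AS productCategory,
             CAST(NULL AS STRING) AS productCategoryAttribute),
      STRUCT('automotive tools', CAST(NULL AS STRING)),
      STRUCT(CAST(NULL AS STRING), CAST(NULL AS STRING))
    ] AS product)] AS hits
),
category_map AS (
  -- hypothetical subset of the map table
  SELECT 'automotive chemicals' AS raw_name,
         'd1y2 - automotive chemicals' AS category UNION ALL
  SELECT 'automotive tools', 'efw1 - automotive tools' UNION ALL
  SELECT 'bed accessories', '0k77 - bed accessories'
)
SELECT
  t.sessionId,
  ARRAY(
    -- rebuild each hit, replacing only its product array
    SELECT AS STRUCT hit.* REPLACE(
      ARRAY(
        -- rebuild each product, replacing only productCategoryAttribute
        SELECT AS STRUCT product.* REPLACE(
          CASE WHEN map.raw_name = product.productCategory THEN map.category
            ELSE product.productCategoryAttribute END AS productCategoryAttribute)
        FROM UNNEST(hit.product) product
        LEFT JOIN UNNEST(agg_map.map) map
        ON map.raw_name = product.productCategory
      ) AS product)
    FROM UNNEST(t.hits) hit
  ) AS hits
FROM your_table t
-- one-row subquery: the whole map as a single array of structs
CROSS JOIN (SELECT ARRAY_AGG(row) map FROM category_map row) agg_map
```

If the categories need the normalization used in the question, `REPLACE(LOWER(product.productCategory), 'amp;', '')` would presumably have to replace `product.productCategory` in both the `CASE` condition and the `ON` clause.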