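I have some data that is loaded into a BigQuery database at regular intervals. Each row represents a stock move that fulfils part of an order.
The related order and order_product information is embedded in the row as nested records.
Here are some example rows: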
|----------|----------|----------|------------------|------------------------|---------------------|---------------------------------|
| move_id | quantity | order.id | order_product.id | order_product.quantity | order_product.price | item_reference_number |
|----------|----------|----------|------------------|------------------------|---------------------|---------------------------------|
| 1 | 1 | 1 | 1 | 1 | 5 | ABC |
|----------|----------|----------|------------------|------------------------|---------------------|---------------------------------|
| 2 | 1 | 1 | 2 | 1 | 7 | DEF |
|----------|----------|----------|------------------|------------------------|---------------------|---------------------------------|
| 3 | 1 | 1 | 2 | 1 | 7 | XYZ |
|----------|----------|----------|------------------|------------------------|---------------------|---------------------------------|
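As you can see, the table shows three stock moves, all related to order 1.
Order 1 consists of order_product 1 and order_product 2. Order_product 1 is made up of one stock move: move_id 1 for item ABC.
Order_product 2 is made up of two stock moves: move_id 2 for item DEF and move_id 3 for item XYZ.
How can I write a query that transforms this data into a table with correctly nested/repeated fields? In other words, I would like the data to look like this: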
|-----------|------------------|------------------------|---------------------|------------------------|---------------------|----------------------------------|
| order_id | order_product.id | order_product.quantity | order_product.price | stock_move.id | stock_move.quantity | stock_move.item_reference_number |
|-----------|------------------|------------------------|---------------------|------------------------|---------------------|----------------------------------|
| 1 | 1 | 1 | 5 | 1 | 1 | ABC |
| |------------------|------------------------|---------------------|------------------------|---------------------|----------------------------------|
| | 2 | 1 | 7 | 2 | 1 | DEF |
| | | | |------------------------|---------------------|----------------------------------|
| | | | | 3 | 1 | XYZ |
|-----------|------------------|------------------------|---------------------|------------------------|---------------------|----------------------------------|
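I have been reading this post, which seems to suggest that ARRAY_AGG may be what I need, but I cannot work out how to use it correctly for my problem.
I think my difficulty is reducing the nested order_products to one row each while populating the correct nested/repeated stock moves for each order_product.
Is what I am asking for possible? I would greatly appreciate any help pointing me in the right direction.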
Answer 0 (score: 1)
The query below is for BigQuery Standard SQL:
#standardSQL
SELECT order_id,
ARRAY_AGG(product ORDER BY product.id) order_product,
ARRAY_CONCAT_AGG(stock_move ORDER BY product.id) stock_move -- ordering is stated inside the aggregates; an ORDER BY in the subquery is not guaranteed to be preserved
FROM (
SELECT order_id,
STRUCT(order_product.id, order_product.quantity, order_product.price) product,
ARRAY_AGG(STRUCT(move_id AS id, quantity AS quantity, item_reference_number AS item_reference_number) ORDER BY move_id) stock_move
FROM `project.dataset.table`
GROUP BY order_id, order_product.id, order_product.quantity, order_product.price
)
GROUP BY order_id
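Applied to your sample data, the query above returns a single row for order 1, with an order_product array of two entries and a stock_move array of three entries.
I am not sure this is exactly what you mean, because your example is still a bit ambiguous, but I hope it gives you an idea.
Also note: I assumed that order.id in your example is actually order_id, since otherwise it would not make much sense. I may be wrong about this (as I mentioned, your example is still "a bit" ambiguous).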
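If the order id really is nested as `order`.id rather than stored as a flat order_id column, the same idea can be tried on the sample rows directly. The following is only a sketch adapted from the query above, not a tested production query: the CTE name sample is made up for illustration, the three rows are copied from the question, and the ordering is stated inside each aggregate.

```sql
#standardSQL
WITH sample AS (  -- hypothetical inline copy of the three rows from the question
  SELECT 1 AS move_id, 1 AS quantity, STRUCT(1 AS id) AS `order`,
         STRUCT(1 AS id, 1 AS quantity, 5 AS price) AS order_product,
         'ABC' AS item_reference_number
  UNION ALL
  SELECT 2 AS move_id, 1 AS quantity, STRUCT(1 AS id) AS `order`,
         STRUCT(2 AS id, 1 AS quantity, 7 AS price) AS order_product,
         'DEF' AS item_reference_number
  UNION ALL
  SELECT 3 AS move_id, 1 AS quantity, STRUCT(1 AS id) AS `order`,
         STRUCT(2 AS id, 1 AS quantity, 7 AS price) AS order_product,
         'XYZ' AS item_reference_number
)
SELECT order_id,
  -- one struct per order_product, sorted by its id
  ARRAY_AGG(product ORDER BY product.id) AS order_product,
  -- concatenate the per-product stock_move arrays in the same product order
  ARRAY_CONCAT_AGG(stock_move ORDER BY product.id) AS stock_move
FROM (
  SELECT `order`.id AS order_id,  -- assumption: the id lives in the nested `order` record
    STRUCT(order_product.id, order_product.quantity, order_product.price) AS product,
    -- collect the stock moves that belong to this order_product
    ARRAY_AGG(STRUCT(move_id AS id, quantity, item_reference_number) ORDER BY move_id) AS stock_move
  FROM sample
  GROUP BY `order`.id, order_product.id, order_product.quantity, order_product.price
)
GROUP BY order_id
```

Keep in mind that order_product and stock_move end up as two sibling arrays in this shape, so the result no longer records which stock move belongs to which order_product. If that link matters, stock_move has to be nested inside each order_product struct instead.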
Answer 1 (score: 1)
Does the following SQL do what you expect?
I built stock_move per order_product, so each order_product carries its own stock moves.
WITH original_table AS (
SELECT 1 AS move_id, 1 AS quantity, STRUCT(1 AS id) AS `order`, STRUCT(1 AS id, 1 AS quantity, 5 AS price) AS order_product, "ABC" AS item_reference_number
UNION ALL
SELECT 2 AS move_id, 1 AS quantity, STRUCT(1 AS id) AS `order`, STRUCT(2 AS id, 1 AS quantity, 7 AS price) AS order_product, "DEF" AS item_reference_number
UNION ALL
SELECT 3 AS move_id, 1 AS quantity, STRUCT(1 AS id) AS `order`, STRUCT(2 AS id, 1 AS quantity, 7 AS price) AS order_product, "XYZ" AS item_reference_number
),
t1 AS (
SELECT DISTINCT
move_id,
quantity,
`order`.id AS order_id,
order_product.id AS order_product_id,
order_product.quantity AS order_product_quantity,
order_product.price AS order_product_price,
item_reference_number
FROM original_table
),
t2 AS (
SELECT
order_id,
order_product_id,
order_product_quantity,
order_product_price,
ARRAY_AGG(STRUCT(move_id, quantity, item_reference_number) ORDER BY move_id) AS stock_move
FROM t1
GROUP BY order_id, order_product_id, order_product_quantity, order_product_price
),
t3 AS (
SELECT
order_id,
ARRAY_AGG(STRUCT(order_product_id AS id, order_product_quantity AS quantity, order_product_price AS price, stock_move) ORDER BY order_product_id) AS order_product
FROM t2
GROUP BY order_id
)
SELECT * FROM t3
|-----------|------------------|------------------------|---------------------|----------------------------------|-----------------------------------|------------------------------------------------|
| order_id | order_product.id | order_product.quantity | order_product.price | order_product.stock_move.move_id | order_product.stock_move.quantity | order_product.stock_move.item_reference_number |
|-----------|------------------|------------------------|---------------------|----------------------------------|-----------------------------------|------------------------------------------------|
| 1 | 1 | 1 | 5 | 1 | 1 | ABC |
| |------------------|------------------------|---------------------|----------------------------------|-----------------------------------|------------------------------------------------|
| | 2 | 1 | 7 | 2 | 1 | DEF |
| | | | |----------------------------------|-----------------------------------|------------------------------------------------|
| | | | | 3 | 1 | XYZ |
|-----------|------------------|------------------------|---------------------|----------------------------------|-----------------------------------|------------------------------------------------|
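To read such a nested result back as flat rows, the arrays can be expanded again with UNNEST. The following is a minimal sketch under stated assumptions: the CTE name nested and its literal row are made up here to mirror the output table above, and in practice it would be replaced by the real nested table.

```sql
#standardSQL
WITH nested AS (  -- hypothetical literal copy of the nested result shown above
  SELECT 1 AS order_id,
    [
      STRUCT(1 AS id, 1 AS quantity, 5 AS price,
             [STRUCT(1 AS move_id, 1 AS quantity, 'ABC' AS item_reference_number)] AS stock_move),
      STRUCT(2 AS id, 1 AS quantity, 7 AS price,
             [STRUCT(2 AS move_id, 1 AS quantity, 'DEF' AS item_reference_number),
              STRUCT(3 AS move_id, 1 AS quantity, 'XYZ' AS item_reference_number)] AS stock_move)
    ] AS order_product
)
SELECT
  order_id,
  op.id AS order_product_id,
  op.price AS order_product_price,
  sm.move_id,
  sm.item_reference_number
FROM nested,
  UNNEST(order_product) AS op,   -- one row per order_product
  UNNEST(op.stock_move) AS sm    -- one row per stock move within that order_product
ORDER BY sm.move_id
```

This should give one row per stock move again, three rows for the sample data, which is a quick way to check that the nesting kept every move attached to the right order_product.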