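My data currently looks like this:
The output I want looks like this:
The desired result is:
I do this by combining the split function with row_number. It seems to work for a single customer, but when run across multiple customers the row_numbers come back unreliably "shuffled", i.e. in the wrong order.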
select
  CustomerID,
  OrderID,
  OrderDescriptionItem,
  row_number() over(partition by CustomerID, OrderID) as OrderDescriptionPosition
from
  (
    select
      CustomerID,
      OrderID,
      split(OrderDescription, ',') as OrderDescriptionItem
    from
      InitialTable
  ) as e
  ,unnest(OrderDescriptionItem) as OrderDescriptionItem
Does anyone have a more robust solution? Any suggestions using a UDF and JavaScript are welcome.
Answer 0: (score: 1)
You can use WITH OFFSET together with UNNEST to get the position of each element. Here is an example:
#standardSQL
WITH Input AS (
  SELECT 1 AS CustomerID, 1001 AS OrderID, '12,14,16,22,28' AS OrderDescription UNION ALL
  SELECT 2 AS CustomerID, 1002 AS OrderID, '1,5' AS OrderDescription UNION ALL
  SELECT 3 AS CustomerID, 1003 AS OrderID, '44,55,66' AS OrderDescription
)
SELECT
  CustomerID,
  OrderID,
  OrderDescription,
  off + 1 AS OrderDescriptionPosition
FROM Input
CROSS JOIN UNNEST(SPLIT(OrderDescription)) AS OrderDescription
WITH OFFSET off;
+------------+---------+------------------+--------------------------+
| CustomerID | OrderID | OrderDescription | OrderDescriptionPosition |
+------------+---------+------------------+--------------------------+
| 1 | 1001 | 12 | 1 |
| 1 | 1001 | 14 | 2 |
| 1 | 1001 | 16 | 3 |
| 1 | 1001 | 22 | 4 |
| 1 | 1001 | 28 | 5 |
| 2 | 1002 | 1 | 1 |
| 2 | 1002 | 5 | 2 |
| 3 | 1003 | 44 | 1 |
| 3 | 1003 | 55 | 2 |
| 3 | 1003 | 66 | 3 |
+------------+---------+------------------+--------------------------+
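Rows come back in no guaranteed order unless the query has an ORDER BY, so the neat ordering in the table above is not something to rely on. As a minimal sketch of the same idea, the variant below gives the unnested element and its offset their own aliases (item and item_offset are hypothetical names, not from the original query) so they do not shadow the OrderDescription column, and sorts the final output explicitly:

```sql
#standardSQL
WITH Input AS (
  SELECT 1 AS CustomerID, 1001 AS OrderID, '12,14,16,22,28' AS OrderDescription UNION ALL
  SELECT 2, 1002, '1,5'
)
SELECT
  CustomerID,
  OrderID,
  item AS OrderDescriptionItem,                 -- one element of the split list
  item_offset + 1 AS OrderDescriptionPosition   -- offsets are 0-based, so add 1
FROM Input
CROSS JOIN UNNEST(SPLIT(OrderDescription, ',')) AS item WITH OFFSET AS item_offset
-- Explicit ordering of the result set; without it the row order is unspecified.
ORDER BY CustomerID, OrderID, OrderDescriptionPosition;
```

The position comes from the array itself rather than from a window function, so it does not depend on how the rows are distributed or returned.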
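Answer 1: (score: 0)
If your example represents your real use case (in the sense that OrderDescription is an ordered list of values), you can use a version of your own query; just add ORDER BY inside OVER(), as below: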
#standardSQL
WITH InitialTable AS (
  SELECT 1 AS CustomerID, 1001 AS OrderID, '12,14,16,22,28' AS OrderDescription UNION ALL
  SELECT 2, 1002, '1,5' UNION ALL
  SELECT 3, 1003, '44,55,66'
)
SELECT
  CustomerID,
  OrderID,
  OrderDescription,
  ROW_NUMBER() OVER(PARTITION BY CustomerID, OrderID ORDER BY OrderDescription) AS OrderDescriptionPosition
FROM InitialTable, UNNEST(SPLIT(OrderDescription)) AS OrderDescription
-- ORDER BY CustomerID, OrderID, OrderDescriptionPosition
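One caveat with ordering by the value: SPLIT returns strings, so ORDER BY on the element compares text rather than numbers, and it ranks by value rather than by position in the list. The sketch below is an illustration under assumed data (the '9,10,2' input and the aliases item, item_offset, PositionInList, RankAsString and RankAsNumber are made up for this example) that puts the three numberings side by side:

```sql
#standardSQL
WITH InitialTable AS (
  -- Assumed sample row whose list is NOT in sorted order.
  SELECT 1 AS CustomerID, 1001 AS OrderID, '9,10,2' AS OrderDescription
)
SELECT
  CustomerID,
  OrderID,
  item,
  -- Position in the original list (offset is 0-based).
  item_offset + 1 AS PositionInList,
  -- Rank by string comparison: '10' < '2' < '9'.
  ROW_NUMBER() OVER (PARTITION BY CustomerID, OrderID ORDER BY item) AS RankAsString,
  -- Rank by numeric value; SAFE_CAST yields NULL for non-numeric items.
  ROW_NUMBER() OVER (PARTITION BY CustomerID, OrderID
                     ORDER BY SAFE_CAST(item AS INT64)) AS RankAsNumber
FROM InitialTable, UNNEST(SPLIT(OrderDescription, ',')) AS item WITH OFFSET AS item_offset
ORDER BY PositionInList;
```

For this input, '9' gets position 1 but string rank 3 and numeric rank 2, while '10' gets position 2, string rank 1 and numeric rank 3. If the goal is the element's place in the original list, the offset is the value to use; ordering by the value only matches it when the list is already sorted under the same comparison.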