早上好,
我试图在大查询中转置一些数据。我已经看过其他一些在stackoverflow上问过这个的人,但是这样做的方法似乎是使用遗留sql(使用group_concat_unquoted)而不是标准的sql。我会使用遗产,但我过去曾遇到过嵌套数据的问题,所以自那以后只使用标准。
以下是我的例子,为了给出一些背景信息,我试图绘制下面的一些客户旅程:
uniqueid | page_flag | order_of_pages
A | Collection| 1
A | Product | 2
A | Product | 3
A | Login | 4
A | Delivery | 5
B | Clearance | 1
B | Search | 2
B | Product | 3
C | Search | 1
C | Collection| 2
C | Product | 3
但是,我希望转置数据,使其如下所示:
uniqueid | 1 | 2 | 3 | 4 | 5
A | Collection | Product | Product | Login | Delivery
B | Clearance | Search | Product | NULL | NULL
C | Search | Collection | Product | NULL | NULL
我尝试过使用多个左连接,但收到以下错误:
select a.uniqueid,
b.page_flag as page1,
c.page_flag as page2,
d.page_flag as page3,
e.page_flag as page4,
f.page_flag as page5
from
(select distinct uniqueid,
(case when uniqueid is not null then 1 end) as page_hit1,
(case when uniqueid is not null then 2 end) as page_hit2,
(case when uniqueid is not null then 3 end) as page_hit3,
(case when uniqueid is not null then 4 end) as page_hit4,
(case when uniqueid is not null then 5 end) as page_hit5
from `mytable`) a
LEFT JOIN (
SELECT *
from `mytable`) b on a.uniqueid = b.uniqueid
and a.page_hit1 = b.order_of_pages
LEFT JOIN (
SELECT *
from `mytable`) c on a.uniqueid = c.uniqueid
and a.page_hit2 = c.order_of_pages
LEFT JOIN (
SELECT *
from `mytable`) d on a.uniqueid = d.uniqueid
and a.page_hit3 = d.order_of_pages
LEFT JOIN (
SELECT *
from `mytable`) e on a.uniqueid = e.uniqueid
and a.page_hit4 = e.order_of_pages
LEFT JOIN (
SELECT *
from `mytable`) f on a.uniqueid = f.uniqueid
and a.page_hit5 = f.order_of_pages
Error: Query exceeded resource limits for tier 1. Tier 13 or higher required.
我已经看过使用过数组功能,但我之前从未使用过这个功能,而且我不确定这是否只是为了转换相反的方式。任何建议都会很棒。
谢谢
答案 0 :(得分:3)
for BigQuery Standard SQL
#standardSQL
SELECT
uniqueid,
MAX(IF(order_of_pages = 1, page_flag, NULL)) AS p1,
MAX(IF(order_of_pages = 2, page_flag, NULL)) AS p2,
MAX(IF(order_of_pages = 3, page_flag, NULL)) AS p3,
MAX(IF(order_of_pages = 4, page_flag, NULL)) AS p4,
MAX(IF(order_of_pages = 5, page_flag, NULL)) AS p5
FROM `mytable`
GROUP BY uniqueid
您可以使用问题中的以下虚拟数据进行/测试
#standardSQL
WITH `mytable` AS (
SELECT 'A' AS uniqueid, 'Collection' AS page_flag, 1 AS order_of_pages UNION ALL
SELECT 'A', 'Product', 2 UNION ALL
SELECT 'A', 'Product', 3 UNION ALL
SELECT 'A', 'Login', 4 UNION ALL
SELECT 'A', 'Delivery', 5 UNION ALL
SELECT 'B', 'Clearance', 1 UNION ALL
SELECT 'B', 'Search', 2 UNION ALL
SELECT 'B', 'Product', 3 UNION ALL
SELECT 'C', 'Search', 1 UNION ALL
SELECT 'C', 'Collection', 2 UNION ALL
SELECT 'C', 'Product', 3
)
SELECT
uniqueid,
MAX(IF(order_of_pages = 1, page_flag, NULL)) AS p1,
MAX(IF(order_of_pages = 2, page_flag, NULL)) AS p2,
MAX(IF(order_of_pages = 3, page_flag, NULL)) AS p3,
MAX(IF(order_of_pages = 4, page_flag, NULL)) AS p4,
MAX(IF(order_of_pages = 5, page_flag, NULL)) AS p5
FROM `mytable`
GROUP BY uniqueid
ORDER BY uniqueid
结果是
uniqueid p1 p2 p3 p4 p5
A Collection Product Product Login Delivery
B Clearance Search Product null null
C Search Collection Product null null
取决于您的需求,您也可以考虑以下方法(尽管不是透视)
#standardSQL
SELECT uniqueid,
STRING_AGG(page_flag, '>' ORDER BY order_of_pages) AS journey
FROM `mytable`
GROUP BY uniqueid
ORDER BY uniqueid
如果使用与上述相同的虚拟数据运行 - 结果为
uniqueid journey
A Collection>Product>Product>Login>Delivery
B Clearance>Search>Product
C Search>Collection>Product