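I have data containing user_ids, visitStartTime, and product prices for products that users viewed. I'm trying to get the average and maximum price for each user visit, but my query does not compute over the partition (user + visitStartTime); it computes over the user_id partition only.
This is my query: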
select distinct fullVisitorId ,visitStartTime,
avg(pr) over (partition by visitStartTime,fullVisitorId) as avgPrice,
max(pr) over (partition by fullVisitorId,visitStartTime) as maxPrice
from dataset
This is what I get:
+-----+-------------------+--------------+----------+----------+
| Row | fullVisitorId     | visitStartTi | avgPrice | maxPrice |
+-----+-------------------+--------------+----------+----------+
| 1   | 64217461724617261 | 1538478049   | 484.5    | 969.0    |
| 2   | 64217461724617261 | 1538424725   | 484.5    | 969.0    |
+-----+-------------------+--------------+----------+----------+
What am I missing in the query?
Sample data:
+---------------+----------------+--------------+
| FullVisitorId | VisitStartTime | ProductPrice |
+---------------+----------------+--------------+
| 123           | 72631241       | 100          |
| 123           | 72631241       | 250          |
| 123           | 72631241       | 10           |
| 123           | 73827882       | 70           |
| 123           | 73827882       | 90           |
+---------------+----------------+--------------+
Desired result:
+-----+---------------+--------------+----------+----------+
| Row | fullVisitorId | visitStartTi | avgPrice | maxPrice |
+-----+---------------+--------------+----------+----------+
| 1   | 123           | 72631241     | 120.0    | 250.0    |
| 2   | 123           | 73827882     | 80.0     | 90.0     |
+-----+---------------+--------------+----------+----------+
Answer 0 (score: 2)
You don't need "partition by" in this case; a plain GROUP BY on both columns is enough.
Try this:
select FullVisitorId, VisitStartTime, avg(ProductPrice) as avgPrice, max(ProductPrice) as maxPrice
from sample
group by FullVisitorId, VisitStartTime;
(The query is very standard SQL, so I think you can use it in BigQuery.)
Here is the output using PostgreSQL: DB<>FIDDLE
Update
The same works in BigQuery Standard SQL:
#standardSQL
SELECT
FullVisitorId,
VisitStartTime,
AVG(ProductPrice) as avgPrice,
MAX(ProductPrice) as maxPrice
FROM `project.dataset.table`
GROUP BY FullVisitorId, VisitStartTime
If you want to test it:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 123 FullVisitorId, 72631241 VisitStartTime, 100 ProductPrice
UNION ALL SELECT 123, 72631241, 250
UNION ALL SELECT 123, 72631241, 10
UNION ALL SELECT 123, 73827882, 70
UNION ALL SELECT 123, 73827882, 90
)
SELECT
FullVisitorId,
VisitStartTime,
AVG(ProductPrice) as avgPrice,
MAX(ProductPrice) as maxPrice
FROM `project.dataset.table`
GROUP BY FullVisitorId, VisitStartTime
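If you don't have BigQuery at hand, the same GROUP BY can be checked locally. The following is only a rough sketch, assuming Python's built-in sqlite3 module as a stand-in engine (SQLite, not BigQuery); the table name `sample` and the helper `per_visit_stats` are made up for illustration, and the rows are the sample data from the question.

```python
import sqlite3

# Sample rows from the question: (FullVisitorId, VisitStartTime, ProductPrice)
ROWS = [
    (123, 72631241, 100),
    (123, 72631241, 250),
    (123, 72631241, 10),
    (123, 73827882, 70),
    (123, 73827882, 90),
]


def per_visit_stats(rows):
    """Return (visitor, visit start, avg price, max price), one row per visit."""
    # Assumption: an in-memory SQLite database stands in for BigQuery here.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE sample ("
            "FullVisitorId INTEGER, VisitStartTime INTEGER, ProductPrice REAL)"
        )
        conn.executemany("INSERT INTO sample VALUES (?, ?, ?)", rows)
        # Grouping by BOTH columns yields one output row per (user, visit) pair.
        query = """
            SELECT FullVisitorId,
                   VisitStartTime,
                   AVG(ProductPrice) AS avgPrice,
                   MAX(ProductPrice) AS maxPrice
            FROM sample
            GROUP BY FullVisitorId, VisitStartTime
            ORDER BY VisitStartTime
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


result = per_visit_stats(ROWS)
for row in result:
    print(row)
```

On the sample data this should give one row per visit, matching the desired result in the question (120.0 / 250.0 for the first visit and 80.0 / 90.0 for the second).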