How do I count the number of times a value appears as the MAX?

Time: 2019-06-18 04:36:51

Tags: sql google-bigquery

I have a summary table that looks like this:

user_id     service     no_of_trx
1           A           56
1           C           43
1           B           22
2           C           10
2           A           3
3           B           45
3           C           7
4           A           77
4           B           63

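It summarizes all the different types of services used by each user_id, ordered by the number of transactions made with each service. How can I extract the number of times each service appears as a user's top service? Expected result: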

service     occurrence_as_max
A           2
B           1
C           1

Service A is the top service for users 1 and 4, while services B and C are the top services for users 3 and 2, respectively.

What I have so far:

WITH a as

(SELECT user_id, service, count(service) no_of_trx
FROM transactions
GROUP BY user_id, service
ORDER BY no_of_trx desc),

b as
(SELECT distinct(user_id) user, max(no_of_trx) occurrence_as_max
FROM a
GROUP BY user_id
ORDER by user)


SELECT distinct(service), b.occurrence_as_max
FROM b
LEFT JOIN a ON a.user_id=b.user.
ORDER by b.occurrence_as_max desc;

But this obviously doesn't work.

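3 answers:

Answer 0: (score: 2)

The script below should work. It is written in standard SQL syntax. You may need to make some adjustments for BigQuery, but the logic should hold: find each user's maximum no_of_trx, join it back to the table to pick out the matching service, then count rows per service.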

SELECT A.service, COUNT(*)
FROM your_table A
INNER JOIN 
(
    SELECT user_id, MAX(no_of_trx) no_of_trx
    FROM your_table
    GROUP BY user_id
)B ON A.user_id = B.user_id 
AND A.no_of_trx = B.no_of_trx
GROUP BY A.service
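
For BigQuery, the same join logic can be sketched on top of the question's own aggregation. This is only a sketch under assumptions: it assumes the raw table is reachable as `transactions` (in practice it would likely need a `project.dataset` qualifier) with `user_id` and `service` columns, as in the question; the CTE names `a` and `b` and the column name `max_trx` are chosen here for illustration.

```sql
#standardSQL
WITH a AS (
  -- Per-user, per-service transaction counts (the summary table from the question).
  SELECT user_id, service, COUNT(service) AS no_of_trx
  FROM transactions  -- assumed table name; qualify with project.dataset as needed
  GROUP BY user_id, service
),
b AS (
  -- Highest transaction count reached by each user across all services.
  SELECT user_id, MAX(no_of_trx) AS max_trx
  FROM a
  GROUP BY user_id
)
-- Keep only the rows where a service hits the user's maximum, then count per service.
SELECT a.service, COUNT(*) AS occurrence_as_max
FROM a
INNER JOIN b
  ON a.user_id = b.user_id
 AND a.no_of_trx = b.max_trx
GROUP BY a.service
ORDER BY occurrence_as_max DESC;
```

Note that with this join, a user whose top two services are tied contributes one count to each of them.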

Answer 1: (score: 0)

Below is a version for BigQuery Standard SQL (without any self-join):

#standardSQL
SELECT service, COUNT(1) AS occurrence_as_max
FROM (
  SELECT STRING_AGG(service ORDER BY no_of_trx DESC LIMIT 1) service
  FROM `project.dataset.table`
  GROUP BY user_id
)
GROUP BY service

You can test it with the sample data from the question, as in the example below:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 user_id, 'A' service, 56 no_of_trx UNION ALL
  SELECT 1, 'C', 43 UNION ALL
  SELECT 1, 'B', 22 UNION ALL
  SELECT 2, 'C', 10 UNION ALL
  SELECT 2, 'A', 3 UNION ALL
  SELECT 3, 'B', 45 UNION ALL
  SELECT 3, 'C', 7 UNION ALL
  SELECT 4, 'A', 77 UNION ALL
  SELECT 4, 'B', 63 
)
SELECT service, COUNT(1) AS occurrence_as_max
FROM (
  SELECT STRING_AGG(service ORDER BY no_of_trx DESC LIMIT 1) service
  FROM `project.dataset.table`
  GROUP BY user_id
)
GROUP BY service
-- ORDER BY service 

with this result:

Row service occurrence_as_max    
1   A       2    
2   B       1    
3   C       1    

Answer 2: (score: 0)

I would just use regular window functions for this:

select service, countif(seqnum = 1)
from (select t.*,
             row_number() over (partition by user_id order by no_of_trx desc) as seqnum
      from t
     ) t
group by service;

If you want to count ties, use rank() instead of row_number().
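
The tie-counting variant could look roughly like the sketch below. The inline data is hypothetical: user 1 is taken from the question, while user 5 is an invented user whose services A and B are tied at 20 transactions. The alias names `rnk` and `ranked` are also just illustrative.

```sql
#standardSQL
WITH t AS (
  -- Hypothetical sample data; user 5 is invented to demonstrate a tie.
  SELECT 1 AS user_id, 'A' AS service, 56 AS no_of_trx UNION ALL
  SELECT 1, 'C', 43 UNION ALL
  SELECT 5, 'A', 20 UNION ALL
  SELECT 5, 'B', 20
)
SELECT service, COUNTIF(rnk = 1) AS occurrence_as_max
FROM (
  SELECT t.*,
         -- RANK() gives every tied row the same rank, so both of user 5's services get 1.
         RANK() OVER (PARTITION BY user_id ORDER BY no_of_trx DESC) AS rnk
  FROM t
) AS ranked
GROUP BY service
ORDER BY service;
```

On this data, A is counted twice (users 1 and 5), B once (user 5), and C appears with a count of 0, because COUNTIF keeps services that are never a top service. With row_number(), only one of user 5's tied services would be counted, and which one is not deterministic unless a tiebreaker is added to the ORDER BY.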