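Write a query to print the total number of unique hackers who made at least one submission each day (starting on the first day of the contest), and find the hacker_id and name of the hacker who made the maximum number of submissions each day. If more than one hacker has the maximum number of submissions, print the lowest hacker_id. The query should print this information for each day of the contest, sorted by date.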
My query does not give me unique hacker_ids.
Here is the sample data:
Hackers table:
hacker_id name
15758 Rose
20703 Angela
36396 Frank
38289 Patrick
44065 Lisa
53473 Kimberly
62529 Bonnie
79722 Michael
Submissions table:
Submission_date submission_id hacker_id score
3/1/2016 8494 20703 0
3/1/2016 22403 53473 15
3/1/2016 23965 79722 60
3/1/2016 30173 36396 70
3/2/2016 34928 20703 0
3/2/2016 38740 15758 60
3/2/2016 42769 79722 25
3/2/2016 44364 79722 60
3/3/2016 45440 20703 0
3/3/2016 49050 36396 70
3/3/2016 50273 79722 5
3/4/2016 50344 20703 0
3/4/2016 51360 44065 90
3/4/2016 54404 53473 65
3/4/2016 61533 79722 45
3/5/2016 72852 20703 0
3/5/2016 74546 38289 0
3/5/2016 76487 62529 0
3/5/2016 82439 36396 10
3/5/2016 90006 36396 40
3/6/2016 90404 20703 0
For the above data, the expected result is:
2016-03-01 4 20703 Angela
2016-03-02 2 79722 Michael
2016-03-03 2 20703 Angela
2016-03-04 2 20703 Angela
2016-03-05 1 36396 Frank
2016-03-06 1 20703 Angela
How can I get unique hacker_ids in the above result?
Answer 0 (score: 0)
You can use two levels of aggregation:
select s.submission_date, count(*) as num_hackers, sum(s.cnt) as num_hacks,
       max(case when s.seqnum = 1 then h.hacker_id end) as hacker_id,
       max(case when s.seqnum = 1 then h.name end) as name
from (select s.submission_date, s.hacker_id, count(*) as cnt,
             -- ties on the submission count go to the lowest hacker_id
             row_number() over (partition by s.submission_date order by count(*) desc, s.hacker_id) as seqnum
      from submissions s
      group by s.submission_date, s.hacker_id
     ) s join
     hackers h
     on h.hacker_id = s.hacker_id
group by s.submission_date
order by s.submission_date;
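Note that the subquery aggregates by date and hacker_id, so there is one row per hacker_id for each date. The count(*) in the outer query counts these rows, which gives the number of hackers. I also included a count of the number of hacks (submissions).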
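To make the two levels concrete, here is a minimal sketch that loads the sample submissions from the question into an in-memory SQLite database and runs only the two aggregation steps. The Python harness, the `per_hacker` alias, the `daily_totals` helper, and the ISO date strings are assumptions made for illustration; they are not part of the original answer.

```python
import sqlite3

# Sample rows from the question: (submission_date, submission_id, hacker_id, score).
# Dates are written as ISO strings (an assumption) so they sort chronologically.
SUBMISSIONS = [
    ("2016-03-01", 8494, 20703, 0), ("2016-03-01", 22403, 53473, 15),
    ("2016-03-01", 23965, 79722, 60), ("2016-03-01", 30173, 36396, 70),
    ("2016-03-02", 34928, 20703, 0), ("2016-03-02", 38740, 15758, 60),
    ("2016-03-02", 42769, 79722, 25), ("2016-03-02", 44364, 79722, 60),
    ("2016-03-03", 45440, 20703, 0), ("2016-03-03", 49050, 36396, 70),
    ("2016-03-03", 50273, 79722, 5),
    ("2016-03-04", 50344, 20703, 0), ("2016-03-04", 51360, 44065, 90),
    ("2016-03-04", 54404, 53473, 65), ("2016-03-04", 61533, 79722, 45),
    ("2016-03-05", 72852, 20703, 0), ("2016-03-05", 74546, 38289, 0),
    ("2016-03-05", 76487, 62529, 0), ("2016-03-05", 82439, 36396, 10),
    ("2016-03-05", 90006, 36396, 40),
    ("2016-03-06", 90404, 20703, 0),
]

# Inner level: one row per (date, hacker) with that hacker's submission count.
# Outer level: count those rows (hackers) and sum their counts (submissions).
TWO_LEVEL_SQL = """
select submission_date, count(*) as num_hackers, sum(cnt) as num_hacks
from (select submission_date, hacker_id, count(*) as cnt
      from submissions
      group by submission_date, hacker_id) per_hacker
group by submission_date
order by submission_date
"""


def daily_totals():
    """Return (date, num_hackers, num_hacks) rows for the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table submissions ("
        "submission_date text, submission_id integer, "
        "hacker_id integer, score integer)"
    )
    conn.executemany("insert into submissions values (?, ?, ?, ?)", SUBMISSIONS)
    rows = conn.execute(TWO_LEVEL_SQL).fetchall()
    conn.close()
    return rows


for row in daily_totals():
    print(row)
```

On the sample data this reports 3 hackers and 4 submissions for 2016-03-02. The expected output in the question lists 2 for that day, because it counts only hackers who have submitted on every day since the contest started, so the per-day count alone does not reproduce that column.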
Edit:
I realized that you can use additional analytic functions in the subquery, which simplifies the logic slightly:
select s.submission_date, s.num_hackers, s.num_hacks,
       h.hacker_id, h.name
from (select s.submission_date, s.hacker_id, count(*) as cnt,
             sum(count(*)) over (partition by s.submission_date) as num_hacks,
             count(*) over (partition by s.submission_date) as num_hackers,
             -- ties on the submission count go to the lowest hacker_id
             row_number() over (partition by s.submission_date order by count(*) desc, s.hacker_id) as seqnum
      from submissions s
      group by s.submission_date, s.hacker_id
     ) s join
     hackers h
     on h.hacker_id = s.hacker_id
where s.seqnum = 1
order by s.submission_date;
Answer 1 (score: 0)
select big_1.submission_date, big_1.hkr_cnt, big_2.hacker_id, h.name
from
     -- big_1: per day, hackers who have submitted on every day so far
     (select submission_date, count(distinct hacker_id) as hkr_cnt
      from
           (select s.*
                 , dense_rank() over(order by submission_date) as date_rank
                 --, row_number() over(order by submission_date) as rn_date_rank
                 , dense_rank() over(partition by hacker_id order by submission_date) as hacker_rank
                 --, row_number() over(partition by hacker_id order by submission_date) as rn_hacker_rank
            from submissions s) a
      where a.date_rank = a.hacker_rank
      group by submission_date) big_1
join
     -- big_2: per day, hackers ranked by submission count, lowest hacker_id first on ties
     (select submission_date, hacker_id,
             rank() over(partition by submission_date order by sub_cnt desc, hacker_id) as max_rank
      from (select submission_date, hacker_id, count(*) as sub_cnt
            from submissions
            group by submission_date, hacker_id) b) big_2
  on big_1.submission_date = big_2.submission_date and big_2.max_rank = 1
join hackers h on h.hacker_id = big_2.hacker_id
order by 1;
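This answer comes without an explanation, so here is one reading of it, with a sketch that checks it against the sample data. `date_rank` numbers the contest days, and `hacker_rank` numbers the days on which a given hacker submitted. The two are equal only while the hacker has not skipped a day, which is what restricts the count to hackers who submitted every day so far. The Python/SQLite harness, the `daily_report` helper, and the ISO date strings below are assumptions for illustration only.

```python
import sqlite3

HACKERS = [
    (15758, "Rose"), (20703, "Angela"), (36396, "Frank"), (38289, "Patrick"),
    (44065, "Lisa"), (53473, "Kimberly"), (62529, "Bonnie"), (79722, "Michael"),
]

# (submission_date, submission_id, hacker_id, score), dates as ISO strings.
SUBMISSIONS = [
    ("2016-03-01", 8494, 20703, 0), ("2016-03-01", 22403, 53473, 15),
    ("2016-03-01", 23965, 79722, 60), ("2016-03-01", 30173, 36396, 70),
    ("2016-03-02", 34928, 20703, 0), ("2016-03-02", 38740, 15758, 60),
    ("2016-03-02", 42769, 79722, 25), ("2016-03-02", 44364, 79722, 60),
    ("2016-03-03", 45440, 20703, 0), ("2016-03-03", 49050, 36396, 70),
    ("2016-03-03", 50273, 79722, 5),
    ("2016-03-04", 50344, 20703, 0), ("2016-03-04", 51360, 44065, 90),
    ("2016-03-04", 54404, 53473, 65), ("2016-03-04", 61533, 79722, 45),
    ("2016-03-05", 72852, 20703, 0), ("2016-03-05", 74546, 38289, 0),
    ("2016-03-05", 76487, 62529, 0), ("2016-03-05", 82439, 36396, 10),
    ("2016-03-05", 90006, 36396, 40),
    ("2016-03-06", 90404, 20703, 0),
]

# The query from this answer, with the commented-out alternatives omitted.
REPORT_SQL = """
select big_1.submission_date, big_1.hkr_cnt, big_2.hacker_id, h.name
from (select submission_date, count(distinct hacker_id) as hkr_cnt
      from (select s.*,
                   dense_rank() over(order by submission_date) as date_rank,
                   dense_rank() over(partition by hacker_id
                                     order by submission_date) as hacker_rank
            from submissions s) a
      where a.date_rank = a.hacker_rank
      group by submission_date) big_1
join (select submission_date, hacker_id,
             rank() over(partition by submission_date
                         order by sub_cnt desc, hacker_id) as max_rank
      from (select submission_date, hacker_id, count(*) as sub_cnt
            from submissions
            group by submission_date, hacker_id) b) big_2
  on big_1.submission_date = big_2.submission_date and big_2.max_rank = 1
join hackers h on h.hacker_id = big_2.hacker_id
order by 1
"""


def daily_report():
    """Return (date, streak_hackers, top_hacker_id, top_hacker_name) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table hackers (hacker_id integer, name text)")
    conn.execute(
        "create table submissions ("
        "submission_date text, submission_id integer, "
        "hacker_id integer, score integer)"
    )
    conn.executemany("insert into hackers values (?, ?)", HACKERS)
    conn.executemany("insert into submissions values (?, ?, ?, ?)", SUBMISSIONS)
    rows = conn.execute(REPORT_SQL).fetchall()
    conn.close()
    return rows


for row in daily_report():
    print(row)
```

On the sample data the rows match the expected result in the question. Note that the rank comparison appears to rely on every contest day having at least one submission, since `dense_rank()` numbers only the dates that occur in the table. Window functions also need SQLite 3.25 or later.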