Count elements and find the maximum

Time: 2019-06-13 13:23:28

Tags: sql hive

I have a table like this:

+-----+-----+-----+
| uid | aid | tid |
+-----+-----+-----+
| 1   | 6   | 7   |
+-----+-----+-----+
| 2   | 6   | 7   |
+-----+-----+-----+
| 3   | 5   | 7   |
+-----+-----+-----+
| 4   | 5   | 7   |
+-----+-----+-----+
| 5   | 5   | 7   |
+-----+-----+-----+

For each tid, I want to find the aid that has the most elements. For example, I know that tid 7 has aid 6 twice, and so on:

+-----+-----+-------+
| tid | aid | count |
+-----+-----+-------+
| 7   | 6   | 2     |
+-----+-----+-------+
| 7   | 5   | 3     |
+-----+-----+-------+

The final result I expect is 7 5 3, because I want the maximum count.

I have already reached the desired result with two queries:

CREATE TABLE temp AS
SELECT tid, aid, count(aid) as c
FROM startingtable
GROUP BY tid, aid
ORDER BY tid, aid;

Then:

CREATE TABLE result AS
select a.tid, a.aid, a.c
from temp a
inner join
(SELECT tid, max(c) as m
FROM temp
GROUP BY tid) b
on a.tid = b.tid and a.c = b.m
order by tid;

I need this to work in a single query. How would you do it?

Thank you for your time.

5 answers:

Answer 0: (score: 1)

You can try using a self-join with a subquery:

CREATE TABLE temp AS
SELECT t1.*
FROM (
    SELECT tid, aid, count(aid) AS cnt
    FROM startingtable
    GROUP BY tid, aid
) t1
JOIN (
    SELECT tid, MAX(cnt) AS maxcnt
    FROM (
        SELECT tid, aid, count(aid) AS cnt
        FROM startingtable
        GROUP BY tid, aid
    ) t2
    GROUP BY tid
) t2
ON t1.tid = t2.tid AND t1.cnt = t2.maxcnt;

Alternatively, you can try using a CTE with the Row_number window function to insert the data into the table.
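The CTE plus Row_number idea mentioned here could look roughly like the sketch below. It is not the answerer's original code: it runs the query against an in-memory SQLite database as a stand-in for Hive (an assumption, and it needs SQLite 3.25 or later for window functions), and the CTE names `counts` and `ranked` are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE startingtable (uid INTEGER, aid INTEGER, tid INTEGER)")
conn.executemany(
    "INSERT INTO startingtable VALUES (?, ?, ?)",
    [(1, 6, 7), (2, 6, 7), (3, 5, 7), (4, 5, 7), (5, 5, 7)],  # data from the question
)

query = """
WITH counts AS (          -- hypothetical CTE name: one row per (tid, aid)
    SELECT tid, aid, COUNT(aid) AS cnt
    FROM startingtable
    GROUP BY tid, aid
),
ranked AS (               -- hypothetical CTE name: number rows per tid, largest count first
    SELECT tid, aid, cnt,
           ROW_NUMBER() OVER (PARTITION BY tid ORDER BY cnt DESC) AS rn
    FROM counts
)
SELECT tid, aid, cnt
FROM ranked
WHERE rn = 1
ORDER BY tid
"""

rows = conn.execute(query).fetchall()
print(rows)  # [(7, 5, 3)]
```

In Hive, the same SELECT could presumably be prefixed with `CREATE TABLE result AS` to store the output in a table, as the question does with its two-step version.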

Answer 1: (score: 0)

Use the rank() analytic function:

select tid, aid, c
from
(
select tid, aid, c,
       rank() over(partition by tid order by c desc) rnk --max(c) per tid ranked 1
from
(
SELECT tid, 
       aid, 
       count(aid) as c
FROM startingtable
GROUP BY tid, aid
)s
)s where rnk=1;

Answer 2: (score: 0)

You can try the following, which seems a bit simpler than the other answers, though I may be missing something.

SELECT * FROM ( 
    SELECT tid, aid, count(*) 
    FROM test  
    GROUP BY tid, aid 
    ORDER BY count(*) DESC) 
WHERE rownum = 1;

Answer 3: (score: 0)

Just try:

SELECT tid, aid, COUNT(*) AS elCount FROM startingtable
GROUP BY tid, aid

Answer 4: (score: 0)

The simplest way to express the logic is to use window functions:

SELECT tid, aid, cnt
FROM (SELECT tid, aid, COUNT(*) as cnt,
             ROW_NUMBER() OVER (PARTITION BY tid ORDER BY COUNT(*) DESC) as seqnum   -- or maybe RANK() 
      FROM startingtable t
      GROUP BY tid, aid
     ) t
WHERE seqnum = 1;

The logic needs only one level of subquery/CTE.
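Whether to pick ROW_NUMBER() or RANK() only matters when two aid values tie for the largest count within a tid. The sketch below illustrates the difference on hypothetical data (not the data from the question) where aid 5 and aid 6 both occur twice. It again uses an in-memory SQLite database in place of Hive, and computes the counts in an inner subquery to keep the example portable.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE startingtable (uid INTEGER, aid INTEGER, tid INTEGER)")
# Hypothetical data: for tid 7, aid 5 and aid 6 both occur twice (a tie).
conn.executemany(
    "INSERT INTO startingtable VALUES (?, ?, ?)",
    [(1, 6, 7), (2, 6, 7), (3, 5, 7), (4, 5, 7)],
)

# Same query shape for both functions; only the window function name changes.
template = """
SELECT tid, aid, cnt
FROM (SELECT tid, aid, cnt,
             {func}() OVER (PARTITION BY tid ORDER BY cnt DESC) AS seqnum
      FROM (SELECT tid, aid, COUNT(*) AS cnt
            FROM startingtable
            GROUP BY tid, aid) c
     ) t
WHERE seqnum = 1
ORDER BY aid
"""

row_number_rows = conn.execute(template.format(func="ROW_NUMBER")).fetchall()
rank_rows = conn.execute(template.format(func="RANK")).fetchall()

# ROW_NUMBER keeps exactly one row per tid; which tied aid survives is arbitrary.
print(len(row_number_rows))  # 1
# RANK gives every tied row rank 1, so both aids are returned.
print(rank_rows)  # [(7, 5, 2), (7, 6, 2)]
```

So ROW_NUMBER() fits when exactly one row per tid is required, while RANK() fits when all tied maxima should be reported.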