How can I efficiently implement GROUP BY combinations in HiveQL?

Time: 2015-05-21 03:44:14

Tags: sql group-by hive combinations hiveql

For example, I have the following table named Roll:

ID    Name    Address
---------------------
01    Lily    NewYork
02    Lucy    NewYork
03    Lucy    NewYork

I want COUNT(1) for every GROUP BY combination of Name and Address, that is, the combined results of:

SELECT Name, Address, COUNT(1) FROM Roll GROUP BY Name, Address
+
SELECT Name, COUNT(1) FROM Roll GROUP BY Name
+
SELECT Address, COUNT(1) FROM Roll GROUP BY Address
+
SELECT COUNT(1) FROM Roll

The following SQL does what I want, where '##' stands for 'GROUP BY NONE' (the column is not grouped on):

SELECT Name, Address, COUNT(1) FROM (
SELECT Name, Address FROM Roll
UNION ALL
SELECT '##', Address FROM Roll
UNION ALL
SELECT Name, '##' FROM Roll
UNION ALL
SELECT '##', '##' FROM Roll) t
GROUP by Name, Address;

Result:

+------+---------+----------+
| Name | Address | COUNT(1) |
+------+---------+----------+
| ##   | ##      |        3 |
| ##   | NewYork |        3 |
| Lily | ##      |        1 |
| Lily | NewYork |        1 |
| Lucy | ##      |        2 |
| Lucy | NewYork |        2 |
+------+---------+----------+

Is there a more efficient way to do this than the approach above?

Thanks.

2 Answers:

Answer 0: (score: 1)

Are you looking for subtotals? If so, you can get them with GROUPING SETS and CUBE/ROLLUP. Check this wiki about grouping.
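As a rough sketch of the GROUPING SETS form for the Roll table in the question (assuming a Hive version that supports GROUPING SETS; the alias cnt is only illustrative), the four separate GROUP BY queries could be collapsed into one statement:

```sql
-- Sketch only: one query over Roll that produces all four groupings.
-- Assumes GROUPING SETS is supported by the Hive version in use.
SELECT Name, Address, COUNT(1) AS cnt
FROM Roll
GROUP BY Name, Address
GROUPING SETS (
    (Name, Address),  -- count per Name and Address
    (Name),           -- count per Name; Address is returned as NULL
    (Address),        -- count per Address; Name is returned as NULL
    ()                -- grand total; both columns are returned as NULL
);
```

Listing the sets explicitly is handy when only some of the combinations are needed; when every combination is wanted, WITH CUBE is the shorter spelling of the same thing.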

Answer 1: (score: 0)

SELECT coalesce(Name,"##"), coalesce(Address,"##"), count(1)
FROM ROLL
GROUP BY Name, Address with cube;

I think this is what you are looking for :)
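One caveat with coalesce: if Name or Address can genuinely be NULL in Roll, those rows become indistinguishable from the subtotal rows. A possible variant, assuming the grouping() function is available (as far as I know this needs a fairly recent Hive, 2.3.0 or later) and using the made-up aliases name_or_all, address_or_all and cnt, might look like this:

```sql
-- Sketch only: mark subtotal rows explicitly instead of relying on NULL.
-- Assumption: grouping(col) returns 1 when col is aggregated away in the
-- current row and 0 when col is part of the grouping set.
SELECT
    IF(grouping(Name) = 1, '##', Name)       AS name_or_all,
    IF(grouping(Address) = 1, '##', Address) AS address_or_all,
    COUNT(1)                                 AS cnt
FROM Roll
GROUP BY Name, Address WITH CUBE;
```

With the sample data in the question there are no NULLs, so both versions should give the same six rows as the UNION ALL query.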