Getting the average of a count in Hive

Time: 2015-05-17 19:46:05

Tags: hive hiveql

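I am trying to get the average of the result of a count query. I read in the Hive documentation that this is not possible directly, so I tried the following: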

CREATE VIEW clicks_pais_totalView AS
SELECT p.pais as pais, count(1) as numeroClicks
FROM clicks_data_mat p
WHERE p.pais is not NULL
GROUP BY p.pais;

CREATE TABLE clicks_pais_total AS SELECT * FROM clicks_pais_totalView;
ALTER TABLE clicks_pais_total CHANGE numeroClicks numeroClicksInt INT;

SELECT pais as pais, avg(DISTINCT numeroclicksint)
FROM clicks_pais_total
GROUP BY pais;

The avg result is always the same as the value returned by the first count query. What is going wrong?

1 answer:

Answer 0 (score: 0)

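Your query is wrong. If you count and then average that count using the same GROUP BY, the result will always be the same, because each group contains only one row to average.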

Table A

name value
A     1
A     3
B     7

If you count by name:

select name, count(1) from tableA
group by name;

A 2
B 1

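If you then average that count by name, it stays the same: each name has a single count value, and the average of a single value is the value itself. So: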

SELECT pais, avg(numeroClicks)
FROM clicks_data_mat
WHERE pais is not NULL
GROUP BY pais;
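
If the goal is instead a single average of the per-country click counts, one possible approach is to compute the counts in a subquery and average them without grouping in the outer query. This is only a sketch under that assumption; the alias t and the column name avgClicks are introduced here for illustration and do not appear in the original question.

```sql
-- Sketch: average the per-country counts across all countries.
-- Assumes the table clicks_data_mat and the column pais from the question;
-- the alias t and the name avgClicks are hypothetical.
SELECT avg(t.numeroClicks) AS avgClicks
FROM (
  -- Inner query: one row per country with its click count.
  SELECT p.pais AS pais, count(1) AS numeroClicks
  FROM clicks_data_mat p
  WHERE p.pais IS NOT NULL
  GROUP BY p.pais
) t;
```

Applied to Table A above, the inner query would return the counts 2 and 1, and the outer avg would collapse them into one row with the value 1.5, rather than repeating each count.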