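Calculating percentages across two tables with SQL (Impala)

Time: 2017-11-02 08:37:49

Tags: sql impala

I am working with Impala (Cloudera) and have two tables, customers and arrangements. The customers table contains the following columns: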

customercrs | customertype| 
------------+-------------+
 1000       | NP          |  
 100000     | NP          |   
 100001     | NP          |  
 100002     | GROUP       |  
 100023     | GROUP       |
 100024     | INDIRECT    |

The arrangements table:

customercrs | arrangementid| 
------------+--------------+
 1000       | 11000000361  |  
 100000     | 11000000370  |  
 100000     | 11000000434  |
 100000     | 11000000426  |
 100001     | 11000000418  | 
 100001     | 11000000400  |
 100001     | 11000000396  |
 100001     | 11000000388  |
 100002     | 11000000591  |  
 100002     | 11000000582  |
 100023     | 11000000574  |
 100024     | 11000000566  |
 100024     | 11000000558  |

I want to calculate the percentage of arrangements that belongs to each customer type. Something like this:

customertype | percentage  |
-------------+-------------+
 NP          | 62%         |
 GROUP       | 23%         |
 INDIRECT    | 15%         |

I tried the following SQL query, but it did not work. Any ideas?

select customertype, count(*)/(select count(*) from arrangements)
from customers as a, arrangements_sample as b
where a.customercrs = b.customercrs
group by a.customertype

Thanks!!!

3 answers:

Answer 0: (score: 3)

I would use a window function together with an explicit JOIN. That said, your solution looks fine (for DBMSs other than Impala).

select customertype, 
       (count(*) * 100) / sum(count(*)) over () percentage
from customers as a
join arrangements_sample as b on a.customercrs = b.customercrs
group by a.customertype
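
As a sanity check on the numbers, the same computation (join each arrangement to its customer type, count per type, divide by the overall count) can be sketched in plain Python using the sample rows from the question. The function name `percentage_by_type` and the two constants are made up for this illustration and are not part of any answer; rounding to whole percentages is an assumption based on the expected output shown in the question.

```python
from collections import Counter

# Sample data copied from the question: customercrs -> customertype
CUSTOMERS = {
    1000: "NP",
    100000: "NP",
    100001: "NP",
    100002: "GROUP",
    100023: "GROUP",
    100024: "INDIRECT",
}

# One entry per row of the arrangements table (only customercrs matters here)
ARRANGEMENT_CUSTOMERS = [
    1000,
    100000, 100000, 100000,
    100001, 100001, 100001, 100001,
    100002, 100002,
    100023,
    100024, 100024,
]


def percentage_by_type(customers, arrangement_customers):
    """Hypothetical helper: share of arrangements per customer type, in whole percent."""
    # Equivalent of the inner join plus GROUP BY customertype
    counts = Counter(
        customers[crs] for crs in arrangement_customers if crs in customers
    )
    # Equivalent of sum(count(*)) over (): the total across all groups
    total = sum(counts.values())
    return {ctype: round(100 * n / total) for ctype, n in counts.items()}


print(percentage_by_type(CUSTOMERS, ARRANGEMENT_CUSTOMERS))
```

With the sample data this gives 8, 3 and 2 arrangements out of 13 for NP, GROUP and INDIRECT, which matches the 62%, 23% and 15% the question asks for.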

Answer 1: (score: 2)

Try joining to a subselect. I used max as the aggregate function, but min or avg would work as well...

select customertype, count(*)/max(c.total)
from customers as a, arrangements_sample as b,
     (select count(*) as total from arrangements) as c
where a.customercrs = b.customercrs
group by a.customertype
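
The shape of this query can be tried out locally with a small Python script. Note the assumptions: it uses Python's built-in sqlite3 module as a stand-in, not Impala, so it only illustrates the cross join against a one-row total; it loads everything into a single table named arrangements (the answer mixes arrangements_sample and arrangements); and it multiplies by 100.0, which is an addition to the answer's query, because SQLite would otherwise perform integer division. The function name `run_cross_join_query` is invented for this sketch.

```python
import sqlite3


def run_cross_join_query():
    """Hypothetical helper: run the cross-join variant on the question's sample data."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, NOT Impala
    conn.executescript(
        """
        create table customers (customercrs integer, customertype text);
        create table arrangements (customercrs integer, arrangementid integer);

        insert into customers values
            (1000, 'NP'), (100000, 'NP'), (100001, 'NP'),
            (100002, 'GROUP'), (100023, 'GROUP'), (100024, 'INDIRECT');

        insert into arrangements values
            (1000, 11000000361),
            (100000, 11000000370), (100000, 11000000434), (100000, 11000000426),
            (100001, 11000000418), (100001, 11000000400),
            (100001, 11000000396), (100001, 11000000388),
            (100002, 11000000591), (100002, 11000000582),
            (100023, 11000000574),
            (100024, 11000000566), (100024, 11000000558);
        """
    )
    rows = conn.execute(
        """
        select a.customertype,
               count(*) * 100.0 / max(c.total) as percentage
        from customers as a, arrangements as b,
             (select count(*) as total from arrangements) as c
        where a.customercrs = b.customercrs
        group by a.customertype
        order by percentage desc
        """
    ).fetchall()
    conn.close()
    # Round to whole percentages, as in the question's expected output
    return [(ctype, round(pct)) for ctype, pct in rows]


for ctype, pct in run_cross_join_query():
    print(f"{ctype}: {pct}%")
```

Because the subselect returns exactly one row, joining it in adds the same total to every row, which is why max, min or avg all yield the same value.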

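Answer 2: (score: 0)

You need each customer type's participation count alongside the total count. So try the following query.

select main.customertype,
       cast((cast(main.participation as decimal(10,2)) / main.total) * 100 as decimal(10,2)) as participation
from (select b.customertype,
             COUNT(1) as participation,
             (select COUNT(1) from arrangements) as total
      from arrangements a
      inner join customers b on b.customercrs = a.customercrs
      group by b.customertype) as main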