Finding a percentage value using Hive

Time: 2016-12-01 03:08:59

Tags: sql hive hiveql

I have a few tables:

Table_1:
+------------+--------------+
| Student_ID | Student_Name |
+------------+--------------+
|        000 | Jack         |
|        001 | Ron          |
|        002 | Nick         |
+------------+--------------+

Table_2:
+-----+-------+-------+
| ID  | Total | Score |
+-----+-------+-------+
| 000 |   100 |    80 |
| 001 |   100 |    80 |
| 002 |   100 |    80 |
+-----+-------+-------+

Table_3:
+-----+-------+-------+
| ID  | Total | Score |
+-----+-------+-------+
| 000 |   100 |    60 |
| 001 |   100 |    80 |
| 002 |   100 |    70 |
+-----+-------+-------+

Expected_Output:

ID  percent
000 70
001 80
002 75

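I previously created a Hive table. Now I would like to write a single HiveQL query that produces the expected output from the three tables above. What I want the query to do is:

  1. Join the tables on ID using a left outer join.
  2. For each ID, find the sum of "Total" and the sum of "Score".
  3. Divide the sum of "Score" by the sum of "Total" to get the percentage.

I came up with this: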

    INSERT OVERWRITE TABLE expected_output 
    SELECT t1.Student_ID AS ID, (100*t4.SUM1/t4.SUM2) AS percent
    FROM Table_1 t1
    LEFT OUTER JOIN(
    SELECT (ISNULL(Total,0) + ISNULL(Total,0)) AS ‘SUM2’, (ISNULL(Score,0) + ISNULL(Score,0)) AS ‘SUM1’
    FROM t4
    )ON (t1.Student_ID=t2.ID) JOIN Table_3 t3 ON (t3.ID=t2.ID);
    

And now I am stuck; I am not sure how to get to the result. Any ideas?

1 answer:

Answer 0: (score: 0)

This is a simple join. Assuming each ID has exactly one row in each of Table_2 and Table_3, you can do:

    -- Table_2 has no Student_ID column; its key column is ID.
    SELECT t2.ID AS ID, 100.0*(t2.score+t3.score)/(t2.total+t3.total) AS percent
    FROM Table_2 t2
    JOIN Table_3 t3 ON t3.ID=t2.ID
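
If the one-row-per-ID assumption does not hold, or if every student in Table_1 should appear in the result as the question's left outer join suggests, one possible variant is to stack the two score tables with UNION ALL and aggregate per student. This is only a sketch under the assumption that the tables and columns are exactly as shown in the question; the alias `s` is introduced here for illustration.

```sql
-- Sketch only: assumes Table_1, Table_2 and Table_3 exist as shown in the question.
SELECT t1.Student_ID AS ID,
       100.0 * SUM(s.Score) / SUM(s.Total) AS percent
FROM Table_1 t1
LEFT OUTER JOIN (
    -- Stack both score tables so an ID may contribute any number of rows.
    SELECT ID, Total, Score FROM Table_2
    UNION ALL
    SELECT ID, Total, Score FROM Table_3
) s ON s.ID = t1.Student_ID
-- One output row per student; a student with no score rows gets a NULL percent.
GROUP BY t1.Student_ID;
```

With the sample data, student 000 has scores 80 + 60 = 140 out of 100 + 100 = 200, which works out to 70 percent, matching the expected output; 001 and 002 work out to 80 and 75 in the same way. To write the result into the table from the question, the SELECT can be prefixed with `INSERT OVERWRITE TABLE expected_output`.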