Combining two counts from a unioned table into a total in Hive SQL

Time: 2018-09-13 00:07:46

Tags: hive hiveql

I am having trouble summing the counts that come out of a unioned table in Hive SQL.

SELECT pulocation AS locID,count(pulocation) AS puCount FROM task1 
  WHERE  distance > 0.5 AND distance < 1  
  GROUP BY pulocation 
UNION
SELECT dolocation,count(dolocation) AS doCount FROM task1 
  WHERE  distance > 0.5 AND distance < 1
  GROUP BY dolocation

This gives me the following result:

_u2.locid   _u2.pucount
1           18
1           24  
3           3
3           4
4           4693

I tried to turn this result into a new table with the counts combined per location, but without success.

SELECT _u2.locid, SUM(_u2.pucount)
FROM (
SELECT pulocation AS locID,count(pulocation) AS puCount FROM task1 
  WHERE  distance > 0.5 AND distance < 1  
  GROUP BY pulocation 
UNION
SELECT dolocation,count(dolocation) AS doCount FROM task1 
  WHERE  distance > 0.5 AND distance < 1
  GROUP BY dolocation)
GROUP BY u2.locid

I tried both '_u2.' and 'u2.' as the column prefix, but either way I get this error:

org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 14:0 Failed to recognize predicate 'GROUP'. Failed rule: 'identifier' in subquery source

What I basically want is this table:

_u2.locid   _u2.pucount
1           42
3           7       
4           4693

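3 Answers:

Answer 0: (score: 0)

Try the following. Hive requires a subquery in the FROM clause to have an alias (t here); the missing alias is what the parser was looking for when it failed at GROUP: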

SELECT t.locid, SUM(t.pucount)
FROM ((SELECT pulocation AS locID, count(pulocation) AS puCount
       FROM task1 
       WHERE  distance > 0.5 AND distance < 1  
       GROUP BY pulocation
      )
      UNION ALL
      (SELECT dolocation, count(dolocation) AS doCount
       FROM task1 
       WHERE  distance > 0.5 AND distance < 1
       GROUP BY dolocation
      )
     ) t
GROUP BY t.locid
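
The choice of UNION ALL in the query above matters. A plain UNION removes duplicate rows, so if a location happens to have the same pickup and drop-off count, one of the two rows is dropped before the SUM and the total comes out too low. The following is a minimal sketch of that effect, assuming Python's built-in sqlite3 module as a stand-in for Hive so the example is self-contained; the tables pickups and dropoffs and their values are hypothetical and only play the role of the two grouped subqueries.

```python
import sqlite3

# Hypothetical per-location counts standing in for the two GROUP BY subqueries.
# Location 1 deliberately has the same pickup and drop-off count (20 and 20).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pickups (locid INTEGER, cnt INTEGER)")
conn.execute("CREATE TABLE dropoffs (locid INTEGER, cnt INTEGER)")
conn.executemany("INSERT INTO pickups VALUES (?, ?)", [(1, 20), (3, 3)])
conn.executemany("INSERT INTO dropoffs VALUES (?, ?)", [(1, 20), (3, 4)])


def total_per_location(set_operator):
    """Combine both tables with the given set operator, then sum per location."""
    query = f"""
        SELECT t.locid, SUM(t.cnt)
        FROM (SELECT locid, cnt FROM pickups
              {set_operator}
              SELECT locid, cnt FROM dropoffs) AS t
        GROUP BY t.locid
        ORDER BY t.locid
    """
    return conn.execute(query).fetchall()


# UNION collapses the two identical (1, 20) rows into one before summing.
print(total_per_location("UNION"))      # [(1, 20), (3, 7)]
# UNION ALL keeps both rows, so location 1 sums to 40 as intended.
print(total_per_location("UNION ALL"))  # [(1, 40), (3, 7)]
```

Location 3 is unaffected because its two rows differ (3 and 4), which is why the problem can go unnoticed on sample data like the one in the question.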

Answer 1: (score: 0)

With the help of the previous answer:

SELECT t.locid, SUM(t.pucount) AS count
FROM (
  SELECT pulocation AS locID,COUNT(pulocation) as pucount
  FROM task1 
    WHERE  distance > 0.5 AND distance < 1  
    GROUP BY pulocation 
  UNION ALL
  SELECT dolocation,count(dolocation) as doCount
  FROM task1 
    WHERE  distance > 0.5 AND distance < 1
    GROUP BY dolocation) AS t
GROUP BY t.locid 
ORDER BY count DESC
LIMIT 10

Answer 2: (score: 0)

This is what you need:

SELECT locID, sum(totCount) AS totCount
FROM (
  SELECT pulocation AS locID, count(pulocation) AS totCount
  FROM task1
  WHERE distance > 0.5 AND distance < 1
  GROUP BY pulocation
  UNION ALL
  SELECT dolocation AS locID, count(dolocation) AS totCount
  FROM task1
  WHERE distance > 0.5 AND distance < 1
  GROUP BY dolocation
) t1
GROUP BY locID