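I am trying to use a subquery on one table to get an aggregate result for each row of another table in Hive. I know Hive does not support subqueries in the SELECT clause, so I tried a subquery in the FROM clause instead, but it seems Hive does not support correlated subqueries there either.
Here is the scenario: table A contains account transaction data with date columns (d1 and d2), a currency column, and other columns. For each account, I want the sum of the exchange-rate values between dates d1 and d2 from table B, which holds the currency exchange rate for every day of the year. I am trying something like this: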
SELECT
account_no, currn, balance,
trans_date as d2, last_trans_date as d1, exchng_rt
FROM
acc AS A,
(SELECT sum(rate) exchng_rt
FROM currency
WHERE curr_type = A.currn
AND banking_date BETWEEN A.d1 AND A.d2) AS B
Here is an example. Table A holds account transactions and dates, like:
account  balance  trans_date  last_trans_date  currency
abc      100      20-12-2016  20-11-2016       USD
abc      200      25-12-2016  20-12-2016       USD
def      500      15-11-2015  10-11-2015       AUD
def      600      20-11-2015  15-11-2015       AUD
And table B looks like:
curr_type  rate  banking_date
USD        50.9  01-01-2016
USD        50.2  02-01-2016
USD        50.5  03-01-2016
AUD        50.9  01-01-2016
AUD        50.2  02-01-2016
AUD        50.5  03-01-2016
and so on...
So table B contains the daily exchange rate for each currency.
Answer 0 (score: 0)
I think you can do what you want with JOIN and GROUP BY:
-- d1 and d2 are the select-list aliases of last_trans_date and trans_date
SELECT a.account_no, a.currn, a.balance, a.trans_date AS d2, a.last_trans_date AS d1,
SUM(c.rate) AS exchng_rt
FROM acc a LEFT JOIN
currency c
ON c.curr_type = a.currn AND c.banking_date BETWEEN a.last_trans_date AND a.trans_date
GROUP BY a.account_no, a.currn, a.balance, a.trans_date, a.last_trans_date;
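As far as I know, Hive releases before 2.2.0 accept only equality conditions in the ON clause, so the BETWEEN predicate in the join above may be rejected there. A minimal sketch of a workaround keeps the join on equality only and moves the date range into a conditional aggregate. It assumes d1 and d2 correspond to last_trans_date and trans_date, as the select-list aliases suggest, and that the date columns are stored in a type that compares chronologically.

```sql
-- Sketch only: equality-only join, date range applied inside the aggregate.
-- Assumes d1 = last_trans_date and d2 = trans_date (taken from the aliases).
SELECT a.account_no,
       a.currn,
       a.balance,
       a.trans_date      AS d2,
       a.last_trans_date AS d1,
       -- rows outside the range contribute NULL, which SUM ignores
       SUM(CASE
             WHEN c.banking_date BETWEEN a.last_trans_date AND a.trans_date
             THEN c.rate
           END) AS exchng_rt
FROM acc a
LEFT JOIN currency c
  ON c.curr_type = a.currn
GROUP BY a.account_no, a.currn, a.balance, a.trans_date, a.last_trans_date;
```

Because the range test sits inside the aggregate rather than in a WHERE clause, transactions with no matching rates are kept, with exchng_rt as NULL. The trade-off is that each transaction is first joined to every rate row of its currency before the rows are collapsed.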
Answer 1 (score: 0)
You should apply the filter after joining the two tables, like this:
SELECT A.account_no,
A.currn,
A.balance,
A.trans_date as d2,
A.last_trans_date as d1,
B.exchng_rt
FROM acc as A
JOIN (SELECT sum(rate) as exchng_rt,
curr_type,
banking_date
FROM currency group by curr_type,
banking_date ) as B
ON A.currn = B.curr_type
WHERE B.banking_date BETWEEN A.last_trans_date AND A.trans_date
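As far as I can tell, the query above returns one row per banking day inside the range, because the subquery groups by banking_date. If a single total per transaction is wanted, one possible variant filters after the join and then aggregates. This is a sketch under the assumption that d1 and d2 correspond to last_trans_date and trans_date, not a tested implementation.

```sql
-- Sketch only: join on equality, filter the date range in WHERE, then sum.
-- Assumes d1 = last_trans_date and d2 = trans_date (taken from the aliases).
SELECT A.account_no,
       A.currn,
       A.balance,
       A.trans_date      AS d2,
       A.last_trans_date AS d1,
       SUM(B.rate)       AS exchng_rt
FROM acc A
JOIN currency B
  ON A.currn = B.curr_type
WHERE B.banking_date BETWEEN A.last_trans_date AND A.trans_date
GROUP BY A.account_no, A.currn, A.balance, A.trans_date, A.last_trans_date;
```

Note that with an inner join and a WHERE filter, transactions that have no rate rows in their date range drop out of the result entirely.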