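I have RDBMS tables and queries that work perfectly. I have offloaded the data from the RDBMS into Hive tables. To run the existing queries on Hive, we first need to make them Hive-compatible.
Take a subquery in the SELECT list as an example. It is syntactically valid and runs fine on an RDBMS, but it does not work on Hive. According to the Hive manual, Hive supports subqueries only in the FROM and WHERE clauses.
Example 1: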
SELECT t1.one
,(SELECT t2.two
FROM TEST2 t2
WHERE t1.one=t2.two) t21
,(SELECT t3.three
FROM TEST3 t3
WHERE t1.one=t3.three) t31
FROM TEST1 t1 ;
Example 2:
SELECT a.*
, CASE
WHEN EXISTS
(SELECT 1
FROM tblOrder O
INNER JOIN tblProduct P
ON O.Product_id = P.Product_id
WHERE O.customer_id = C.customer_id
AND P.Product_Type IN (2, 5, 6, 9)
)
THEN 1
ELSE 0
END AS My_Custom_Indicator
FROM tblCustomer C
INNER JOIN tblOtherStuff S
ON C.CustomerID = S.CustomerID ;
Example 3:
Select component_location_id, component_type_code,
( select clv.LOCATION_VALUE
from stg_dev.component_location_values clv
where identifier_code = 'AXLE'
and component_location_id = cl.component_location_id ) as AXLE,
( select clv.LOCATION_VALUE
from stg_dev.component_location_values clv
where identifier_code = 'SIDE'
and component_location_id = cl.component_location_id ) as SIDE
from stg_dev.component_locations cl ;
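I would like to know what alternatives exist to subqueries in the SELECT list, so that these queries become Hive-compatible. With those, I should be able to convert my existing queries to Hive syntax.
Any help and guidance is much appreciated!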
Answer 0: (score: 2)
The queries you provided can be converted into simple queries that use LEFT JOINs.
SELECT
t1.one, t2.two AS t21, t3.three AS t31
FROM
TEST1 t1
LEFT JOIN TEST2 t2
ON t1.one = t2.two
LEFT JOIN TEST3 t3
ON t1.one = t3.three
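Because the subqueries have no additional restrictions, the joins return the same data. (For each row in TEST1, each subquery should return exactly one row or no row.)
Note that your original query cannot handle 1..n joins. In most DBMSs, a subquery in the SELECT list must return a result set with a single column and at most one row.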
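The same LEFT JOIN idea can probably be carried over to Examples 2 and 3. The following is only a sketch under stated assumptions, not a tested conversion: the aliases axle, side and flagged are made up for illustration, Example 3 assumes at most one AXLE row and one SIDE row per component_location_id (otherwise the join multiplies rows where the original would raise an error), and Example 2 selects C.* because the original a.* refers to an alias that is never defined. Column names are kept exactly as written in the question, including the customer_id / CustomerID mix.

```sql
-- Example 3: each scalar subquery becomes a LEFT JOIN to a derived table
-- in the FROM clause, which Hive does support.
SELECT cl.component_location_id,
       cl.component_type_code,
       axle.LOCATION_VALUE AS AXLE,
       side.LOCATION_VALUE AS SIDE
FROM stg_dev.component_locations cl
LEFT JOIN (SELECT component_location_id, LOCATION_VALUE
           FROM stg_dev.component_location_values
           WHERE identifier_code = 'AXLE') axle
  ON axle.component_location_id = cl.component_location_id
LEFT JOIN (SELECT component_location_id, LOCATION_VALUE
           FROM stg_dev.component_location_values
           WHERE identifier_code = 'SIDE') side
  ON side.component_location_id = cl.component_location_id;

-- Example 2: CASE WHEN EXISTS (...) becomes a LEFT JOIN to a DISTINCT list
-- of qualifying customers, followed by a NULL check.
-- DISTINCT keeps the join from duplicating customer rows.
SELECT C.*,
       CASE WHEN flagged.customer_id IS NOT NULL THEN 1 ELSE 0 END
           AS My_Custom_Indicator
FROM tblCustomer C
INNER JOIN tblOtherStuff S
  ON C.CustomerID = S.CustomerID
LEFT JOIN (SELECT DISTINCT O.customer_id
           FROM tblOrder O
           INNER JOIN tblProduct P
             ON O.Product_id = P.Product_id
           WHERE P.Product_Type IN (2, 5, 6, 9)) flagged
  ON flagged.customer_id = C.customer_id;
```

In both cases the LEFT JOIN keeps every row of the outer table, so rows without a match get NULL (Example 3) or 0 (Example 2), just as the original subqueries would produce.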
Answer 1: (score: 0)
Based on the Hive manual:
SELECT t1.one, t2.two, t3.three
FROM TEST1 t1,TEST2 t2, TEST3 t3
WHERE t1.one=t2.two AND t1.one=t3.three;
Answer 2: (score: 0)
SELECT t1.one, t2.two, t3.three
FROM TEST1 t1
INNER JOIN TEST2 t2 ON t1.one = t2.two
INNER JOIN TEST3 t3 ON t1.one = t3.three
WHERE t1.one = t2.two AND t1.one = t3.three;
Answer 3: (score: 0)
SELECT t1.one,t2.two as t21,t3.three as t31 FROM TEST1 t1
INNER JOIN TEST2 t2 ON t1.one=t2.two
INNER JOIN TEST3 t3 ON t1.one=t3.three