I am trying to run the following query on Hive:
SELECT COUNT(*)
FROM mydata
WHERE store NOT IN (SELECT store_out
FROM ( SELECT a.store as store_out, COUNT(*) AS CNT
FROM mydata a
GROUP BY store) TB1
WHERE CNT > AVG(CNT) + STDDEV(CNT) AND CNT < AVG(CNT) - STDDEV(CNT))
But I get the following error:
Error while compiling statement: FAILED: SemanticException [Error 10249]: Line 3:6 Unsupported SubQuery Expression 'store': Correlating expression cannot contain unqualified column references.
How can I write this query another way?
Thanks!
Answer 0 (score: 1)
I don't have the actual data, so this is hard to verify, but I would do something like this:
SELECT COUNT(*)
FROM (
    SELECT a.*
         , z.flg
    FROM mydata a
    LEFT OUTER JOIN (
        -- stores that satisfy the filter, tagged with flg = 1
        SELECT store_out, flg
        FROM (
            SELECT store_out
                 , cnt
                 , 1 AS flg
                 , AVG(cnt) OVER () AS avg_cnt
                 , STDDEV_SAMP(cnt) OVER () AS std_cnt
            FROM (
                SELECT store AS store_out
                     , COUNT(*) AS cnt
                FROM mydata
                GROUP BY store ) x
        ) y
        -- condition copied from the question; with AND it can never be true, so OR may be what is intended
        WHERE cnt > avg_cnt + std_cnt AND cnt < avg_cnt - std_cnt ) z
    ON a.store = z.store_out ) final
WHERE flg IS NULL
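Basically, left join the subquery and have it create a dummy column, flg. That column only has a value for stores returned by the subquery, so for every row of the main table whose store has no match, flg is NULL, and those are the stores you want. Hope this helps.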
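The same left-join-and-check-for-NULL idea can also be sketched with common table expressions, which keeps each step separate. This is an untested sketch under a few assumptions: the CTE names store_counts, stats and flagged are made up for illustration, the two comparisons are combined with OR (the AND from the question cannot match any store), and the Hive version in use supports WITH clauses and CROSS JOIN. Some Hive configurations restrict Cartesian products in strict mode, in which case the window-function form above may be the easier route.

```sql
-- Hypothetical CTE names; table and column names are taken from the question.
WITH store_counts AS (
    -- one row per store with its row count
    SELECT store AS store_out
         , COUNT(*) AS cnt
    FROM mydata
    GROUP BY store
),
stats AS (
    -- a single row holding the mean and sample standard deviation of the counts
    SELECT AVG(cnt) AS avg_cnt
         , STDDEV_SAMP(cnt) AS std_cnt
    FROM store_counts
),
flagged AS (
    -- stores more than one standard deviation from the mean
    -- (OR is an assumption about the intent, not the original condition)
    SELECT c.store_out
    FROM store_counts c
    CROSS JOIN stats s
    WHERE c.cnt > s.avg_cnt + s.std_cnt
       OR c.cnt < s.avg_cnt - s.std_cnt
)
SELECT COUNT(*)
FROM mydata a
LEFT OUTER JOIN flagged z
    ON a.store = z.store_out
-- rows with no match in flagged are the ones NOT IN would have kept
WHERE z.store_out IS NULL
```

Every column reference here is qualified with a table alias, which is what the original error message was complaining about, and the aggregates are computed in their own step instead of inside a WHERE clause.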