I have a table with the following structure:
id  codes
1   WrappedArray(A, B, C)
2   WrappedArray(A)
3   WrappedArray(B, D)
I want to return the rows whose codes array contains any of a given list of codes, much like a SQL IN clause.
If I try
with my_table as (
select 1 as id, array('A','B','C') as codes
union
select 2 as id, array('A') as codes
union
select 3 as id, array('B', 'D') as codes
)
select *
from my_table t
lateral view explode(t.codes) as code
where code in ( 'B', 'D')
I get ID 3 twice, because it contains both the B and D codes.
I can do something like
with my_table as (
select 1 as id, array('A','B','C') as codes
union
select 2 as id, array('A') as codes
union
select 3 as id, array('B', 'D') as codes
)
select id from my_table
where id in (
select id
from my_table sub
lateral view posexplode(sub.codes) as code_pos, code
where code in ( 'B', 'D') )
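but that requires me to reference my_table twice. In reality my table is large, and I would rather avoid what is essentially a self-join, since I already have the data needed to evaluate the condition on the main table.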
I would like to do something like this:
with my_table as (
select 1 as id, array('A','B','C') as codes
union
select 2 as id, array('A') as codes
union
select 3 as id, array('B', 'D') as codes
)
select id
from my_table t
where exists ( select 1 from (select 0) lateral view explode(t.codes) as code where code in ( 'B', 'D') )
but that throws
Expressions referencing the outer query are not supported outside of WHERE/HAVING clauses
array_contains looks close to what I need, but it only takes a single value, not a list of values.
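To illustrate that limitation, here is a minimal sketch (assuming Spark SQL, and using union all in the sample data purely for simplicity) of what the single-value form looks like when chained once per code. It reads my_table only once and returns each id at most once, but the predicate has to be spelled out for every value in the list:

with my_table as (
  select 1 as id, array('A','B','C') as codes
  union all
  select 2 as id, array('A') as codes
  union all
  select 3 as id, array('B', 'D') as codes
)
select id
from my_table
-- one array_contains call per code being searched for
where array_contains(codes, 'B')
   or array_contains(codes, 'D')

This is manageable for two codes, but it does not scale to a long or dynamic list, which is why I am looking for something closer to IN.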
In general, my data is more complex than this example (the array elements are named_struct values, not simple strings), but I assume I can adapt any solution to my case.
Can this be done in pure SQL without a self-join?