HiveQL equivalent of !> in SQL

Date: 2016-12-21 13:56:49

Tags: sql hive hiveql

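I have been trying to extract the values from one table that do not exist in another table. However, because the join column contains NULLs, the not in, not exists and left join options do not seem to work.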

So, is there a way to apply a "not greater than" condition in HiveQL?

For reference, this is the not in query I ran; the not exists and left join versions are similar:

with date_prob as 
(
    select distinct visit 
    from t1
    where dt=20161124
    and dt1!=orig_ts
),

ev_data as
(
    select distinct visit 
    from t1
    where dt=20161124
    and visit is not null
    and orig_ts is not null
    and uid is not null
), 

fin_data as 
(
    select x.visit 
    from ev_data x
    where x.visit not in 
    (
      select distinct visit 
      from date_prob
      where visit is not null
    ) 
)

The query I ran for the left join:

with date_prob as 
(
    select distinct id
    from t1
    where dt1='2016-11-24'
    and dt1!=orig_ts
    and (datediff(dt1,orig_ts) not in ('1','-1'))
),

ev_data as
(
    select distinct id
    from t1
    where dt1='2016-11-24'
    and id is not null
)

select x.id

from ev_data x
left join date_prob y

where y.id is null
;

Sample data:

id        dt1           orig_ts
1     2016-11-24       2016-11-10
2     2016-11-24       2016-11-24 
3     2016-11-24       2010-01-01
4     2016-11-24       2017-01-01
5     2016-11-24       2016-11-24
6     2016-11-24       2016-11-25
7     2016-11-23       2016-11-23 

So, from this table I want to remove the ids where dt1 and orig_ts differ by more than one day. The query should therefore return only ids 2, 5 and 6.

1 answer:

Answer 0: (score: 0)

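If you want to extract the values from one table that do not exist in another table, you can use a left join and filter with where second_table_key is null. This works even if there are NULLs in the key: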

--this query will return records from table a that do not exist in b
select a.id
  from a left join b on a.id=b.id
 where b.id is null; --only not joined
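
As a hedged aside on why not in can appear to fail: under standard SQL three-valued logic, x not in (subquery) returns no rows as soon as the subquery produces a NULL, so either filter the NULLs out of the subquery or use not exists. A minimal sketch, assuming the same illustrative tables a and b as in the query above and a Hive version that supports subqueries in the where clause (0.13 or later, as far as I know):

```sql
-- not in: filter NULLs out of the subquery, otherwise no row qualifies
select a.id
  from a
 where a.id not in (select b.id from b where b.id is not null);

-- not exists: correlated form; NULLs in b.id simply never match
select a.id
  from a
 where not exists (select 1 from b where b.id = a.id);
```

The two forms differ for rows where a.id itself is NULL: when b is non-empty, not in drops those rows, while not exists and the left join keep them.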

I fixed your example (the left join was missing its on condition, added here as on x.id=y.id). It works:

drop table if exists t1;
create table t1 (id int,dt1 string,           orig_ts string );
insert overwrite table t1
select 1 id,    '2016-11-24' dt1,       '2016-11-10' orig_ts union all
select 2 id,    '2016-11-24' dt1,       '2016-11-24' orig_ts union all 
select 3 id,    '2016-11-24' dt1,       '2010-01-01' orig_ts union all
select 4 id,    '2016-11-24' dt1,       '2017-01-01' orig_ts union all
select 5 id,    '2016-11-24' dt1,       '2016-11-24' orig_ts union all
select 6 id,    '2016-11-24' dt1,       '2016-11-25' orig_ts union all
select 7 id,    '2016-11-23' dt1,       '2016-11-23' orig_ts;

with date_prob as 
(
    select distinct id
    from t1
    where dt1='2016-11-24'
    and dt1!=orig_ts
    and (datediff(dt1,orig_ts) not in ('1','-1'))
),

ev_data as
(
    select distinct id
    from t1
    where dt1='2016-11-24'
    and id is not null
)

select x.id
from ev_data x
left join date_prob y on x.id=y.id
where y.id is null
;

OK
2
5
6
Time taken: 14.166 seconds, Fetched: 3 row(s)
hive>

Works as expected.
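
On the !> in the title: in T-SQL that operator means "not greater than", which is the same comparison as <=, and <= is available in HiveQL. For the sample data, the anti-join can therefore probably be collapsed into a single filter. A sketch, assuming one row per id and no NULLs in orig_ts (neither is stated in the question):

```sql
-- keep ids for the given day whose two dates differ by at most one day
select distinct id
  from t1
 where dt1 = '2016-11-24'
   and id is not null
   and abs(datediff(dt1, orig_ts)) <= 1;
```

On the seven sample rows this should also return 2, 5 and 6. Unlike the anti-join, it drops rows whose orig_ts is NULL, and if an id had several rows it would keep the id as soon as any one of them passes the filter.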