How do I add conditional logic to a SQL query?

Date: 2019-09-04 19:23:46

Tags: sql apache-spark apache-spark-sql

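I created a temporary table and inserted fields into it (fields that hold multiple values representing one attribute). Now I want to add logic that compares these attributes and creates a new field summarizing the ref_type and post_campaign fields.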

I am trying to create a new column (x) based on the following logic/conditions:

> > if post_campaign matches KNC-% and ref_type = 3, then set the new column (x) to PS
> > if post_campaign is null and ref_type = 3, then set the new column (x) to OS
> > if post_campaign matches SNP-%, then set the new column (x) to Pso
> > if post_campaign matches SNO-% and ref_type = 9, then set the new column (x) to OPso
> > if ref_type = 6, then set the new column (x) to Dir

I have already written the code for the temporary table, but I need help with how to add the logic above to the SQL query:

create table temp.Register as
Select date(date_time) as date, post_evar10, count(page_event) as Pageviews, concat(post_visid_high, post_visid_low) as UniqueVisitors, ref_type as Source_Traffic, paid_search, post_campaign
from a_hits
where ref_type in (3,6,7,9)
and ((post_evar10 like '%event-summary%') or (post_evar10 like 'registration-') or (post_evar10 like '%InformationPage%') or (post_evar10 like '%GuestRegInfo%') or (post_evar10 like '%GuestReg%') or (post_evar10 like '%MyRegistration%'))
and page_event like '0'
and exclude_hit like '0'
and hit_source not in (5,7,8,9)
group by Date, post_evar10, UniqueVisitors, Source_Traffic, paid_search, post_campaign;

The expected result is a new column, like this:

Date      Post_evar10    Pageviews  UniqueVisitors  Source_Traffic  post_campaign  Column X
2/2/2019  event-summary  540        200             3               KNC-%          PS
2/2/2019  event-summary  300        150             3               Null           OS
2/3/2019  event-summary  230        100             9               SNO-%          Opso
2/4/2019  event-summary  290        150             9               SNP-%          Pso
2/5/2019  event-summary  100        300             6               Misc           Dir

1 answer:

Answer 0 (score: 0)

Assuming you are using a recent version of Spark SQL, you can use a CASE...WHEN expression.

For details, see the documentation for CASE...WHEN.

create table temp.Register as
Select 
    date(date_time) as the_date, 
    post_evar10, 
    count(page_event) as Pageviews, 
    concat(post_visid_high, post_visid_low) as UniqueVisitors, 
    ref_type as Source_Traffic, 
    paid_search, 
    post_campaign,
    -- WHEN branches are evaluated top to bottom; the first match wins
    CASE
        WHEN post_campaign LIKE 'KNC-%' AND ref_type = 3 THEN 'PS'
        WHEN post_campaign IS NULL AND ref_type = 3 THEN 'OS'
        WHEN post_campaign LIKE 'SNP-%' THEN 'Pso'
        WHEN post_campaign LIKE 'SNO-%' AND ref_type = 9 THEN 'Opso'
        WHEN ref_type = 6 THEN 'Dir'
    ELSE NULL END AS Column_X
from 
    a_hits
where 
    ref_type in (3,6,7,9)
    -- note: 'registration-' has no % wildcard, so it only matches that exact string
    and ((post_evar10 like '%event-summary%') or (post_evar10 like 'registration-') or (post_evar10 like '%InformationPage%') or (post_evar10 like '%GuestRegInfo%') or (post_evar10 like '%GuestReg%') or (post_evar10 like '%MyRegistration%'))
    and page_event like '0'
    and exclude_hit like '0'
    and hit_source not in (5,7,8,9)
group by 
    the_date, 
    post_evar10, 
    UniqueVisitors, 
    Source_Traffic, 
    paid_search,
    -- post_campaign is selected (and used in the CASE) without aggregation, so it must be grouped too
    post_campaign
;
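
To sanity-check the CASE expression before running it against a_hits, it can help to evaluate it on a few hand-written rows. The sketch below assumes Spark SQL's inline-table syntax (FROM VALUES ... AS alias(columns)); the sample campaign codes such as 'KNC-123' are made up for illustration, and the labels follow the spelling in the question's expected output.

```sql
-- Hypothetical sample rows (not from a_hits); the label each row should get
-- is noted in the comment beside it.
SELECT
    post_campaign,
    ref_type,
    CASE
        WHEN post_campaign LIKE 'KNC-%' AND ref_type = 3 THEN 'PS'
        WHEN post_campaign IS NULL AND ref_type = 3 THEN 'OS'
        WHEN post_campaign LIKE 'SNP-%' THEN 'Pso'
        WHEN post_campaign LIKE 'SNO-%' AND ref_type = 9 THEN 'Opso'
        WHEN ref_type = 6 THEN 'Dir'
        ELSE NULL
    END AS Column_X
FROM VALUES
    ('KNC-123', 3),              -- PS
    (CAST(NULL AS STRING), 3),   -- OS
    ('SNP-456', 9),              -- Pso
    ('SNO-789', 9),              -- Opso
    ('Misc', 6),                 -- Dir
    ('SNP-456', 6),              -- Pso: the SNP-% branch comes before ref_type = 6
    ('Misc', 7)                  -- NULL: no branch matches
AS t(post_campaign, ref_type);
```

Because the branches are tested in order and the first match wins, a row with an SNP- campaign and ref_type = 6 is labeled Pso rather than Dir. If Dir should take priority, move the ref_type = 6 branch above the SNP-% branch.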