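I created a temporary table and inserted fields into it (fields that hold multiple values representing a single attribute). Now I want to write logic that compares these attributes and creates a new field summarizing the ref_type and post_campaign fields.
I am trying to create a new column (x) based on the following logic/conditions: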
> > if post_campaign starts with KNC-% and ref_type = 3, then create a new column (x) with field PS
> > if post_campaign is null and ref_type = 3, then create a new column (x) with field OS
> > if post_campaign starts with SNP-%, then create a new column (x) with field Pso
> > if post_campaign starts with SNO-% and ref_type = 9, then create a new column (x) with field OPso
> > if ref_type=6 then create a new column (x) with field Dir
I have already written the code that creates the temporary table, but I need help working the logic above into the SQL query:
create table temp.Register
Select date(date_time) as date, post_evar10, count(page_event) as Pageviews, concat(post_visid_high, post_visid_low) as UniqueVisitors, ref_type as Source_Traffic, paid_search, post_campaign
from a_hits
where ref_type in (3,6,7,9)
and ((post_evar10 like '%event-summary%') or (post_evar10 like 'registration-') or (post_evar10 like '%InformationPage%') or (post_evar10 like '%GuestRegInfo%') or (post_evar10 like '%GuestReg%') or (post_evar10 like '%MyRegistration%'))
and page_event like '0'
and exclude_hit like '0'
and hit_source not in (5,7,8,9)
group by Date, post_evar10, UniqueVisitors, Source_Traffic, paid_search;
The expected result is a new column in which I would see:
Date      Post_evar10    Pageviews  UniqueVisitors  Source_Traffic  post_campaign  Column X
2/2/2019  event-summary  540        200             3               KNC-%          PS
2/2/2019  event-summary  300        150             3               Null           OS
2/3/2019  event-summary  230        100             9               SNO-%          Opso
2/4/2019  event-summary  290        150             9               SNP-%          Pso
2/5/2019  event-summary  100        300             6               Misc           Dir
Answer 0 (score: 0)
Assuming you are using the newest version of Spark SQL, you can use a CASE...WHEN expression. The Spark SQL documentation has more detail on CASE...WHEN.
create table temp.Register as
select
    date(date_time) as the_date,
    post_evar10,
    count(page_event) as Pageviews,
    concat(post_visid_high, post_visid_low) as UniqueVisitors,
    ref_type as Source_Traffic,
    paid_search,
    post_campaign,
    -- Branches are checked top to bottom; the first matching WHEN wins.
    CASE
        WHEN post_campaign LIKE 'KNC-%' AND ref_type = 3 THEN 'PS'
        WHEN post_campaign IS NULL AND ref_type = 3 THEN 'OS'
        WHEN post_campaign LIKE 'SNP-%' THEN 'Pso'
        WHEN post_campaign LIKE 'SNO-%' AND ref_type = 9 THEN 'Opso'
        WHEN ref_type = 6 THEN 'Dir'
        ELSE NULL  -- e.g. ref_type = 7, which none of the rules cover
    END AS Column_X
from
    a_hits
where
    ref_type in (3,6,7,9)
    and (
        post_evar10 like '%event-summary%'
        -- no wildcard here, so this matches only the exact string 'registration-'
        or post_evar10 like 'registration-'
        or post_evar10 like '%InformationPage%'
        or post_evar10 like '%GuestRegInfo%'
        or post_evar10 like '%GuestReg%'
        or post_evar10 like '%MyRegistration%'
    )
    and page_event like '0'
    and exclude_hit like '0'
    and hit_source not in (5,7,8,9)
group by
    the_date,
    post_evar10,
    UniqueVisitors,
    Source_Traffic,
    paid_search,
    post_campaign  -- selected and read by the CASE, so it has to be grouped as well
;
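If you want to sanity-check the branch order before running this against a_hits, you can try the CASE expression on a few made-up rows. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Spark SQL, and the table name hits, the helper classify, and the sample campaign values are all hypothetical. The labels follow the spelling used in the question. One difference to keep in mind is that SQLite's LIKE is case-insensitive for ASCII by default, whereas Spark SQL's LIKE is case-sensitive.

```python
import sqlite3

# Hypothetical sample rows modelled on the expected output in the question:
# (ref_type, post_campaign)
rows = [
    (3, "KNC-brand"),
    (3, None),
    (9, "SNO-social"),
    (9, "SNP-promo"),
    (6, "Misc"),
    (7, "Misc"),  # not covered by any rule, so it falls through to ELSE NULL
]

# Same CASE expression as in the answer, run against a throwaway table.
CASE_SQL = """
SELECT ref_type,
       post_campaign,
       CASE
           WHEN post_campaign LIKE 'KNC-%' AND ref_type = 3 THEN 'PS'
           WHEN post_campaign IS NULL AND ref_type = 3 THEN 'OS'
           WHEN post_campaign LIKE 'SNP-%' THEN 'Pso'
           WHEN post_campaign LIKE 'SNO-%' AND ref_type = 9 THEN 'Opso'
           WHEN ref_type = 6 THEN 'Dir'
           ELSE NULL
       END AS Column_X
FROM hits
ORDER BY rowid
"""


def classify(sample_rows):
    """Load the sample rows into an in-memory table and return Column_X per row."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE hits (ref_type INTEGER, post_campaign TEXT)")
        conn.executemany("INSERT INTO hits VALUES (?, ?)", sample_rows)
        return [row[2] for row in conn.execute(CASE_SQL)]
    finally:
        conn.close()


labels = classify(rows)
print(labels)
```

Because the WHEN branches are evaluated in order, a row with post_campaign like 'SNP-%' gets Pso regardless of its ref_type, while a row with ref_type = 6 only gets Dir if none of the earlier branches matched.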