SQL: finding IDs with multi-row criteria and creating a unique % formula

Time: 2015-07-26 02:01:24

Tags: sql teradata

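So I have the following questions:

The table 'visits' is a list of all visits with four columns: Visit_ID, Visitor_ID, Timestamp and Page_Name.

The table 'first_visitors' is a list of visitors and their first visit, with three columns: Visitor_ID, First_Visit_Date and Channel.

A1. Average number of visits, and visitors, per page over the last seven days.

A2. Visitor_ID and Channel for visitors that have visited all three of 'home page', 'product page' and 'confirmation page' at least once in the last seven days.

A3. Percent of SEM visitors that visit the 'confirmation page' within thirty days of their first visit.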

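I am unsure about casting the timestamp to a date and using date and date-7 to get a one-week window. Is that the right approach? (A1)

Getting the visitor_ids that visited all three pages is also tricky. I tried a HAVING clause, but I am not sure it is correct. (A2)

Finally, computing a percentage by dividing one aggregate column by another is difficult, and I am not sure this is the right way to do it. (A3)

My code is below. Any advice is much appreciated.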

--A1.   Average number of visits, and visitors per page over the last seven days
select
page_name,
count(visit_ID) as average_visits, 
count(DISTINCT visitor_ID) as average_visitors
from visits 
where cast(timestamp as date) between date and date-7 
group by page_name;

--A2.   Visitor_id and Channel for visitors that have visited all three of ‘home page’, ‘product page’, and ‘confirmation page’ at least once in the last seven days
select
a.visitor_id,
b.channel
from visits a
join first_visitors b on a.visitor_id = b.visitor_id
where cast(a.timestamp as date) between date and date-7 
and a.page_name in ('home page','product page','confirmation page') 
group by a.visitor_id
having count(distinct a.page_name) >= 3;

--A3.   Percent of SEM visitors that visit the ‘confirmation page’ within thirty days of their first visit.
select 
count(*) as visited_confirmation_page from
(select distinct a.visitor_id
    from first_visitors a
    join visits b on b.visitor_id = a.visitor_id
    where channel = 'SEM'
    and b.page_name in 'confirmation_page'
    and cast(b.timestamp as date) between cast(a.first_visit_timestamp as date) and cast(a.first_visit_timestamp as date)+30) 
count(*) as all_SEM_visits
(select distinct a.visitor_id
    from first_visitors a
    where channel = 'SEM')
((visited_confirmation_page / all_SEM_visits) * 100.00) as %_of_SEM_confirmations;

1 answer:

Answer 0 (score: 0)

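A1 is almost fine; it needs two small syntax changes. Instead of DATE, better use Standard SQL's CURRENT_DATE, and put the earlier date first, because BETWEEN expects the lower bound before the upper bound:

where cast(timestamp as date) between CURRENT_DATE-7 and CURRENT_DATE 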
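
The order of the BETWEEN bounds is easy to get wrong, so here is a minimal sketch of its effect. It uses Python's built-in sqlite3 module as a stand-in for Teradata, with dates stored as ISO strings; the table contents, the `visit_date` column and the fixed "today" of 2015-07-26 are assumptions made up for the illustration. `x BETWEEN a AND b` means `x >= a AND x <= b`, so reversed bounds match nothing.

```python
import sqlite3

def count_rows(lower: str, upper: str) -> int:
    """Count visits whose date satisfies BETWEEN lower AND upper."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical miniature of the visits table; dates kept as ISO strings.
    conn.execute("CREATE TABLE visits (visit_id INTEGER, visit_date TEXT)")
    conn.executemany(
        "INSERT INTO visits VALUES (?, ?)",
        [(1, "2015-07-20"), (2, "2015-07-25"), (3, "2015-07-01")],
    )
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM visits WHERE visit_date BETWEEN ? AND ?",
        (lower, upper),
    ).fetchone()
    conn.close()
    return n

today, week_ago = "2015-07-26", "2015-07-19"
print(count_rows(today, week_ago))  # bounds reversed: prints 0
print(count_rows(week_ago, today))  # lower bound first: prints 2
```

Teradata's date arithmetic (CURRENT_DATE-7) differs from SQLite's, but the BETWEEN semantics shown here are the same.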

The logic in A2 is correct, but you need to add b.channel to the GROUP BY to avoid a syntax error. You could also try aggregating before joining:

select
a.visitor_id,
b.channel
from first_visitors b
join
 (
   select visitor_id
   from visits 
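   where cast(timestamp as date) between current_date - 7 and current_date -- lower bound first
   and page_name in ('home page','product page','confirmation page') 
   group by visitor_id
   having count(distinct page_name) >= 3 -- no "a." prefix: that alias only exists outside the derived table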
 ) as a
on a.visitor_id = b.visitor_id
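
To see why COUNT(DISTINCT page_name) rather than a plain COUNT is what makes this work, here is a small runnable sketch, again using Python's sqlite3 as a stand-in for Teradata. The sample rows and the 'Email' channel are invented for the illustration, and the date filter is left out to keep it short. Visitor 20 has three matching rows but only two distinct pages, so only visitor 10 qualifies.

```python
import sqlite3

def visitors_with_all_pages() -> list:
    """Return (visitor_id, channel) for visitors who saw all three pages."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data; column names follow the question.
    conn.executescript("""
        CREATE TABLE visits (visit_id INTEGER, visitor_id INTEGER, page_name TEXT);
        CREATE TABLE first_visitors (visitor_id INTEGER, channel TEXT);
        INSERT INTO visits VALUES
            (1, 10, 'home page'), (2, 10, 'product page'),
            (3, 10, 'confirmation page'), (4, 10, 'home page'),
            (5, 20, 'home page'), (6, 20, 'home page'), (7, 20, 'product page');
        INSERT INTO first_visitors VALUES (10, 'SEM'), (20, 'Email');
    """)
    rows = conn.execute("""
        SELECT b.visitor_id, b.channel
        FROM first_visitors AS b
        JOIN (
            -- aggregate first: one row per visitor who saw all three pages
            SELECT visitor_id
            FROM visits
            WHERE page_name IN ('home page', 'product page', 'confirmation page')
            GROUP BY visitor_id
            HAVING COUNT(DISTINCT page_name) = 3
        ) AS a ON a.visitor_id = b.visitor_id
        ORDER BY b.visitor_id
    """).fetchall()
    conn.close()
    return rows

print(visitors_with_all_pages())  # prints [(10, 'SEM')]
```

Because the derived table returns at most one row per visitor, the outer query needs no GROUP BY at all.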

Your A3 syntax will fail, but you were close:

select
 100.00 * -- multiply first, then divide
 (select count(distinct a.visitor_id) as visited_confirmation_page 
    from first_visitors a
    join visits b on b.visitor_id = a.visitor_id
    where channel = 'SEM'
    and b.page_name = 'confirmation_page' -- IN needs a parenthesized list; use = for a single value
    and cast(b.timestamp as date) between cast(a.first_visit_timestamp as date) 
    and cast(a.first_visit_timestamp as date)+30
 ) / 
 (select count(distinct a.visitor_id) as all_SEM_visits -- DISTINCT probably not needed
    from first_visitors a
    where channel = 'SEM'
 ) as "%_of_SEM_confirmations"
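
The "multiply first, then divide" comment matters because both counts are integers, and dividing one integer by another truncates before the multiplication can help. A minimal sketch of the difference, using Python's sqlite3 as a stand-in (the helper name `percent_two_ways` and the counts 1 and 4 are made up for the illustration; Teradata likewise truncates integer division, though its decimal typing rules differ in detail):

```python
import sqlite3

def percent_two_ways(confirmed: int, total: int) -> tuple:
    """Return (divide-first, multiply-first) percentages for two integer counts."""
    conn = sqlite3.connect(":memory:")
    row = conn.execute(
        # Left: integer / integer truncates to 0 before scaling.
        # Right: 100.00 * integer is already a decimal, so the division keeps the fraction.
        "SELECT (? / ?) * 100.00, 100.00 * ? / ?",
        (confirmed, total, confirmed, total),
    ).fetchone()
    conn.close()
    return row

print(percent_two_ways(1, 4))  # prints (0.0, 25.0)
```

The original attempt, (visited_confirmation_page / all_SEM_visits) * 100.00, follows the divide-first pattern and would therefore tend to return 0 whenever fewer than all SEM visitors converted.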