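So I have the following problem:
The table 'visits' contains a list of all visits, with four columns: Visit_ID, Visitor_ID, Timestamp, and Page_Name.
The table 'first_visitors' is a list of visitors and their first visit, with three columns: Visitor_ID, First_Visit_Date, and Channel.
A1. Average number of visits, and visitors per page, over the last seven days
A2. Visitor_ID and Channel for visitors who have visited all three of 'home page', 'product page', and 'confirmation page' at least once in the last seven days
A3. Percent of SEM visitors that visit the 'confirmation page' within thirty days of their first visit.
I'm unsure about casting the timestamp to a date and using date and date-7 to get one week. Is that the right approach? (A1)
Getting the visitor_ids that visited all three pages is also hard. I tried a HAVING clause, but I'm not sure it is correct. (A2)
Finally, getting a percentage by dividing one aggregated column by another is hard, and I'm not sure this is the right way to do it. (A3)
My code is below. Any suggestions are much appreciated.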
--A1. Average number of visits, and visitors per page over the last seven days
select
page_name,
count(visit_ID) as average_visits,
count(DISTINCT visitor_ID) as average_visitors
from visits
where cast(timestamp as date) between date and date-7
group by page_name;
--A2. Visitor_id and Channel for visitors that have visited all three of ‘home page’, ‘product page’, and ‘confirmation page’ at least once in the last seven days
select
a.visitor_id,
b.channel
from visits a
join first_visitors b on a.visitor_id = b.visitor_id
where cast(a.timestamp as date) between date and date-7
and a.page_name in ('home page','product page','confirmation page')
group by a.visitor_id
having count(distinct a.page_name) >= 3;
--A3. Percent of SEM visitors that visit the ‘confirmation page’ within thirty days of their first visit.
select
count(*) as visited_confirmation_page from
(select distinct a.visitor_id
from first_visitors a
join visits b on b.visitor_id = a.visitor_id
where channel = 'SEM'
and b.page_name in 'confirmation_page'
and cast(b.timestamp as date) between cast(a.first_visit_timestamp as date) and cast(a.first_visit_timestamp as date)+30)
count(*) as all_SEM_visits
(select distinct a.visitor_id
from first_visitors a
where channel = 'SEM')
((visited_confirmation_page / all_SEM_visits) * 100.00) as %_of_SEM_confirmations;
Answer 0: (score: 0)
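A1 is fine; you only need a small syntax change. Instead of DATE, prefer Standard SQL's CURRENT_DATE, and put the earlier date first, because BETWEEN expects the lower bound before the upper bound:
where cast(timestamp as date) between CURRENT_DATE - 7 and CURRENT_DATE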
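To see why the order of the BETWEEN bounds matters, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for your database, so the date arithmetic is written as date(x, '-7 days') rather than CURRENT_DATE - 7. The sample rows and the fixed TODAY value are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (visit_id INTEGER, visitor_id INTEGER, timestamp TEXT, page_name TEXT)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?, ?)",
    [
        (1, 10, "2024-03-10 09:00:00", "home page"),     # same day as TODAY
        (2, 11, "2024-03-05 12:30:00", "product page"),  # five days earlier
        (3, 12, "2024-02-20 08:15:00", "home page"),     # outside the window
    ],
)

TODAY = "2024-03-10"  # fixed stand-in for CURRENT_DATE so the result is repeatable
week_ago = conn.execute("SELECT date(?, '-7 days')", (TODAY,)).fetchone()[0]

def count_visits(lower, upper):
    """Count visits whose date falls in BETWEEN lower AND upper."""
    sql = "SELECT COUNT(*) FROM visits WHERE date(timestamp) BETWEEN ? AND ?"
    return conn.execute(sql, (lower, upper)).fetchone()[0]

in_order = count_visits(week_ago, TODAY)        # lower bound first
reversed_order = count_visits(TODAY, week_ago)  # upper bound first matches nothing
print(in_order, reversed_order)  # prints: 2 0
```

x BETWEEN a AND b is shorthand for x >= a AND x <= b, so with the bounds swapped no row can satisfy both conditions.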
The logic of A2 is correct, but you need to add b.channel to the GROUP BY to avoid a syntax error. Better still, aggregate before joining:
select
a.visitor_id,
b.channel
from first_visitors b
join
(
select visitor_id
from visits
where cast(timestamp as date) between CURRENT_DATE - 7 and CURRENT_DATE
and page_name in ('home page','product page','confirmation page')
group by visitor_id
having count(distinct page_name) >= 3
) as a
on a.visitor_id = b.visitor_id
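As a quick check of the aggregate-then-join idea, here is a small sketch run through Python's sqlite3 module. The rows are invented, TODAY is a fixed stand-in for CURRENT_DATE, and SQLite's date() function replaces the cast and date arithmetic, so treat it as an illustration rather than the exact query for your system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (visit_id INTEGER, visitor_id INTEGER, timestamp TEXT, page_name TEXT);
    CREATE TABLE first_visitors (visitor_id INTEGER, first_visit_date TEXT, channel TEXT);
""")
# Hypothetical sample data.
conn.executemany("INSERT INTO first_visitors VALUES (?, ?, ?)", [
    (10, "2024-01-01", "SEM"),
    (11, "2024-02-01", "Email"),
])
conn.executemany("INSERT INTO visits VALUES (?, ?, ?, ?)", [
    (1, 10, "2024-03-08 09:00:00", "home page"),
    (2, 10, "2024-03-08 09:05:00", "product page"),
    (3, 10, "2024-03-09 10:00:00", "confirmation page"),
    (4, 11, "2024-03-09 11:00:00", "home page"),   # visitor 11 has three visits
    (5, 11, "2024-03-09 11:02:00", "home page"),   # but only two distinct pages
    (6, 11, "2024-03-09 11:05:00", "product page"),
])

TODAY = "2024-03-10"  # fixed stand-in for CURRENT_DATE

sql = """
    SELECT a.visitor_id, b.channel
    FROM first_visitors AS b
    JOIN (
        -- aggregate first: one row per visitor who saw all three pages
        SELECT visitor_id
        FROM visits
        WHERE date(timestamp) BETWEEN date(:today, '-7 days') AND :today
          AND page_name IN ('home page', 'product page', 'confirmation page')
        GROUP BY visitor_id
        HAVING COUNT(DISTINCT page_name) = 3
    ) AS a ON a.visitor_id = b.visitor_id
"""
rows = conn.execute(sql, {"today": TODAY}).fetchall()
print(rows)  # prints: [(10, 'SEM')]
```

Because the derived table already has one row per qualifying visitor, the outer query needs no GROUP BY, and channel can be selected directly.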
Your A3 syntax will fail, but you were close:
select
100.00 * -- multiply first, then divide
(select count(distinct a.visitor_id) as visited_confirmation_page
from first_visitors a
join visits b on b.visitor_id = a.visitor_id
where channel = 'SEM'
and b.page_name = 'confirmation_page'
and cast(b.timestamp as date) between cast(a.first_visit_timestamp as date)
and cast(a.first_visit_timestamp as date)+30
) /
(select count(distinct a.visitor_id) as all_SEM_visits -- DISTINCT probably not needed
from first_visitors a
where channel = 'SEM'
) as "%_of_SEM_confirmations"
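The "multiply first, then divide" comment matters in engines where dividing one integer by another truncates. SQLite behaves that way, and I am assuming your database does too. The sketch below uses Python's sqlite3 module with invented rows, takes the column name first_visit_date from the table description, and computes both counts once in a CTE purely to keep the comparison short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (visit_id INTEGER, visitor_id INTEGER, timestamp TEXT, page_name TEXT);
    CREATE TABLE first_visitors (visitor_id INTEGER, first_visit_date TEXT, channel TEXT);
""")
# Hypothetical sample data: three SEM visitors, one of whom converts in time.
conn.executemany("INSERT INTO first_visitors VALUES (?, ?, ?)", [
    (10, "2024-01-01", "SEM"),
    (11, "2024-01-10", "SEM"),
    (12, "2024-01-15", "SEM"),
    (13, "2024-01-05", "Email"),
])
conn.executemany("INSERT INTO visits VALUES (?, ?, ?, ?)", [
    (1, 10, "2024-01-20 10:00:00", "confirmation_page"),  # within 30 days
    (2, 11, "2024-03-01 10:00:00", "confirmation_page"),  # too late
    (3, 12, "2024-01-16 10:00:00", "home page"),          # never confirmed
    (4, 13, "2024-01-06 10:00:00", "confirmation_page"),  # not an SEM visitor
])

sql = """
    WITH counts AS (
        SELECT
            (SELECT COUNT(DISTINCT f.visitor_id)
             FROM first_visitors AS f
             JOIN visits AS v ON v.visitor_id = f.visitor_id
             WHERE f.channel = 'SEM'
               AND v.page_name = 'confirmation_page'
               AND date(v.timestamp) BETWEEN f.first_visit_date
                                         AND date(f.first_visit_date, '+30 days')
            ) AS converted,
            (SELECT COUNT(*) FROM first_visitors WHERE channel = 'SEM') AS total
    )
    SELECT 100.0 * converted / total AS multiply_first,
           converted / total * 100.0 AS divide_first  -- integer division happens first
    FROM counts
"""
multiply_first, divide_first = conn.execute(sql).fetchone()
print(round(multiply_first, 2), divide_first)  # prints: 33.33 0.0
```

Multiplying by 100.0 first turns the numerator into a decimal before the division, so the fraction is not truncated.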