I have the following code, which sums the page views of each visitor's first two sessions (session_index = 1 and session_index = 2) in a sampled dataset:
SELECT SUM(a.page_views)
FROM sessions a
WHERE a.id IN (
SELECT b.id
FROM sessions b
WHERE b.order_id NOTNULL
/*lookup for visitors who have made a purchase*/
)
AND a.id IN (
SELECT c.id
FROM sessions c
WHERE c.session_index = 1
/*lookup for visitors who have logged session_index #1*/
)
AND a.id IN (
SELECT d.id
FROM sessions d
WHERE d.session_index = 2
/*lookup for visitors who have logged session_index #2*/
)
AND a.session_index < 3;
/*makes the SELECT SUM() add only records with index #1 and #2.*/
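It is very inefficient because it performs three separate lookup comparisons. Is there a more efficient way to build a single lookup that combines the three conditions on the table?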
Answer 0 (score: 0)
You can get all the ids that satisfy the conditions with this query:
SELECT id
FROM sessions
GROUP BY id
HAVING COUNT(order_id) AND SUM(session_index = 1) AND SUM(session_index = 2)
You can use it with the IN operator to sum the page views:
SELECT SUM(page_views)
FROM sessions
WHERE session_index < 3
AND id IN (
SELECT id
FROM sessions
GROUP BY id
HAVING COUNT(order_id) AND SUM(session_index = 1) AND SUM(session_index = 2)
)
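To check the grouped lookup on a small case, here is a minimal sketch that runs it through Python's built-in sqlite3 module. The sample rows and the helper name total_page_views are made up for illustration, and the explicit > 0 comparisons are my addition (the query above relies on the engine treating non-zero integers as true, which SQLite and MySQL do but not every database).

```python
import sqlite3

# Hypothetical sample data: (id, session_index, page_views, order_id)
ROWS = [
    (1, 1, 5, None),
    (1, 2, 3, 100),   # visitor 1: sessions 1 and 2, with a purchase
    (1, 3, 9, None),  # excluded by session_index < 3
    (2, 1, 4, None),
    (2, 2, 6, None),  # visitor 2: never purchased
    (3, 1, 7, 101),   # visitor 3: purchased, but has no session 2
]

# Same shape as the answer's query; "> 0" is added for portability (assumption).
QUERY = """
SELECT SUM(page_views)
FROM sessions
WHERE session_index < 3
  AND id IN (
    SELECT id
    FROM sessions
    GROUP BY id
    HAVING COUNT(order_id) > 0
       AND SUM(session_index = 1) > 0
       AND SUM(session_index = 2) > 0
  )
"""


def total_page_views(rows):
    """Load rows into an in-memory table and return the summed page views."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sessions ("
        "id INTEGER, session_index INTEGER, page_views INTEGER, order_id INTEGER)"
    )
    conn.executemany("INSERT INTO sessions VALUES (?, ?, ?, ?)", rows)
    (total,) = conn.execute(QUERY).fetchone()
    conn.close()
    return total


print(total_page_views(ROWS))  # only visitor 1 qualifies: 5 + 3 = 8
```

Only one pass over sessions is needed to build the id list, instead of three separate subqueries.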
Or you can use the SUM() window function:
SELECT DISTINCT SUM(SUM(CASE WHEN session_index IN (1, 2) THEN page_views END)) OVER ()
FROM sessions
GROUP BY id
HAVING COUNT(order_id) AND SUM(session_index = 1) AND SUM(session_index = 2)
Answer 1 (score: 0)
I would suggest two levels of aggregation:
SELECT SUM(page_views)
FROM (SELECT s.id, SUM(s.page_views) as page_views
FROM sessions s
WHERE s.session_index < 3
GROUP BY s.id
HAVING COUNT(s.order_id) > 0 AND -- users have made a purchase
SUM(CASE WHEN s.session_index = 1 THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN s.session_index = 2 THEN 1 ELSE 0 END) > 0
) s;
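One thing that may be worth checking with this version: the WHERE clause runs before grouping, so COUNT(s.order_id) only sees orders placed in sessions 1 and 2, whereas the query in the question accepts a purchase in any session. The sketch below uses made-up rows (a visitor who buys only in session 3) to show the difference, plus a variant that moves the session filter into a CASE expression. The variant is my assumption about the intent, not part of the answer, and the helper name run is hypothetical.

```python
import sqlite3

# Hypothetical data: visitor 1 buys only in session 3.
ROWS = [
    (1, 1, 5, None),
    (1, 2, 3, None),
    (1, 3, 9, 100),
]

# Two-level aggregation as in the answer: WHERE filters rows before HAVING.
FILTER_FIRST = """
SELECT SUM(page_views)
FROM (SELECT s.id, SUM(s.page_views) AS page_views
      FROM sessions s
      WHERE s.session_index < 3
      GROUP BY s.id
      HAVING COUNT(s.order_id) > 0 AND
             SUM(CASE WHEN s.session_index = 1 THEN 1 ELSE 0 END) > 0 AND
             SUM(CASE WHEN s.session_index = 2 THEN 1 ELSE 0 END) > 0
     ) t
"""

# Variant (assumption): keep all rows so a purchase in any session counts,
# and restrict only the summed page views to sessions 1 and 2.
FILTER_IN_CASE = """
SELECT SUM(page_views)
FROM (SELECT s.id,
             SUM(CASE WHEN s.session_index < 3 THEN s.page_views ELSE 0 END) AS page_views
      FROM sessions s
      GROUP BY s.id
      HAVING COUNT(s.order_id) > 0 AND
             SUM(CASE WHEN s.session_index = 1 THEN 1 ELSE 0 END) > 0 AND
             SUM(CASE WHEN s.session_index = 2 THEN 1 ELSE 0 END) > 0
     ) t
"""


def run(query, rows):
    """Run one query against a fresh in-memory sessions table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sessions ("
        "id INTEGER, session_index INTEGER, page_views INTEGER, order_id INTEGER)"
    )
    conn.executemany("INSERT INTO sessions VALUES (?, ?, ?, ?)", rows)
    (total,) = conn.execute(query).fetchone()
    conn.close()
    return total


filter_first = run(FILTER_FIRST, ROWS)      # None: the session-3 order was filtered out
filter_in_case = run(FILTER_IN_CASE, ROWS)  # 8: purchase seen, page views 5 + 3
print(filter_first, filter_in_case)
```

If purchases are only ever relevant within the first two sessions, the answer's version is fine as written; the variant matters only when an order in a later session should still qualify the visitor.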
That said, the original version, with the right index and using EXISTS, may be the fastest approach:
SELECT SUM(s.page_views)
FROM sessions s
WHERE s.session_index < 3 AND
EXISTS (SELECT 1
FROM sessions s2
WHERE s2.id = s.id AND s2.order_id IS NOT NULL
) AND
EXISTS (SELECT 1
FROM sessions s2
WHERE s2.id = s.id AND s2.session_index = 1
) AND
EXISTS (SELECT 1
FROM sessions s2
WHERE s2.id = s.id AND s2.session_index = 2
) ;
The index you want is on sessions(id, session_index, order_id).
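As a rough sketch of how that index could be created and the EXISTS query tried out, the snippet below uses Python's sqlite3 module with made-up rows. The index name idx_sessions_lookup is hypothetical, and whether the planner actually uses the index depends on the engine and the data, so the plan is only printed for inspection.

```python
import sqlite3

# Hypothetical sample data: (id, session_index, page_views, order_id)
ROWS = [
    (1, 1, 5, None),
    (1, 2, 3, 100),   # visitor 1 qualifies
    (1, 3, 9, None),
    (2, 1, 4, None),
    (2, 2, 6, None),  # visitor 2: no purchase
    (3, 1, 7, 101),   # visitor 3: no session 2
]

EXISTS_QUERY = """
SELECT SUM(s.page_views)
FROM sessions s
WHERE s.session_index < 3 AND
      EXISTS (SELECT 1 FROM sessions s2
              WHERE s2.id = s.id AND s2.order_id IS NOT NULL) AND
      EXISTS (SELECT 1 FROM sessions s2
              WHERE s2.id = s.id AND s2.session_index = 1) AND
      EXISTS (SELECT 1 FROM sessions s2
              WHERE s2.id = s.id AND s2.session_index = 2)
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    "id INTEGER, session_index INTEGER, page_views INTEGER, order_id INTEGER)"
)
conn.executemany("INSERT INTO sessions VALUES (?, ?, ?, ?)", ROWS)

# Composite index suggested in the answer; the name is an assumption.
conn.execute(
    "CREATE INDEX idx_sessions_lookup ON sessions (id, session_index, order_id)"
)

total = conn.execute(EXISTS_QUERY).fetchone()[0]

# The last column of each EXPLAIN QUERY PLAN row is a human-readable step.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + EXISTS_QUERY)]
index_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]
conn.close()

print(total)
for step in plan:
    print(step)
```

With id as the leading column, each correlated subquery can look up a visitor's rows directly rather than scanning the whole table.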