Can SQL IN() operators be stacked more efficiently?

Time: 2020-08-04 06:46:21

Tags: sql sqlite

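I have the following code, which sums the page views of the first two sessions for visitors who have:

  1. placed an order at any time, and
  2. logged session_index=1, and
  3. logged session_index=2

in a sampled data set.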

SELECT SUM(a.page_views)
FROM sessions a
WHERE a.id IN (
    SELECT b.id 
    FROM sessions b 
    WHERE b.order_id NOTNULL
        /*lookup for visitors who have made a purchase*/
)
AND a.id IN (
    SELECT c.id 
    FROM sessions c 
    WHERE c.session_index = 1
        /*lookup for visitors who have logged session_index #1*/
)
AND a.id IN (
    SELECT d.id
    FROM sessions d
    WHERE d.session_index = 2
        /*lookup for visitors who have logged session_index #2*/
)
AND a.session_index < 3;
    /*makes the SELECT SUM() add records with index #1 and #2.*/

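It is very inefficient because it runs three separate lookups against the same table. Is there a more efficient way to build a single lookup that combines all three conditions?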

2 Answers:

Answer 0: (score: 0)

You can get all the ids that satisfy the conditions with this query:

SELECT id
FROM sessions
GROUP BY id
HAVING COUNT(order_id) AND SUM(session_index = 1) AND SUM(session_index = 2)
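
To see why this HAVING clause works, here is a minimal sketch using Python's built-in sqlite3 module on a made-up in-memory table (the sample rows and column types are assumptions, not taken from the question). In SQLite, COUNT(order_id) ignores NULLs, and a comparison such as session_index = 1 evaluates to 1 or 0, so each SUM() counts the matching rows and any non-zero result is treated as true.

```python
import sqlite3

# Hypothetical sample data: (id, session_index, order_id, page_views)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id INTEGER, session_index INTEGER, "
    "order_id INTEGER, page_views INTEGER)"
)
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?, ?)",
    [
        (1, 1, None, 5), (1, 2, 100, 7), (1, 3, None, 2),  # visitor 1: meets all three conditions
        (2, 1, None, 4), (2, 2, None, 6),                  # visitor 2: never ordered
        (3, 1, 101, 3),                                    # visitor 3: no session_index = 2
    ],
)

# The three values that the HAVING clause tests, shown per visitor.
aggregates = conn.execute("""
    SELECT id, COUNT(order_id), SUM(session_index = 1), SUM(session_index = 2)
    FROM sessions
    GROUP BY id
    ORDER BY id
""").fetchall()

# The query from the answer, with ORDER BY added for a stable result.
qualifying_ids = [row[0] for row in conn.execute("""
    SELECT id
    FROM sessions
    GROUP BY id
    HAVING COUNT(order_id) AND SUM(session_index = 1) AND SUM(session_index = 2)
    ORDER BY id
""")]

print(aggregates)      # [(1, 1, 1, 1), (2, 0, 1, 1), (3, 1, 1, 0)]
print(qualifying_ids)  # [1]
```

Only visitor 1 has a non-zero value in all three aggregates, so it is the only id returned.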

You can use it with the IN operator to sum the page views:

SELECT SUM(page_views)
FROM sessions
WHERE session_index < 3
AND id IN (
    SELECT id
    FROM sessions
    GROUP BY id
    HAVING COUNT(order_id) AND SUM(session_index = 1) AND SUM(session_index = 2)
)

Or you can use the SUM() window function:

SELECT DISTINCT SUM(SUM(CASE WHEN session_index IN (1, 2) THEN page_views END)) OVER ()
FROM sessions
GROUP BY id
HAVING COUNT(order_id) AND SUM(session_index = 1) AND SUM(session_index = 2)

Answer 1: (score: 0)

I would suggest two levels of aggregation:

SELECT SUM(page_views)
FROM (SELECT s.id, SUM(s.page_views) as page_views
      FROM sessions s
      WHERE s.session_index < 3
      GROUP BY s.id
      HAVING COUNT(s.order_id) > 0 AND  -- users have made a purchase
             SUM(CASE WHEN s.session_index = 1 THEN 1 ELSE 0 END) > 0 AND
             SUM(CASE WHEN s.session_index = 2 THEN 1 ELSE 0 END) > 0
    ) s;
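
One point worth checking against your data: because WHERE s.session_index < 3 is applied before the grouping, COUNT(s.order_id) in this query only sees orders placed during the first two sessions, whereas the question asks for an order at any time. The sketch below uses Python's sqlite3 module and made-up rows to compare it with a variant that moves the filter into a CASE expression. That variant and the first_two_views alias are my own assumptions for illustration, not part of the answer.

```python
import sqlite3

# Hypothetical sample data: (id, session_index, order_id, page_views)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id INTEGER, session_index INTEGER, "
    "order_id INTEGER, page_views INTEGER)"
)
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?, ?)",
    [
        (1, 1, None, 5), (1, 2, None, 7), (1, 3, 100, 2),  # order placed in session 3
        (2, 1, None, 4), (2, 2, 101, 6),                   # order placed in session 2
    ],
)

# Query from the answer: rows are filtered before grouping,
# so visitor 1's order in session 3 is never counted.
filtered_total = conn.execute("""
    SELECT SUM(page_views)
    FROM (SELECT s.id, SUM(s.page_views) AS page_views
          FROM sessions s
          WHERE s.session_index < 3
          GROUP BY s.id
          HAVING COUNT(s.order_id) > 0 AND
                 SUM(CASE WHEN s.session_index = 1 THEN 1 ELSE 0 END) > 0 AND
                 SUM(CASE WHEN s.session_index = 2 THEN 1 ELSE 0 END) > 0
         ) s
""").fetchone()[0]

# Assumed variant: keep every row in the group and restrict only the sum.
any_time_total = conn.execute("""
    SELECT SUM(first_two_views)
    FROM (SELECT s.id,
                 SUM(CASE WHEN s.session_index < 3 THEN s.page_views ELSE 0 END)
                     AS first_two_views
          FROM sessions s
          GROUP BY s.id
          HAVING COUNT(s.order_id) > 0 AND
                 SUM(CASE WHEN s.session_index = 1 THEN 1 ELSE 0 END) > 0 AND
                 SUM(CASE WHEN s.session_index = 2 THEN 1 ELSE 0 END) > 0
         ) t
""").fetchone()[0]

print(filtered_total)  # 10 (visitor 2 only)
print(any_time_total)  # 22 (visitors 1 and 2)
```

If orders can only ever be attached to one of the first two sessions in your data, the two forms give the same result.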

That said, the original version, with the right index and using EXISTS, is probably the fastest approach:

SELECT SUM(s.page_views)
FROM sessions s
WHERE s.session_index < 3 AND
      EXISTS (SELECT 1
              FROM sessions s2 
              WHERE s2.id = s.id AND s2.order_id IS NOT NULL
             ) AND
      EXISTS (SELECT 1
              FROM sessions s2 
              WHERE s2.id = s.id AND s2.session_index = 1
             ) AND
      EXISTS (SELECT 1
              FROM sessions s2 
              WHERE s2.id = s.id AND s2.session_index = 2
             ) ;

The index you want is on sessions(id, session_index, order_id).
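
A minimal sketch of creating that index and checking whether SQLite picks it up, again using Python's sqlite3 module. The index name idx_sessions_lookup and the table definition are assumptions for illustration. The wording of EXPLAIN QUERY PLAN output differs between SQLite versions, so the code only looks for the index name in the plan text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id INTEGER, session_index INTEGER, "
    "order_id INTEGER, page_views INTEGER)"
)
# Hypothetical index name; the column order follows the answer.
conn.execute(
    "CREATE INDEX idx_sessions_lookup "
    "ON sessions (id, session_index, order_id)"
)

query = """
SELECT SUM(s.page_views)
FROM sessions s
WHERE s.session_index < 3 AND
      EXISTS (SELECT 1 FROM sessions s2
              WHERE s2.id = s.id AND s2.order_id IS NOT NULL) AND
      EXISTS (SELECT 1 FROM sessions s2
              WHERE s2.id = s.id AND s2.session_index = 1) AND
      EXISTS (SELECT 1 FROM sessions s2
              WHERE s2.id = s.id AND s2.session_index = 2)
"""

# The fourth column of each EXPLAIN QUERY PLAN row is the readable detail.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
for step in plan:
    print(step)

uses_index = any("idx_sessions_lookup" in step for step in plan)
print("index used:", uses_index)
```

Run the same check against your real table, since the planner's choice can depend on the data and on the statistics gathered by ANALYZE.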