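I wrote this SQL to find customers who have not ordered for X days.
It returns a result set, so this post is mainly to get a second opinion and possibly some optimizations.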
SELECT o.order_id,
       o.order_status,
       o.order_created,
       o.user_id,
       i.identity_firstname,
       i.identity_email,
       (SELECT COUNT(*)
        FROM orders o2
        WHERE o2.user_id=o.user_id
          AND o2.order_status=1) AS order_count,
       (SELECT o4.order_created
        FROM orders o4
        WHERE o4.user_id=o.user_id
          AND o4.order_status=1
        ORDER BY o4.order_created DESC LIMIT 1) AS last_order
FROM orders o
INNER JOIN user_identities ui ON o.user_id=ui.user_id
INNER JOIN identities i ON ui.identity_id=i.identity_id
    AND i.identity_email!=''
INNER JOIN subscribers s ON i.identity_id=s.identity_id
    AND s.subscriber_status=1
    AND s.subsriber_type=e
    AND s.subscription_id=1
WHERE DATE(o.order_created) = "2013-12-14"
  AND o.order_status=1
  AND o.user_id NOT IN
      (SELECT o3.user_id
       FROM orders o3
       WHERE o3.user_id=o.user_id
         AND o3.order_status=1
         AND DATE(o3.order_created) > "2013-12-14")
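Can you spot any potential problems with this SQL? The date is inserted dynamically.
The final SQL that I put into production will basically only include o.order_id, i.identity_id and o.order_count - and this order_count needs to be correct. The other selected fields and the 'last_order' subquery will not be included; they are only there for testing.
This should give me a list of users whose last order was on a specific date and who are newsletter subscribers. I am particularly unsure about the correctness of the NOT IN part of the WHERE clause and of the order_count subquery.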
Answer 0: (score: 2)
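There are a few problems:
A. Using a function on an indexable column
You are searching for orders by comparing DATE(order_created) with a constant. This is a terrible idea, because a) the DATE() function is executed for every row (CPU), and b) the database cannot use an index on the column (assuming one exists).
B. Using WHERE ID NOT IN (...)
Using NOT IN (...) is almost always a bad idea, because optimizers usually have trouble with this construct and often get the plan wrong. You can almost always express it as an outer join with a WHERE condition that filters for misses using an IS NULL condition on a joined column (with the side benefit of not needing DISTINCT, because only one missed join is ever returned).
C. Leaving joins that filter out most rows until too late
The earlier you can mask off rows by not making joins, the better. You can do this by placing the tables that are less likely to match earlier in the list of joined tables, and by putting non-key conditions in the join rather than the WHERE clause, so that rows are excluded as early as possible. Some optimizers do this anyway, but I have often found that they do not.
D. Avoid correlated subqueries like the plague!
You have several correlated subqueries, that is, subqueries that are executed for every row of the main table. That is a really bad idea. Sometimes the optimizer can fold them into a join, but why rely on (hope for) that? Most correlated subqueries can be expressed as a join; your example is no exception.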
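As a minimal sketch of point B, the two statements below express the same "no later order" filter, first with NOT IN and then as an outer join filtered on IS NULL. The table demo_orders and its rows are hypothetical, made up only for illustration, and the syntax assumes MySQL.

```sql
-- Hypothetical toy table, not the asker's real schema.
CREATE TABLE demo_orders (
    order_id      INT PRIMARY KEY,
    user_id       INT NOT NULL,
    order_status  INT NOT NULL,
    order_created DATETIME NOT NULL
);

INSERT INTO demo_orders VALUES
    (1, 10, 1, '2013-12-14 09:00:00'),  -- user 10: last order is on the target day
    (2, 20, 1, '2013-12-14 10:00:00'),  -- user 20: ordered on the target day...
    (3, 20, 1, '2013-12-20 12:00:00');  -- ...and again later

-- NOT IN form: exclude users that appear in the set of later orders.
SELECT o.order_id, o.user_id
FROM demo_orders o
WHERE o.order_created >= '2013-12-14 00:00:00'
  AND o.order_created <  '2013-12-15 00:00:00'
  AND o.order_status = 1
  AND o.user_id NOT IN (SELECT o3.user_id
                        FROM demo_orders o3
                        WHERE o3.order_status = 1
                          AND o3.order_created >= '2013-12-15 00:00:00');

-- Anti-join form: LEFT JOIN the later orders and keep only the misses.
SELECT o.order_id, o.user_id
FROM demo_orders o
LEFT JOIN demo_orders o3
       ON o3.user_id = o.user_id
      AND o3.order_status = 1
      AND o3.order_created >= '2013-12-15 00:00:00'
WHERE o.order_created >= '2013-12-14 00:00:00'
  AND o.order_created <  '2013-12-15 00:00:00'
  AND o.order_status = 1
  AND o3.order_id IS NULL;  -- no later order was found for this user
```

On this toy data both statements return only order 1 (user 10). Which form the optimizer handles better depends on the database and version, so comparing the two plans with EXPLAIN on real data is worthwhile.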
With the above in mind, there are some specific changes:
DATE(order_created) = "2013-12-14"
should be written as order_created between "2013-12-14 00:00:00" and "2013-12-14 23:59:59"
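One caveat with an upper bound of 23:59:59 is that it can miss rows when the column stores fractional seconds. A half-open range (>= the day, < the next day) avoids that while still leaving the column bare so an index can be used. The sketch below assumes MySQL 5.6.4 or later (for DATETIME(3)); the table demo_orders_range, the index name and the rows are hypothetical.

```sql
-- Hypothetical toy table with millisecond precision and an index on the date column.
CREATE TABLE demo_orders_range (
    order_id      INT PRIMARY KEY,
    order_created DATETIME(3) NOT NULL,
    KEY idx_order_created (order_created)
);

INSERT INTO demo_orders_range VALUES
    (1, '2013-12-14 00:00:00.000'),
    (2, '2013-12-14 23:59:59.500'),  -- last half second of the day
    (3, '2013-12-15 00:00:00.000');

-- BETWEEN with a 23:59:59 upper bound: misses order 2, returns only order 1.
SELECT order_id
FROM demo_orders_range
WHERE order_created BETWEEN '2013-12-14 00:00:00' AND '2013-12-14 23:59:59';

-- Half-open range: returns orders 1 and 2, and excludes order 3.
SELECT order_id
FROM demo_orders_range
WHERE order_created >= '2013-12-14 00:00:00'
  AND order_created <  '2013-12-15 00:00:00';
```

Since the date is inserted dynamically, the application would need to supply both the day and the following day (or compute the latter in SQL) for the half-open form.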
This query should be what you want:
SELECT
    o.order_id,
    o.order_status,
    o.order_created,
    o.user_id,
    i.identity_firstname,
    i.identity_email,
    count(o2.user_id) AS order_count,
    max(o2.order_created) AS last_order
FROM orders o
LEFT JOIN orders o2 ON o2.user_id = o.user_id AND o2.order_status=1
LEFT JOIN orders o3 ON o3.user_id = o.user_id
    AND o3.order_status=1
    AND o3.order_created >= "2013-12-15 00:00:00"
JOIN user_identities ui ON o.user_id=ui.user_id
JOIN identities i ON ui.identity_id=i.identity_id AND i.identity_email != ''
JOIN subscribers s ON i.identity_id=s.identity_id
    AND s.subscriber_status=1
    AND s.subsriber_type=e
    AND s.subscription_id=1
WHERE o.order_created between "2013-12-14 00:00:00" and "2013-12-14 23:59:59"
  AND o.order_status=1
  AND o3.order_created IS NULL -- This gets only missed joins on o3
GROUP BY
    o.order_id,
    o.order_status,
    o.order_created,
    o.user_id,
    i.identity_firstname,
    i.identity_email;
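The o3.order_created IS NULL condition at the end of the WHERE clause is how a LEFT JOIN achieves the same effect as NOT IN (...).
Disclaimer: untested.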
Answer 1: (score: 0)
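I can't comment on the results, since you haven't posted any table declarations or sample data, but your query has 3 correlated subqueries, which will likely make it perform poorly (OK, one of them is for last_order and is only for testing).
Eliminating the correlated subqueries and replacing them with joins gives something like this: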
SELECT o.order_id,
       o.order_status,
       o.order_created,
       o.user_id,
       i.identity_firstname,
       i.identity_email,
       Sub1.order_count,
       Sub2.last_order
FROM orders o
INNER JOIN user_identities ui ON o.user_id=ui.user_id
INNER JOIN identities i ON ui.identity_id=i.identity_id
    AND i.identity_email!=''
INNER JOIN subscribers s ON i.identity_id=s.identity_id
    AND s.subscriber_status=1
    AND s.subsriber_type=e
    AND s.subscription_id=1
LEFT OUTER JOIN
(
    SELECT user_id, COUNT(*) AS order_count
    FROM orders
    WHERE order_status=1
    GROUP BY user_id
) Sub1
ON o.user_id = Sub1.user_id
LEFT OUTER JOIN
(
    SELECT user_id, MAX(order_created) as last_order
    FROM orders
    WHERE order_status=1
    GROUP BY user_id
) AS Sub2
ON o.user_id = Sub2.user_id
LEFT OUTER JOIN
(
    SELECT DISTINCT user_id
    FROM orders
    WHERE order_status=1
    AND DATE(order_created) > "2013-12-14"
) Sub3
ON o.user_id = Sub3.user_id
WHERE DATE(o.order_created) = "2013-12-14"
  AND o.order_status=1
  AND Sub3.user_id IS NULL
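A possible simplification, offered only as a sketch: Sub1 and Sub2 group the same rows by the same key, so they could be one derived table, and since last_order is the latest order with order_status=1, "no later order" can be checked as last_order falling before the next day, which would make Sub3 unnecessary. The example below assumes MySQL and uses a hypothetical table demo_orders_agg with made-up rows; the joins to the identity and subscriber tables are left out to keep it short.

```sql
-- Hypothetical toy table, not the asker's real schema.
CREATE TABLE demo_orders_agg (
    order_id      INT PRIMARY KEY,
    user_id       INT NOT NULL,
    order_status  INT NOT NULL,
    order_created DATETIME NOT NULL
);

INSERT INTO demo_orders_agg VALUES
    (1, 10, 1, '2013-12-01 08:00:00'),
    (2, 10, 1, '2013-12-14 09:00:00'),  -- user 10: last order on the target day
    (3, 20, 1, '2013-12-14 10:00:00'),
    (4, 20, 1, '2013-12-20 12:00:00'),  -- user 20: ordered again later
    (5, 30, 1, '2013-12-14 11:00:00'),
    (6, 30, 0, '2013-12-18 11:00:00');  -- user 30: later order has status 0, so it is ignored

SELECT o.order_id,
       o.user_id,
       agg.order_count,
       agg.last_order
FROM demo_orders_agg o
INNER JOIN
(
    -- One pass over the orders gives both the count and the latest order per user.
    SELECT user_id,
           COUNT(*)           AS order_count,
           MAX(order_created) AS last_order
    FROM demo_orders_agg
    WHERE order_status = 1
    GROUP BY user_id
) agg
ON agg.user_id = o.user_id
WHERE o.order_created >= '2013-12-14 00:00:00'
  AND o.order_created <  '2013-12-15 00:00:00'
  AND o.order_status = 1
  AND agg.last_order < '2013-12-15 00:00:00';  -- no status-1 order after the target day
```

On this toy data the query returns order 2 (user 10, order_count 2) and order 5 (user 30, order_count 1), and drops user 20. Whether this beats the three-subquery version on real data would need to be checked with EXPLAIN.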