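I have the following column names: increment_id and other_id will be unique; customer_email will have duplicates. When the results are returned, I want to know the occurrence number of each email.
For each row, I want to know how many times the customer_email value has appeared so far. There will be an order by clause on the created_at field at the end, and I also plan to add where occurrences < 2.
I am querying upwards of 5 million rows, but performance isn't too important, because I'll run this as a report against a read-only replica of the production database. In my use case, I'll sacrifice performance for robustness.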
| customer_email | increment_id | other_id | created_at | occurrences <- I want this |
|----------------|--------------|----------|---------------------|----------------------------|
| joe@test.com | 1 | 81 | 2019-11-00 00:00:00 | 1 |
| sue@test.com | 2 | 82 | 2019-11-00 00:01:00 | 1 |
| bill@test.com | 3 | 83 | 2019-11-00 00:02:00 | 1 |
| joe@test.com | 4 | 84 | 2019-11-00 00:03:00 | 2 |
| mike@test.com | 5 | 85 | 2019-11-00 00:04:00 | 1 |
| sue@test.com | 6 | 86 | 2019-11-00 00:05:00 | 2 |
| joe@test.com | 7 | 87 | 2019-11-00 00:06:00 | 3 |
Answer 0 (score: 1)
If you are running MySQL 8.0, you can do a window count:
select
    t.*,
    count(*) over(partition by customer_email order by created_at) as occurrences
from mytable t
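The query does not need an order by clause at the end in order to work (but you do need one if you want the results sorted).
If you need to filter on the result of the window count, an additional level of nesting is required, because window functions cannot be used in the where clause of a query: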
select *
from (
    select
        t.*,
        count(*) over(partition by customer_email order by created_at) as occurrences
    from mytable t
) t
where occurrences < 2
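To sanity-check the filtered query against the sample data from the question, here is a minimal Python sketch. It is only an illustration under stated assumptions: SQLite (through the standard-library sqlite3 module, SQLite 3.25 or later for window functions) stands in for MySQL 8.0, the helper name first_occurrences is hypothetical, and created_at is stored as plain text.

```python
import sqlite3

# Sample rows from the question:
# (customer_email, increment_id, other_id, created_at)
ROWS = [
    ("joe@test.com", 1, 81, "2019-11-00 00:00:00"),
    ("sue@test.com", 2, 82, "2019-11-00 00:01:00"),
    ("bill@test.com", 3, 83, "2019-11-00 00:02:00"),
    ("joe@test.com", 4, 84, "2019-11-00 00:03:00"),
    ("mike@test.com", 5, 85, "2019-11-00 00:04:00"),
    ("sue@test.com", 6, 86, "2019-11-00 00:05:00"),
    ("joe@test.com", 7, 87, "2019-11-00 00:06:00"),
]


def first_occurrences(rows):
    """Return (customer_email, increment_id, occurrences) for rows
    whose running count per email is below 2 (hypothetical helper)."""
    # Assumption: an in-memory SQLite database stands in for MySQL 8.0.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table mytable ("
        "customer_email text, increment_id integer, "
        "other_id integer, created_at text)"
    )
    conn.executemany("insert into mytable values (?, ?, ?, ?)", rows)
    # The window count is computed in the derived table, then filtered
    # in the outer query, because where cannot reference it directly.
    sql = """
        select customer_email, increment_id, occurrences
        from (
            select
                t.*,
                count(*) over (
                    partition by customer_email order by created_at
                ) as occurrences
            from mytable t
        ) t
        where occurrences < 2
        order by created_at
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in first_occurrences(ROWS):
        print(row)
```

With the seven sample rows, only the first order of each customer should remain (increment_id 1, 2, 3 and 5).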
Answer 1 (score: 1)
You can use variables in earlier versions of MySQL:
select t.*,
(@rn := if(@ce = customer_email, @rn + 1,
if(@ce := customer_email, 1, 1)
)
) as occurrences
from (select t.*
from t
order by customer_email, created_at
) t cross join
(select @ce := '', @rn := 0) params;
In MySQL 8+, I would recommend row_number():
select t.*,
row_number() over (partition by customer_email order by created_at) as occurrences
from t;
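One difference between the two window functions is worth checking before adding where occurrences < 2. With an order by in the window and no explicit frame, count(*) includes every row that ties with the current row on created_at, so two orders from the same customer_email with an identical timestamp both get the same count, while row_number() always numbers them separately. The Python sketch below illustrates this. It assumes SQLite (standard-library sqlite3, SQLite 3.25 or later) as a stand-in for MySQL 8+; the three rows are made-up data, compare_on_ties is a hypothetical helper, and adding increment_id as a tie-breaker is a suggestion of this sketch, not part of the original answer.

```python
import sqlite3


def compare_on_ties():
    """Return (increment_id, cnt, rn) per row, where cnt is the window
    count and rn is row_number(), for rows containing a timestamp tie."""
    # Assumption: an in-memory SQLite database stands in for MySQL 8+.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t ("
        "customer_email text, increment_id integer, created_at text)"
    )
    # Hypothetical data: rows 1 and 2 share the same created_at value.
    conn.executemany(
        "insert into t values (?, ?, ?)",
        [
            ("joe@test.com", 1, "2019-11-01 00:00:00"),
            ("joe@test.com", 2, "2019-11-01 00:00:00"),
            ("joe@test.com", 3, "2019-11-01 00:05:00"),
        ],
    )
    sql = """
        select
            increment_id,
            count(*) over (
                partition by customer_email order by created_at
            ) as cnt,
            row_number() over (
                partition by customer_email
                order by created_at, increment_id
            ) as rn
        from t
        order by increment_id
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for increment_id, cnt, rn in compare_on_ties():
        print(increment_id, cnt, rn)
```

In this sketch the tied rows both get a count of 2, so a filter of cnt < 2 would drop both of them, whereas rn < 2 keeps exactly one row per email. Since increment_id is unique in the question, using it as a secondary sort key also makes the numbering deterministic.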