How to get a running count of a column value's occurrences in MySQL

Time: 2019-12-04 16:48:08

Tags: mysql sql mysql-5.6

I have the following columns:

  • customer_email
  • increment_id
  • other_id (placeholder name)
  • created_at

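increment_id and other_id will be unique, while customer_email will contain duplicates. When the results are returned, I want to know which occurrence of the email each row represents.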

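For each row, I want to know how many times the customer_email value has appeared so far. The query will end with an order by clause on the created_at field, and I also plan to add a where clause of where occurrences < 2.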

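I am querying more than 5 million rows, but performance is not too important, because I will run this as a report against a read-only replica of the production database. For my use case, I am willing to sacrifice performance for robustness.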

| customer_email | increment_id  | other_id | created_at          | occurrences <- I want this |
|----------------|---------------|----------|---------------------|----------------------------|
| joe@test.com   | 1             | 81       | 2019-11-00 00:00:00 | 1                          |
| sue@test.com   | 2             | 82       | 2019-11-00 00:01:00 | 1                          |
| bill@test.com  | 3             | 83       | 2019-11-00 00:02:00 | 1                          |
| joe@test.com   | 4             | 84       | 2019-11-00 00:03:00 | 2                          |
| mike@test.com  | 5             | 85       | 2019-11-00 00:04:00 | 1                          |
| sue@test.com   | 6             | 86       | 2019-11-00 00:05:00 | 2                          |
| joe@test.com   | 7             | 87       | 2019-11-00 00:06:00 | 3                          |

2 answers:

Answer 0: (score: 1)

If you are running MySQL 8.0, you can use a window count:

select 
    t.*,
    count(*) over(partition by customer_email order by created_at) occurences 
from mytable t

The query does not need an order by clause at the end in order to work (but you do need one if you want the results sorted).
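One detail worth keeping in mind with this window count: when a window has an order by but no explicit frame, the default frame extends through all peers of the current row, so two rows for the same customer_email with an identical created_at would both receive the same, higher count. The variant below is a minimal sketch rather than part of the original answer. It assumes increment_id is unique, as the question states, uses it as a tiebreaker, and also sorts the final result:

```sql
-- Sketch only: mytable and the column names follow the question.
-- increment_id is assumed unique (per the question) and breaks ties
-- between rows that share the same created_at value.
select
    t.*,
    count(*) over(
        partition by customer_email
        order by created_at, increment_id
    ) occurences
from mytable t
order by created_at, increment_id;
```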

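If you need to filter on the result of the window count, an additional level of nesting is required, because window functions cannot be used in the where clause of a query: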

select *
from (
    select 
        t.*,
        count(*) over(partition by customer_email order by created_at) occurences 
    from mytable t
) t
where occurences < 2

Answer 1: (score: 1)

In earlier versions of MySQL, you can use variables:

select t.*,
       (@rn := if(@ce = customer_email, @rn + 1,
                  if(@ce := customer_email, 1, 1)
                 )
       ) as occurrences
from (select t.*
      from t
      order by customer_email, created_at
     ) t cross join
     (select @ce := '', @rn := 0) params;
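The variable approach depends on the order in which MySQL evaluates expressions involving user variables, which the MySQL manual describes as undefined, and assigning variables inside expressions is deprecated as of MySQL 8.0. If that is a concern on MySQL 5.6, a correlated subquery is one possible alternative. The sketch below is not from the answer: it assumes the same table t, uses increment_id (unique, per the question) to break ties on created_at, and wraps the count in a derived table so it can be filtered. It is likely to be slow on millions of rows unless something like an index on (customer_email, created_at, increment_id) exists, which is an assumption and not something the question mentions.

```sql
-- Sketch only: counts, for each row, the rows with the same email
-- that come at or before it in (created_at, increment_id) order.
select *
from (
    select t.*,
           (select count(*)
            from t prev   -- "prev" is an illustrative alias
            where prev.customer_email = t.customer_email
              and (prev.created_at < t.created_at
                   or (prev.created_at = t.created_at
                       and prev.increment_id <= t.increment_id))
           ) as occurrences
    from t
) x
where occurrences < 2
order by created_at;
```

Dropping the outer where clause returns every row with its running count instead of only the first occurrence of each email.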

In MySQL 8+, I recommend row_number():

select t.*,
       row_number() over (partition by customer_email order by created_at) as occurrences
from t;
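To try these queries against the question's sample rows, a setup along the following lines could be used. This is a sketch under stated assumptions: the column types and keys are guesses, since the question gives only column names, and the timestamps use day 01 instead of the question's day 00, because a zero day can be rejected under strict SQL modes.

```sql
-- Hypothetical schema: types and keys are assumptions, not from the question.
create table t (
    customer_email varchar(255) not null,
    increment_id   int          not null,
    other_id       int          not null,
    created_at     datetime     not null,
    primary key (increment_id),
    unique key (other_id)
);

-- Sample rows from the question, with the day changed from 00 to 01.
insert into t (customer_email, increment_id, other_id, created_at) values
    ('joe@test.com',  1, 81, '2019-11-01 00:00:00'),
    ('sue@test.com',  2, 82, '2019-11-01 00:01:00'),
    ('bill@test.com', 3, 83, '2019-11-01 00:02:00'),
    ('joe@test.com',  4, 84, '2019-11-01 00:03:00'),
    ('mike@test.com', 5, 85, '2019-11-01 00:04:00'),
    ('sue@test.com',  6, 86, '2019-11-01 00:05:00'),
    ('joe@test.com',  7, 87, '2019-11-01 00:06:00');
```

With this data, the row_number() query should give joe@test.com the values 1, 2 and 3 in created_at order, sue@test.com 1 and 2, and the remaining emails 1, matching the occurrences column in the question.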