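Limiting GROUP BY based on COUNT() values in MySQL

Date: 2016-09-05 19:26:58

Tags: php mysql sql rdbms

I am logging events into a MySQL database and would like to get the top 3 events for monitoring purposes.

My table eventlog looks like this: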

+----+------------------+---------------------+
| id |    eventname     |      eventdate      |
+----+------------------+---------------------+
|  0 | machine1.started | 2016-09-04 19:22:23 |
|  1 | machine2.reboot  | 2016-09-04 20:23:11 |
|  2 | machine1.stopped | 2016-09-04 20:24:12 |
|  3 | machine1.started | 2016-09-04 20:25:12 |
|  4 | machine1.stopped | 2016-09-04 23:23:16 |
|  5 | machine0.started | 2016-09-04 23:24:00 |
|  6 | machine1.started | 2016-09-04 23:24:16 |
|  7 | machine3.started | 2016-09-04 23:25:00 |
|  8 | machine4.started | 2016-09-04 23:26:00 |
|  9 | cluster.alive    | 2016-09-04 23:30:00 |
| 10 | cluster.alive    | 2016-09-05 11:30:00 |
+----+------------------+---------------------+

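The query should eventually return the following:

  • the top 3 most frequent events (based on the eventcount column, which is produced by MySQL's COUNT() function), grouped by their eventname
  • only 2 rows where eventcount = 1, and only if 1 is among the top 3 eventcounts (many events occur only once, and listing them all would overload my frontend)

Example of the desired result, based on the table above: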

+------------+------------------+
| eventcount |    eventname     |
+------------+------------------+
|          3 | machine1.started |
|          2 | machine1.stopped |
|          2 | cluster.alive    |
|          1 | machine0.started |
|          1 | machine2.reboot  |
+------------+------------------+

Please note that I don't need exactly 3 rows returned; I need the rows that have the 3 highest eventcounts.
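The selection rule above can be restated procedurally. The following is a minimal sketch in plain Python (not SQL), meant only to pin down the expected rows. The function name `top_events` and its parameters `top_n` and `max_singles` are hypothetical, and ties are broken alphabetically by eventname, which is an assumption the question does not specify.

```python
from collections import Counter

# Event names copied from the eventlog table above, in id order.
EVENTS = [
    "machine1.started", "machine2.reboot", "machine1.stopped",
    "machine1.started", "machine1.stopped", "machine0.started",
    "machine1.started", "machine3.started", "machine4.started",
    "cluster.alive", "cluster.alive",
]

def top_events(eventnames, top_n=3, max_singles=2):
    """Return (eventcount, eventname) rows for the top_n distinct counts,
    keeping at most max_singles rows whose count is 1."""
    counts = Counter(eventnames)
    # The top_n highest distinct counts, e.g. [3, 2, 1] for the sample data.
    top_counts = sorted(set(counts.values()), reverse=True)[:top_n]
    rows, singles = [], 0
    # Highest count first; alphabetical tie-break (an assumption).
    for name, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count not in top_counts:
            continue
        if count == 1:
            if singles >= max_singles:
                continue
            singles += 1
        rows.append((count, name))
    return rows

if __name__ == "__main__":
    for row in top_events(EVENTS):
        print(row)
```

With the sample data this keeps all events with counts 3 and 2, plus two of the four events that occurred once.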

I have experimented a lot with the query below, including multiple selects and questionable CASE ... WHEN conditions, but could not get it to work the way I need.

SELECT COUNT(id) AS 'eventcount', eventname
FROM eventlog
GROUP BY eventname
ORDER BY eventcount DESC;

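What is the best way to get the desired result efficiently?

5 answers:

Answer 0 (score: 2)

These kinds of conditions are painful in MySQL. One approach uses variables. Here is one that does not: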

SELECT el.eventcount, el.eventname
FROM (SELECT COUNT(el.id) AS eventcount, el.eventname
      FROM eventlog el
      GROUP BY el.eventname
     ) el JOIN
     (SELECT cnt
      FROM (SELECT DISTINCT COUNT(el.id) as cnt
            FROM eventlog el
            GROUP BY el.eventname
           ) el
      ORDER BY cnt DESC
      LIMIT 3
     ) ell
     ON ell.cnt = el.eventcount
ORDER BY el.eventcount DESC;

Edit:

A solution using variables looks like this, and it includes the limit of 2 rows for a count of 1:

SELECT *
FROM (SELECT e.*,
             (@rn1 := if(@c1 = eventcount, @rn1 + 1,
                         if(@c1 := eventcount, 1, 1)
                        )
             ) as rn
      FROM (SELECT e.*,
                   (@rn := if(@c = eventcount, @rn,
                              if(@c := eventcount, @rn + 1, @rn + 1)
                             )
                   ) as rank
            FROM (SELECT COUNT(el.id) AS eventcount, el.eventname
                  FROM eventlog el
                  GROUP BY el.eventname
                 ) e CROSS JOIN
                 (SELECT @c := 0, @rn := 0) params
            ORDER BY eventcount DESC
           ) e CROSS JOIN
           (SELECT @c1 := 0, @rn1 := 0) params
      ORDER BY eventcount DESC
     ) e
WHERE rank <= 3 AND
      (eventcount > 1 OR rn <= 2);

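The innermost subquery ranks the distinct counts (rank). The next level numbers the rows within each count (rn). The two could probably be merged into a single subquery, but that takes care.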
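On servers with window functions (MySQL 8.0 and later), the same two numberings can be written with DENSE_RANK() and ROW_NUMBER() instead of user variables. The sketch below is not part of the original answer. It runs the query through Python's sqlite3 module as a stand-in for MySQL (this assumes SQLite 3.25 or later), omits the eventdate column, and uses the hypothetical aliases `cnt_rank`, `rn`, `counts` and `ranked`. The name `cnt_rank` avoids `rank`, which is a reserved word in MySQL 8.0.

```python
import sqlite3

# Sample rows copied from the question's eventlog table (id, eventname).
ROWS = [
    (0, "machine1.started"), (1, "machine2.reboot"), (2, "machine1.stopped"),
    (3, "machine1.started"), (4, "machine1.stopped"), (5, "machine0.started"),
    (6, "machine1.started"), (7, "machine3.started"), (8, "machine4.started"),
    (9, "cluster.alive"), (10, "cluster.alive"),
]

QUERY = """
SELECT eventcount, eventname
FROM (
    SELECT eventcount, eventname,
           -- rank of the distinct count value: 1 for the highest count
           DENSE_RANK() OVER (ORDER BY eventcount DESC) AS cnt_rank,
           -- position of the row among rows sharing the same count
           ROW_NUMBER() OVER (PARTITION BY eventcount ORDER BY eventname) AS rn
    FROM (
        SELECT COUNT(id) AS eventcount, eventname
        FROM eventlog
        GROUP BY eventname
    ) AS counts
) AS ranked
WHERE cnt_rank <= 3
  AND (eventcount > 1 OR rn <= 2)
ORDER BY eventcount DESC, eventname
"""

def top_events(rows):
    """Load rows into an in-memory table and run the window-function query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE eventlog (id INTEGER PRIMARY KEY, eventname TEXT)")
    conn.executemany("INSERT INTO eventlog VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for eventcount, eventname in top_events(ROWS):
        print(eventcount, eventname)
```

The ORDER BY eventname inside ROW_NUMBER() decides which two count-1 events survive. Any other deterministic ordering, such as the latest eventdate, would work equally well.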

Answer 1 (score: 2)

Here is one way to do this using variables. SQL Fiddle: http://sqlfiddle.com/#!9/b3458b/16

SELECT
  t2.eventcount
  ,t2.eventname
FROM
(
  SELECT
      t.eventname
      ,t.eventcount
      ,@Rank:=IF(@PrevCount=t.eventcount,@Rank,@Rank+1) Rank
      ,@CountRownum:=IF(@PrevCount=t.eventcount,@CountRowNum + 1,1) CountRowNum
      ,@PrevCount:= t.eventcount
    FROM
      (
        SELECT
          l.eventname
          ,COUNT(*) as eventcount
        FROM
          eventlog l
        GROUP BY
          l.eventname
        ORDER BY
          COUNT(*) DESC
      ) t
      CROSS JOIN (SELECT @Rank:=0, @CountRowNum:=0, @PrevCount:=-1) var
    ORDER BY
      t.eventcount DESC
) t2
WHERE
  t2.Rank < 4
  AND NOT (t2.eventcount = 1 AND t2.CountRowNum > 2)

Answer 2 (score: 0)

This could probably be refactored a bit, but it returns the correct answer as it stands:

SELECT eventcount, eventname
FROM
(SELECT el.eventcount, el.eventname
FROM (SELECT COUNT(el.id) AS eventcount, el.eventname
      FROM eventlog el
      GROUP BY el.eventname
     ) el JOIN
     (SELECT counts
      FROM (SELECT DISTINCT COUNT(el.id) as counts
            FROM eventlog el
            GROUP BY el.eventname
           ) el
      ORDER BY counts DESC
      LIMIT 3
     ) el2
     ON el2.counts = el.eventcount
     WHERE el.eventcount != 1
UNION ALL
(SELECT el.eventcount, el.eventname
FROM (SELECT COUNT(el.id) AS eventcount, el.eventname
      FROM eventlog el
      GROUP BY el.eventname
     ) el JOIN
     (SELECT counts
      FROM (SELECT DISTINCT COUNT(el.id) as counts
            FROM eventlog el
            GROUP BY el.eventname
           ) el
      ORDER BY counts DESC
      LIMIT 3
     ) el2
     ON el2.counts = el.eventcount AND el2.counts = 1
     LIMIT 2)) tmp
ORDER BY tmp.eventcount DESC;

SQL Fiddle: http://sqlfiddle.com/#!9/10f0d/92

Answer 3 (score: 0)

If you can use temporary tables...

Precompute the event counts and store the result in a temporary table:

create temporary table tmp_eventcounts
  select eventname, count(1) as eventcount
  from eventlog
  group by eventname
  order by eventcount desc
;

Contents of tmp_eventcounts:

|        eventname | eventcount |
|------------------|------------|
| machine1.started |          3 |
| machine1.stopped |          2 |
|    cluster.alive |          2 |
| machine3.started |          1 |
|  machine2.reboot |          1 |
| machine4.started |          1 |
| machine0.started |          1 |

Select the top 3 event counts and store them in another temporary table:

create temporary table tmp_top3counts
  select distinct eventcount
  from tmp_eventcounts
  order by eventcount desc
  limit 3
;

Contents of tmp_top3counts:

| eventcount |
|------------|
|          3 |
|          2 |
|          1 |

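Now select all event names whose eventcount is among the top 3 counts and is greater than 1. Also select at most two event names whose eventcount is among the top 3 counts and equals 1. Combine the two results with UNION ALL: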

select eventcount, eventname
from tmp_top3counts
join tmp_eventcounts using(eventcount)
where eventcount > 1
union all (
  select eventcount, eventname
  from tmp_top3counts
  join tmp_eventcounts using(eventcount)
  where eventcount = 1
  limit 2
)
order by eventcount desc;

Result:

| eventcount |        eventname |
|------------|------------------|
|          3 | machine1.started |
|          2 | machine1.stopped |
|          2 |    cluster.alive |
|          1 |  machine2.reboot |
|          1 | machine3.started |

http://sqlfiddle.com/#!9/b332df/1

If you can't use temporary tables, you can replace each occurrence with its definition, which produces a highly unreadable but working query:

select eventcount, eventname
from (
  select distinct eventcount
  from (
    select eventname, count(1) as eventcount
    from eventlog
    group by eventname
  ) tmp_eventcounts
  order by eventcount desc
  limit 3  
) tmp_top3counts
join (
  select eventname, count(1) as eventcount
  from eventlog
  group by eventname
) tmp_eventcounts using(eventcount)
where eventcount > 1
union all (
  select eventcount, eventname
  from (
    select distinct eventcount
    from (
      select eventname, count(1) as eventcount
      from eventlog
      group by eventname
    ) tmp_eventcounts
    order by eventcount desc
    limit 3
  ) tmp_top3counts
  join (
    select eventname, count(1) as eventcount
    from eventlog
    group by eventname
  ) tmp_eventcounts using(eventcount)
  where eventcount = 1
  limit 2
)
order by eventcount desc;

http://sqlfiddle.com/#!9/2eea6/4 ;-)

While this may look insane, it is easy to generate in PHP:

$tmp_eventcounts = "
    select eventname, count(1) as eventcount
    from eventlog
    group by eventname
";

$tmp_top3counts = "
    select distinct eventcount
    from ( {$tmp_eventcounts} ) tmp_eventcounts
    order by eventcount desc
    limit 3
";

$sql = "
    select eventcount, eventname
    from ( {$tmp_top3counts} )  tmp_top3counts
    join ( {$tmp_eventcounts} ) tmp_eventcounts using(eventcount)
    where eventcount > 1
    union all (
      select eventcount, eventname
      from ( {$tmp_top3counts} )  tmp_top3counts
      join ( {$tmp_eventcounts} ) tmp_eventcounts using(eventcount)
      where eventcount = 1
      limit 2
    )
    order by eventcount desc
";

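Note: It looks as if MySQL has to execute the same subqueries over and over. However, it should be able to cache the results and reuse them; running EXPLAIN on the query is a way to check what your server actually does.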
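Where common table expressions are available (MySQL 8.0 and later), the two definitions can be named once with WITH instead of being pasted in repeatedly. The following is a sketch under assumptions, not part of the original answer. It is run through Python's sqlite3 module as a stand-in for MySQL, the eventdate column is omitted, and the parenthesized UNION branch is rewritten as a derived table with the hypothetical alias `singles`, because SQLite does not accept parentheses around a UNION member. The function name `top_events_cte` is also hypothetical.

```python
import sqlite3

# Sample rows copied from the question's eventlog table (id, eventname).
ROWS = [
    (0, "machine1.started"), (1, "machine2.reboot"), (2, "machine1.stopped"),
    (3, "machine1.started"), (4, "machine1.stopped"), (5, "machine0.started"),
    (6, "machine1.started"), (7, "machine3.started"), (8, "machine4.started"),
    (9, "cluster.alive"), (10, "cluster.alive"),
]

QUERY = """
WITH tmp_eventcounts AS (
    SELECT eventname, COUNT(*) AS eventcount
    FROM eventlog
    GROUP BY eventname
),
tmp_top3counts AS (
    SELECT DISTINCT eventcount
    FROM tmp_eventcounts
    ORDER BY eventcount DESC
    LIMIT 3
)
SELECT eventcount, eventname
FROM tmp_top3counts JOIN tmp_eventcounts USING (eventcount)
WHERE eventcount > 1
UNION ALL
SELECT eventcount, eventname
FROM (
    -- at most two of the events that occurred exactly once
    SELECT eventcount, eventname
    FROM tmp_top3counts JOIN tmp_eventcounts USING (eventcount)
    WHERE eventcount = 1
    ORDER BY eventname
    LIMIT 2
) AS singles
ORDER BY eventcount DESC, eventname
"""

def top_events_cte(rows):
    """Load rows into an in-memory table and run the CTE-based query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE eventlog (id INTEGER PRIMARY KEY, eventname TEXT)")
    conn.executemany("INSERT INTO eventlog VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for eventcount, eventname in top_events_cte(ROWS):
        print(eventcount, eventname)
```

The inner ORDER BY eventname makes the choice of the two count-1 rows deterministic. The temporary-table version above leaves that choice to the server, which is why its result shows different count-1 events.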

Answer 4 (score: -1)

You could try this:

SELECT count(eventname), eventname FROM eventlog
group by eventname
HAVING(count(eventname)) > 1
order by count(eventname) DESC
limit 3
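Run against the question's sample data, this query keeps only events that occur more than once and caps the output at three rows, so the two count-1 rows the question asks for never appear. A quick check, sketched with Python's sqlite3 module as a stand-in for MySQL: the table name eventlog is taken from the question, the function name `having_limit_rows` is hypothetical, and an eventname tie-break is added to ORDER BY only to make the output deterministic.

```python
import sqlite3

# Sample rows copied from the question's eventlog table (id, eventname).
ROWS = [
    (0, "machine1.started"), (1, "machine2.reboot"), (2, "machine1.stopped"),
    (3, "machine1.started"), (4, "machine1.stopped"), (5, "machine0.started"),
    (6, "machine1.started"), (7, "machine3.started"), (8, "machine4.started"),
    (9, "cluster.alive"), (10, "cluster.alive"),
]

QUERY = """
SELECT COUNT(eventname) AS eventcount, eventname
FROM eventlog
GROUP BY eventname
HAVING COUNT(eventname) > 1
ORDER BY COUNT(eventname) DESC, eventname  -- tie-break added for determinism
LIMIT 3
"""

def having_limit_rows(rows):
    """Load rows into an in-memory table and run the HAVING/LIMIT query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE eventlog (id INTEGER PRIMARY KEY, eventname TEXT)")
    conn.executemany("INSERT INTO eventlog VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for eventcount, eventname in having_limit_rows(ROWS):
        print(eventcount, eventname)
```

LIMIT 3 also restricts the number of rows rather than the number of distinct counts, so with more tied events some rows belonging to the top 3 counts would be cut off as well.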