Question

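I'm building a small project of my own, something like a website monitoring tool. I have an agent that fetches web pages and records each site's status code, the result of a content check, and the response times.

The table looks like this.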

CREATE TABLE `data` (
  `id` int(11) NOT NULL,
  `check_id` int(11) NOT NULL,
  `content_string_used` varchar(20) NOT NULL,
  `content_check` enum('good','bad') NOT NULL,
  `http_code` int(11) NOT NULL,
  `total_time` varchar(5) NOT NULL,
  `namelookup_time` varchar(5) NOT NULL,
  `connect_time` varchar(5) NOT NULL,
  `pretransfer_time` varchar(5) NOT NULL,
  `starttransfer_time` varchar(5) NOT NULL,
  `url` varchar(50) NOT NULL,
  `time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

What I want to do is select all records for a particular check, for example

SELECT * FROM `data` WHERE  `check_id` = 173;

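Now this is where it gets tricky, and I'll do my best to explain. Each row has two important columns: content_check and http_code.

What I want to do is group consecutive rows, using a change in either of these two columns as the delimiter, and then select for each group the time of its first row and the time of its last row.

Example...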

SELECT id, check_id, content_check, http_code, time from data WHERE `check_id` = 173;

Result

(15, 173, 'bad', 0, '2018-03-11 15:43:11'),
(23, 173,'bad', 0, '2018-03-11 15:44:11'),
(35, 173,'good', 0, '2018-03-11 15:45:11'),
(49, 173,'good', 0, '2018-03-11 15:46:11'),
(67, 173,'bad', 0, '2018-03-11 15:47:11'),
(85, 173,'bad', 0, '2018-03-11 15:48:11'),
(105, 173,'bad', 0, '2018-03-11 15:49:11'),
(125, 173,'good', 0, '2018-03-11 15:50:11'),
(145, 173,'bad', 0, '2018-03-11 15:51:11'),
(165, 173,'bad', 0, '2018-03-11 15:52:11');

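I'd like a query that returns something like the following instead, basically summarizing the time intervals, with the good/bad changes acting as a kind of delimiter.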

(15, 'bad', 0, '2018-03-11 15:43:11', '2018-03-11 15:44:11'),
(35, 'good', 0, '2018-03-11 15:45:11', '2018-03-11 15:46:11'),
(67, 'bad', 0, '2018-03-11 15:47:11', '2018-03-11 15:49:11'),
(125, 'good', 0, '2018-03-11 15:50:11', '2018-03-11 15:50:11'),
(145, 'bad', 0, '2018-03-11 15:51:11','2018-03-11 15:52:11'),

Please help or point me in the right direction.
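
To pin down the grouping being asked for, here is a minimal sketch in Python rather than SQL. It assumes the rows are ordered by time and that a new group starts whenever content_check or http_code changes. The function name summarize is made up for this illustration.

```python
from itertools import groupby

# Sample rows from the question: (id, check_id, content_check, http_code, time)
rows = [
    (15, 173, "bad", 0, "2018-03-11 15:43:11"),
    (23, 173, "bad", 0, "2018-03-11 15:44:11"),
    (35, 173, "good", 0, "2018-03-11 15:45:11"),
    (49, 173, "good", 0, "2018-03-11 15:46:11"),
    (67, 173, "bad", 0, "2018-03-11 15:47:11"),
    (85, 173, "bad", 0, "2018-03-11 15:48:11"),
    (105, 173, "bad", 0, "2018-03-11 15:49:11"),
    (125, 173, "good", 0, "2018-03-11 15:50:11"),
    (145, 173, "bad", 0, "2018-03-11 15:51:11"),
    (165, 173, "bad", 0, "2018-03-11 15:52:11"),
]


def summarize(rows):
    """Collapse consecutive rows that share (content_check, http_code) into
    (first_id, content_check, http_code, first_time, last_time)."""
    # Timestamps in 'YYYY-MM-DD HH:MM:SS' form sort correctly as plain text.
    ordered = sorted(rows, key=lambda r: r[4])
    result = []
    # groupby only merges adjacent items, which is exactly the "run" semantics.
    for (check, code), run in groupby(ordered, key=lambda r: (r[2], r[3])):
        run = list(run)
        result.append((run[0][0], check, code, run[0][4], run[-1][4]))
    return result


for line in summarize(rows):
    print(line)
```

This prints five tuples matching the desired result above. The point is that the grouping depends on adjacency in time order, which a plain GROUP BY on content_check and http_code cannot express.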

Answer 1

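A trick that is useful for this kind of thing is to use a pair of variables to track the content_check and http_code of the most recent record, plus a third variable holding a group number that is incremented only on records whose content_check or http_code differs from that of the previous record. For example, given the following setup: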

CREATE TABLE `data` (
    `id` int(11) NOT NULL,
    `check_id` int(11) NOT NULL,
    `content_check` enum('good','bad') NOT NULL,
    `http_code` int(11) NOT NULL,
    `time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

insert into `data`
    (`id`, `check_id`, `content_check`, `http_code`, `time`)
values
    (15, 173, 'bad', 0, '2018-03-11 15:43:11'),
    (23, 173, 'bad', 0, '2018-03-11 15:44:11'),
    (35, 173, 'good', 0, '2018-03-11 15:45:11'),
    (49, 173, 'good', 0, '2018-03-11 15:46:11'),
    (67, 173, 'bad', 0, '2018-03-11 15:47:11'),
    (85, 173, 'bad', 0, '2018-03-11 15:48:11'),
    (105, 173, 'bad', 0, '2018-03-11 15:49:11'),
    (125, 173, 'good', 0, '2018-03-11 15:50:11'),
    (145, 173, 'bad', 0, '2018-03-11 15:51:11'),
    (165, 173, 'bad', 0, '2018-03-11 15:52:11');

-- Start with no previous row; NULL never compares equal, so the first row opens group 1.
set @lastContentCheck = null;
set @lastHttpCode = null;

I can write the following query, which assigns group numbers as described above:

select
    `id`,
    `check_id`,
    @groupNumber :=
        case
            when @lastContentCheck = `content_check` and @lastHttpCode = `http_code` then @groupNumber
            else @groupNumber + 1
        end as `GroupNumber`,
    @lastContentCheck := `content_check` as `content_check`,
    @lastHttpCode := `http_code` as `http_code`,
    `time`
from
    `data`,
    (select @groupNumber := 0) as `gn`
where
    `check_id` = 173
order by
    `time`

The output of this query is:

id   check_id  GroupNumber  content_check  http_code  time
15   173       1            bad            0          2018-03-11 15:43:11
23   173       1            bad            0          2018-03-11 15:44:11
35   173       2            good           0          2018-03-11 15:45:11
49   173       2            good           0          2018-03-11 15:46:11
67   173       3            bad            0          2018-03-11 15:47:11
85   173       3            bad            0          2018-03-11 15:48:11
105  173       3            bad            0          2018-03-11 15:49:11
125  173       4            good           0          2018-03-11 15:50:11
145  173       5            bad            0          2018-03-11 15:51:11
165  173       5            bad            0          2018-03-11 15:52:11
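
The numbering logic can also be written out procedurally. The sketch below is only an illustration in Python, not part of the MySQL solution. The function name assign_group_numbers is hypothetical, and the variable last stands in for the @lastContentCheck and @lastHttpCode pair.

```python
def assign_group_numbers(states):
    """Return one group number per state; the number increases whenever
    a state differs from the one immediately before it."""
    group_number = 0
    last = None  # no previous row yet, so the first row always opens group 1
    numbers = []
    for state in states:
        if state != last:
            group_number += 1
        last = state
        numbers.append(group_number)
    return numbers


# (content_check, http_code) for the ten sample rows, in time order
states = [
    ("bad", 0), ("bad", 0), ("good", 0), ("good", 0), ("bad", 0),
    ("bad", 0), ("bad", 0), ("good", 0), ("bad", 0), ("bad", 0),
]
print(assign_group_numbers(states))  # [1, 1, 2, 2, 3, 3, 3, 4, 5, 5]
```

The printed list matches the GroupNumber column above.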

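At this point, you can get the desired result set simply by wrapping the previous query in an outer query that groups its data by GroupNumber. So the whole thing looks like this: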

select
    min(`id`) as `id`,
    `check_id`,
    `content_check`,
    `http_code`,
    min(`time`) as `EarliestTime`,
    max(`time`) as `LatestTime`
from
    (
        select
            `id`,
            `check_id`,
            @groupNumber :=
                case
                    when @lastContentCheck = `content_check` and @lastHttpCode = `http_code` then @groupNumber
                    else @groupNumber + 1
                end as `GroupNumber`,
            @lastContentCheck := `content_check` as `content_check`,
            @lastHttpCode := `http_code` as `http_code`,
            `time`
        from
            `data`,
            (select @groupNumber := 0) as `gn`
        where
            `check_id` = 173
        order by
            `time`
    ) as `GroupedData`
group by
    `check_id`,
    `GroupNumber`,
    `content_check`,
    `http_code`
order by
    `GroupNumber`;

The result is what you asked for:

id   check_id  content_check  http_code  EarliestTime         LatestTime
15   173       bad            0          2018-03-11 15:43:11  2018-03-11 15:44:11
35   173       good           0          2018-03-11 15:45:11  2018-03-11 15:46:11
67   173       bad            0          2018-03-11 15:47:11  2018-03-11 15:49:11
125  173       good           0          2018-03-11 15:50:11  2018-03-11 15:50:11
145  173       bad            0          2018-03-11 15:51:11  2018-03-11 15:52:11

Demo on sqltest.net
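
On MySQL 8.0 or later, window functions are available, so the same grouping could be expressed without user variables. The sketch below is an assumption-laden illustration, not a tested MySQL script. It runs the query against an in-memory SQLite database (3.25 or newer) through Python's sqlite3 module as a stand-in, and the names is_new_group and group_number are made up here. The SQL uses only LAG, SUM ... OVER and common table expressions, so it should need little or no change for MySQL 8.0, but verify it there before relying on it.

```python
import sqlite3

# Sample rows from the question: (id, check_id, content_check, http_code, time)
ROWS = [
    (15, 173, "bad", 0, "2018-03-11 15:43:11"),
    (23, 173, "bad", 0, "2018-03-11 15:44:11"),
    (35, 173, "good", 0, "2018-03-11 15:45:11"),
    (49, 173, "good", 0, "2018-03-11 15:46:11"),
    (67, 173, "bad", 0, "2018-03-11 15:47:11"),
    (85, 173, "bad", 0, "2018-03-11 15:48:11"),
    (105, 173, "bad", 0, "2018-03-11 15:49:11"),
    (125, 173, "good", 0, "2018-03-11 15:50:11"),
    (145, 173, "bad", 0, "2018-03-11 15:51:11"),
    (165, 173, "bad", 0, "2018-03-11 15:52:11"),
]

QUERY = """
WITH flagged AS (
    -- 1 when the row differs from the previous row (or has no previous row)
    SELECT id, check_id, content_check, http_code, time,
           CASE WHEN content_check = LAG(content_check) OVER (ORDER BY time)
                 AND http_code = LAG(http_code) OVER (ORDER BY time)
                THEN 0 ELSE 1 END AS is_new_group
    FROM data
    WHERE check_id = 173
),
numbered AS (
    -- running total of the flags gives one number per consecutive run
    SELECT *, SUM(is_new_group) OVER (ORDER BY time) AS group_number
    FROM flagged
)
SELECT MIN(id), content_check, http_code, MIN(time), MAX(time)
FROM numbered
GROUP BY group_number, content_check, http_code
ORDER BY group_number
"""


def run_query():
    """Load the sample rows into an in-memory database and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data (id INTEGER, check_id INTEGER, "
        "content_check TEXT, http_code INTEGER, time TEXT)"
    )
    conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in run_query():
    print(row)
```

Unlike the user-variable version, this form does not depend on the order in which the server evaluates expressions in the select list.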

Answer 2

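Bear in mind that I'm not a MySQL expert. I've read that it doesn't support window functions (which holds for versions before 8.0), so I tried to emulate them with subselects. I don't know whether it supports common table expressions either, so I tried to write a SELECT statement that avoids engine-specific features. Note that the square brackets around time are SQL Server identifier quoting; in MySQL use backticks instead. Try this: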

create table table_test (id int, check_id int, content_check varchar(10), http_code int, [time] datetime)

insert into table_test values
(15, 173, 'bad', 0, '2018-03-11 15:43:11'),
(23, 173,'bad', 0, '2018-03-11 15:44:11'),
(35, 173,'good', 0, '2018-03-11 15:45:11'),
(49, 173,'good', 0, '2018-03-11 15:46:11'),
(67, 173,'bad', 0, '2018-03-11 15:47:11'),
(85, 173,'bad', 0, '2018-03-11 15:48:11'),
(105, 173,'bad', 0, '2018-03-11 15:49:11'),
(125, 173,'good', 0, '2018-03-11 15:50:11'),
(145, 173,'bad', 0, '2018-03-11 15:51:11'),
(165, 173,'bad', 0, '2018-03-11 15:52:11');

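select min(g.id) as id, g.content_check, g.http_code,
       min(g.[time]) as start_time, max(g.[time]) as end_time
from (
  -- For each row, count the rows up to and including it (by time) whose
  -- content_check or http_code differs from its own. That count is the same
  -- for every row in a consecutive run, so it serves as the group key.
  select t1.id, t1.content_check, t1.http_code, t1.[time],
         count(*) - sum(case when t2.content_check = t1.content_check
                              and t2.http_code = t1.http_code
                             then 1 else 0 end) as grp
  from table_test t1
    join table_test t2
      on t1.[time] >= t2.[time]
  group by t1.id, t1.content_check, t1.http_code, t1.[time]
  ) g
group by g.content_check, g.http_code, g.grp
order by min(g.[time])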

SQL: group by a time-interval column?
