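I'm trying to query some information from a large data set of connections between clients and servers. Below is sample data for the relevant columns of the table (connection_stats):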
+---------------------+-----------+-----------+---------+
| timestamp           | client_id | server_id | status  |
+---------------------+-----------+-----------+---------+
| 2013-07-06 10:40:30 | 100       | 800       | SUCCESS |
| 2013-07-06 10:40:50 | 101       | 801       | FAILED  |
| 2013-07-06 10:42:00 | 100       | 800       | ABORTED |
| 2013-07-06 10:43:30 | 100       | 801       | SUCCESS |
| 2013-07-06 10:56:00 | 100       | 800       | FAILED  |
+---------------------+-----------+-----------+---------+
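In this table, I'm trying to find all instances of connection status 'ABORTED' immediately followed by connection status 'FAILED' (in timestamp order) for each client_id, server_id pair. I want to get both records: the one with status 'ABORTED' and the one with status 'FAILED'. The sample above contains one such case: for the 100, 800 pair, a 'FAILED' status comes immediately after 'ABORTED'.
I'm new to SQL and databases, and I'm completely lost on this one. Any suggestions on how to approach it would be much appreciated.
The database is MySQL.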
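For anyone who wants to try the queries in the answers, the sample above can be loaded with something like the following. This is only a sketch: the column types (DATETIME, INT, VARCHAR(16)) are assumptions, since the question does not give the actual table definition.

```sql
-- Assumed schema: the real column types are not stated in the question.
CREATE TABLE connection_stats (
    `timestamp` DATETIME    NOT NULL,
    client_id   INT         NOT NULL,
    server_id   INT         NOT NULL,
    status      VARCHAR(16) NOT NULL
);

-- The five sample rows from the table above.
INSERT INTO connection_stats (`timestamp`, client_id, server_id, status) VALUES
    ('2013-07-06 10:40:30', 100, 800, 'SUCCESS'),
    ('2013-07-06 10:40:50', 101, 801, 'FAILED'),
    ('2013-07-06 10:42:00', 100, 800, 'ABORTED'),
    ('2013-07-06 10:43:30', 100, 801, 'SUCCESS'),
    ('2013-07-06 10:56:00', 100, 800, 'FAILED');
```

With this data, the desired result is the two (100, 800) rows at 10:42:00 (ABORTED) and 10:56:00 (FAILED).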
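Answer 0 (score: 2)
Admittedly not very elegant, but it is something I could get working straight away in MySQL, which has no CTEs or ranking functions (true of versions before 8.0), and without a guaranteed unique row id to work with.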
SELECT aborted.* FROM Table1 aborted JOIN Table1 failed
ON aborted.server_id = failed.server_id
AND aborted.client_id = failed.client_id
AND aborted.timestamp < failed.timestamp
LEFT JOIN Table1 filler
ON filler.server_id = aborted.server_id
AND filler.client_id = aborted.client_id
AND aborted.timestamp < filler.timestamp
AND filler.timestamp < failed.timestamp
WHERE filler.timestamp IS NULL
AND aborted.status = 'ABORTED' AND failed.status = 'FAILED'
UNION
SELECT failed.* FROM Table1 aborted JOIN Table1 failed
ON aborted.server_id = failed.server_id
AND aborted.client_id = failed.client_id
AND aborted.timestamp < failed.timestamp
LEFT JOIN Table1 filler
ON filler.server_id = aborted.server_id
AND filler.client_id = aborted.client_id
AND aborted.timestamp < filler.timestamp
AND filler.timestamp < failed.timestamp
WHERE filler.timestamp IS NULL
AND aborted.status = 'ABORTED' AND failed.status = 'FAILED'
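If you're happy with getting both records in a single row, you can simply select the fields you want from aborted/failed and skip the second half of the union entirely (i.e. the query is cut in half).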
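A minimal sketch of that single-row variant, reusing the joins from the first half of the query above. The output aliases aborted_at and failed_at are hypothetical names chosen for illustration.

```sql
-- One row per ABORTED -> FAILED pair, with both timestamps side by side.
SELECT aborted.client_id,
       aborted.server_id,
       aborted.timestamp AS aborted_at,   -- hypothetical alias
       failed.timestamp  AS failed_at     -- hypothetical alias
FROM Table1 aborted
JOIN Table1 failed
  ON aborted.server_id = failed.server_id
 AND aborted.client_id = failed.client_id
 AND aborted.timestamp < failed.timestamp
-- filler matches any record of the same pair strictly between the two;
-- requiring it to be NULL keeps only directly adjacent records.
LEFT JOIN Table1 filler
  ON filler.server_id = aborted.server_id
 AND filler.client_id = aborted.client_id
 AND aborted.timestamp < filler.timestamp
 AND filler.timestamp < failed.timestamp
WHERE filler.timestamp IS NULL
  AND aborted.status = 'ABORTED'
  AND failed.status = 'FAILED';
```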
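Since I got a comment on the UNION, here is the same thing using a JOIN, assuming timestamps are unique per client/server combination (a unique row id would help here):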
SELECT * FROM Table1 t JOIN
(
SELECT
aborted.server_id asid, aborted.client_id acid, aborted.timestamp ats,
failed.server_id fsid, failed.client_id fcid, failed.timestamp fts
FROM Table1 aborted JOIN Table1 failed
ON aborted.server_id = failed.server_id
AND aborted.client_id = failed.client_id
AND aborted.timestamp < failed.timestamp
LEFT JOIN Table1 filler
ON filler.server_id = aborted.server_id
AND filler.client_id = aborted.client_id
AND aborted.timestamp < filler.timestamp
AND filler.timestamp < failed.timestamp
WHERE filler.timestamp IS NULL
AND aborted.status = 'ABORTED' AND failed.status = 'FAILED'
) u
WHERE t.server_id=asid AND t.client_id=acid AND t.timestamp=ats
OR t.server_id=fsid AND t.client_id=fcid AND t.timestamp=fts
ORDER BY timestamp
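Answer 1 (score: 1)
I'm answering this question (albeit late) because I want to offer a more general approach. MySQL (before version 8.0) has no lag() or lead() functions, but you can emulate them with a subquery. The idea is to look up the next timestamp for the client_id/server_id pair, then join back to the original data to get the full record. This lets you pull as many columns as you like from the "next" record. It also lets you handle more complex relationships (for example, the 'FAILED' must occur within 3 minutes):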
select cs.*, csnext.timestamp as nextTimeStamp, csnext.status as nextStatus
from (select cs.*,
(select timestamp
from connection_stats cs2
where cs2.client_id = cs.client_id and
cs2.server_id = cs.server_id and
cs2.timestamp > cs.timestamp
order by cs2.timestamp
limit 1
) as Nexttimestamp
from connection_stats cs
) cs join
connection_stats csnext
on csnext.client_id = cs.client_id and
csnext.server_id = cs.server_id and
csnext.timestamp = cs.nexttimestamp
where cs.status = 'ABORTED' and
csnext.status = 'FAILED'
An index on connection_stats(client_id, server_id, timestamp) can greatly improve the performance of this kind of query.
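For readers on MySQL 8.0 or later, where window functions are available, the same "next record" lookup can be sketched with LEAD(). This is not the approach of the answer above, just an alternative under that version assumption; the names next_timestamp, next_status, paired and w are made up for the example.

```sql
-- Assumes MySQL 8.0+ (window functions).
SELECT *
FROM (
    SELECT cs.*,
           -- value taken from the following row of the same client/server pair
           LEAD(`timestamp`) OVER w AS next_timestamp,
           LEAD(status)      OVER w AS next_status
    FROM connection_stats cs
    WINDOW w AS (PARTITION BY client_id, server_id ORDER BY `timestamp`)
) paired
WHERE status = 'ABORTED'
  AND next_status = 'FAILED'
  -- Optional extra condition, e.g. the failure must follow within 3 minutes:
  -- AND next_timestamp <= `timestamp` + INTERVAL 3 MINUTE
;
```

Each result row is the ABORTED record with the timestamp and status of the record that follows it, so no join back to the table is needed unless more columns of the FAILED record are wanted.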
Answer 2 (score: 0)
Not very elegant, but it should work. Based on GROUP_CONCAT():
SELECT client_id, server_id,
       GROUP_CONCAT(status ORDER BY `timestamp`) AS all_statuses  -- concatenate in time order
FROM statuses
GROUP BY client_id, server_id
HAVING all_statuses LIKE '%ABORTED,FAILED%'
ORDER BY client_id, server_id
Answer 3 (score: 0)
SELECT * FROM `table` t1, `table` t2 WHERE t1.server_id = t2.server_id AND t1.status = 'ABORTED' AND t2.status = 'FAILED'
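Answer 4 (score: 0)
You can group the statuses and match on their order:
SELECT client_id, server_id,
       GROUP_CONCAT(status ORDER BY `timestamp`) AS abort_fail  -- concatenate in time order
FROM `table`
GROUP BY client_id, server_id
-- '=' matches only pairs whose whole history is exactly ABORTED then FAILED
HAVING abort_fail = 'ABORTED,FAILED'
When using GROUP_CONCAT, keep in mind that the result length is limited (group_concat_max_len, 1024 by default), so you need to take care of that.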
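The length limit is governed by the group_concat_max_len system variable, which can be raised for the current session before running the query. A minimal sketch, assuming the table is the question's connection_stats and that 1000000 is large enough for the longest status history (both are assumptions for illustration):

```sql
-- Raise the GROUP_CONCAT result limit for this session only.
SET SESSION group_concat_max_len = 1000000;

SELECT client_id, server_id,
       -- ORDER BY inside GROUP_CONCAT makes the concatenation follow time order
       GROUP_CONCAT(status ORDER BY `timestamp` SEPARATOR ',') AS all_statuses
FROM connection_stats
GROUP BY client_id, server_id
HAVING all_statuses LIKE '%ABORTED,FAILED%';
```

Note that this only identifies the client/server pairs that contain the sequence; it does not return the two individual records.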
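Answer 5 (score: 0)
I don't have a MySQL database to test on, but you might give something like this a shot. MIN() cannot be used directly in a join condition, so the next record for the pair is looked up with a correlated subquery.
SELECT aborted.*, failed.*
FROM connection_stats aborted
INNER JOIN connection_stats failed ON aborted.client_id = failed.client_id AND aborted.server_id = failed.server_id
  AND failed.status = 'FAILED'
  -- failed must be the very next record of the same client/server pair
  AND failed.timestamp = (SELECT MIN(nexterror.timestamp) FROM connection_stats nexterror
                          WHERE nexterror.client_id = aborted.client_id AND nexterror.server_id = aborted.server_id
                            AND nexterror.timestamp > aborted.timestamp)
WHERE aborted.status = 'ABORTED'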
Answer 6 (score: 0)
SELECT t0.clientid, t0.serverid
, t0.logtime AS abort_time
, t1.logtime AS fail_time
FROM tmp t0
JOIN tmp t1 ON t1.clientid = t0.clientid AND t1.serverid = t0.serverid
-- t1 after t0
AND t1.logtime > t0.logtime
WHERE t0. status = 'ABORTED'
AND t1. status = 'FAILED'
-- no records inbetween 'aborted' and 'failed'
-- (not even different 'aborted' and 'failed' records)
AND NOT EXISTS (
SELECT *
FROM tmp x
WHERE x.clientid = t0.clientid AND x.serverid = t0.serverid
AND x.logtime > t0.logtime
AND x.logtime < t1.logtime
)
;
Update: if you want to retrieve the two records not joined into one row, but as separate rows, you can do the following:
SELECT t0.*
FROM tmp t0
JOIN (
SELECT t1.clientid, t1.serverid
, t1.logtime AS abort_time
, t2.logtime AS fail_time
FROM tmp t1
JOIN tmp t2 ON t2.clientid = t1.clientid AND t2.serverid = t1.serverid
-- t2 after t1
AND t2.logtime > t1.logtime
WHERE t1. status = 'ABORTED'
AND t2. status = 'FAILED'
-- no records inbetween 'aborted' and 'failed'
-- (not even different 'aborted' and 'failed' records)
AND NOT EXISTS (
SELECT *
FROM tmp x
WHERE x.clientid = t1.clientid AND x.serverid = t1.serverid
AND x.logtime > t1.logtime
AND x.LOGTIME < t2.logtime
)
) two ON two.clientid = t0.clientid AND two.serverid = t0.serverid
AND (two.abort_time = t0.logtime OR two.fail_time = t0.logtime)
;
Or, equivalently, rewritten with an EXISTS clause, which is sometimes cleaner because the t1 and t2 tables don't leak into the outer query:
SELECT *
FROM tmp t0
WHERE EXISTS (
SELECT *
FROM tmp t1
JOIN tmp t2 ON t2.clientid = t1.clientid AND t2.serverid = t1.serverid
-- t2 after t1
AND t2.logtime > t1.logtime
WHERE t1. status = 'ABORTED'
AND t2. status = 'FAILED'
AND t1.clientid = t0.clientid AND t1.serverid = t0.serverid
AND (t1.logtime = t0.logtime OR t2.logtime = t0.logtime)
-- no records inbetween 'aborted' and 'failed'
-- (not even different 'aborted' and 'failed' records)
AND NOT EXISTS (
SELECT *
FROM tmp x
WHERE x.clientid = t1.clientid AND x.serverid = t1.serverid
AND x.logtime > t1.logtime
AND x.LOGTIME < t2.logtime
)
)
;