I have this table, named PacketsByDirection, that represents packets from a traffic capture (only the relevant fields are shown):
FrameNumber FrameTimeEpoch FlowID Direction
288 1430221042.150789000 29 Direction A
289 1430221042.150922000 29 Direction B
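The table currently has about 2 million rows (packets). For each packet, I need to compute the time difference between it and the previous packet that has the same Direction and the same FlowID.
I have done this with the following query, and I added indexes to the table above to make the query faster.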
SELECT t1.FrameNumber, flowid, direction,
       FrameTimeEpoch - IFNULL((
           SELECT MAX(FrameTimeEpoch)
           FROM PacketsByDirection
           WHERE flowid = t1.flowid
             AND Direction LIKE t1.Direction
             AND FrameNumber < t1.FrameNumber)
       , FrameTimeEpoch) AS TimeFromLastPacketFromSameDirection
FROM PacketsByDirection AS t1
The result looks like this:
FrameNumber FlowID Direction TimeFromLastPacketFromSameDirection
288 29 Direction A 0
289 29 Direction B 0
290 29 Direction A 5.422
291 29 Direction B 4.356
292 30 Direction A 0
293 30 Direction A 1.302
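And so on. For 600k rows this query takes about 1 hour, and I am now dealing with millions of rows, so I don't even want to try it. This is the "EXPLAIN" output for the current query (that is a lot of iterations):
So my question is: is there another, more efficient way to do this?
Thanks.
Edit: here is the table definition: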
CREATE TABLE `packetsbydirection` (
  `FrameNumber` int(11) NOT NULL DEFAULT '0',
  `FrameTimeEpoch` varchar(45) NOT NULL,
  `IPSrc` varchar(45) NOT NULL,
  `TCPSrcPort` varchar(45) DEFAULT NULL,
  `UDPSrcport` varchar(45) DEFAULT NULL,
  `IPDst` varchar(45) NOT NULL,
  `TCPDstport` varchar(45) DEFAULT NULL,
  `UDPDstport` varchar(45) DEFAULT NULL,
  `IPLength` varchar(45) NOT NULL,
  `FlowID` int(11) NOT NULL,
  `Direction` varchar(11) CHARACTER SET utf8 DEFAULT NULL,
  KEY `Index2` (`Direction`),
  KEY `Index3` (`FlowID`),
  KEY `Index4` (`FrameNumber`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Answer 0 (score: 3)
I'm not sure whether this will work, but keeping running values in user variables might be faster?
SELECT
    FrameNumber,
    -- Reset to 0 when the flow or direction changes; otherwise subtract the previous row's timestamp.
    CASE
        WHEN FlowID <> @currflow OR Direction <> @currdir THEN @diff := 0
        ELSE @diff := FrameTimeEpoch - @epoch
    END AS TimeFromLastPacketFromSameDirection,
    -- Remember this row's values for the comparison on the next row.
    @currflow := FlowID, @currdir := Direction, @diff, @epoch := FrameTimeEpoch
FROM
    packetsbydirection,
    (SELECT @epoch := 0, @currflow := "", @currdir := "", @diff := 0) AS tmp
ORDER BY FlowID, Direction, FrameTimeEpoch
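If the server is MySQL 8.0 or later (an assumption; the question does not state the version), the same "previous row in the same group" idea can be expressed with the LAG window function instead of user variables. The following is only a sketch under that assumption. It also assumes that timestamps increase with FrameNumber within a flow and direction, so "previous packet by FrameNumber" matches the MAX(FrameTimeEpoch) lookup in the question, and that Direction is never NULL. The DECIMAL(20, 9) precision is a guess based on the sample values, needed because FrameTimeEpoch is stored as a VARCHAR.

```sql
-- Sketch only: requires MySQL 8.0+ (window functions are not available in 5.x).
SELECT
    FrameNumber,
    FlowID,
    Direction,
    -- FrameTimeEpoch is a VARCHAR column, so cast it to a number before subtracting.
    -- LAG returns NULL for the first packet of each (FlowID, Direction) group;
    -- COALESCE turns that into 0, as in the expected output of the question.
    COALESCE(
        CAST(FrameTimeEpoch AS DECIMAL(20, 9))
        - LAG(CAST(FrameTimeEpoch AS DECIMAL(20, 9)))
              OVER (PARTITION BY FlowID, Direction ORDER BY FrameNumber),
        0
    ) AS TimeFromLastPacketFromSameDirection
FROM packetsbydirection
ORDER BY FrameNumber;
```

Unlike the user-variable version, this does not depend on the order in which MySQL evaluates expressions in the SELECT list, and the final ORDER BY returns rows in capture order. It has not been timed against the 2-million-row table, so it is worth checking on a subset first.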