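I have a Python application that pings devices to check whether they are alive.
Below is the query I use to list each device with its group and determine whether the device is online, based on that device's last record in PingResults: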
SELECT
c.ID, c.DeviceName, c.GroupID, c.DeviceIP, p1.Status,
p1.DateTime AS LastUpdate, DeviceGroups.GroupName
FROM
Devices AS c
INNER JOIN
PingResults AS p1 ON c.ID = p1.DeviceID
INNER JOIN
DeviceGroups ON c.GroupID = DeviceGroups.ID
LEFT OUTER JOIN
PingResults AS p2 ON c.ID = p2.DeviceID
AND (p1.DateTime < p2.DateTime OR
p1.DateTime = p2.DateTime AND p1.DeviceID < p2.DeviceID)
WHERE
(p2.ID IS NULL)
The query works, but it currently takes 1:23 minutes to run with 44,000 records in the PingResults table and 28 devices.
Query output:
DeviceName  DeviceIP       GroupID  ID   Status  LastUpdate               GroupName
Machine 25  192.168.0.226  1        114  True    2018-02-20 09:46:40.717  Machine Terminals
Machine 2   192.168.0.199  1        100  True    2018-02-20 09:48:09.113  Machine Terminals
Machine 3   192.168.0.229  1        101  True    2018-02-20 09:48:12.710  Machine Terminals
Machine 4   192.168.0.224  1        102  True    2018-02-20 09:48:15.123  Machine Terminals
Machine 5   192.168.0.218  1        103  True    2018-02-20 09:48:17.763  Machine Terminals
Machine 6   192.168.0.219  1        104  True    2018-02-20 09:48:19.823  Machine Terminals
Machine 7   192.168.0.217  1        105  False   2018-02-20 09:48:23.763  Machine Terminals
Machine 8   192.168.0.220  1        106  False   2018-02-20 09:48:26.763  Machine Terminals
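Current execution plan: (image not included)
Is there any way to optimize or rewrite this query to make it more efficient? At the current speed, it will take far too long to run once the database fills up.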
Answer 0 (score: 2)
LEFT OUTER JOIN PingResults AS p2
ON c.ID = p2.DeviceID AND (p1.DateTime < p2.DateTime
OR p1.DateTime = p2.DateTime AND p1.DeviceID < p2.DeviceID)
WHERE (p2.ID IS NULL)
-->
WHERE NOT EXISTS(SELECT 1 FROM PingResults AS p2
WHERE c.ID = p2.DeviceID AND (p1.DateTime < p2.DateTime
OR p1.DateTime = p2.DateTime AND p1.DeviceID < p2.DeviceID)
)
Create a clustered index on every table!
Then try this:
SELECT c.ID,
c.DeviceName,
c.GroupID,
c.DeviceIP,
p1.[Status],
p1.LastUpdate,
DeviceGroups.GroupName
FROM Devices AS c
INNER JOIN DeviceGroups
ON c.GroupID = DeviceGroups.ID
CROSS APPLY(
SELECT TOP 1 p1.[Status], p1.[DateTime] AS LastUpdate
FROM PingResults AS p1
WHERE p1.DeviceID = c.ID
ORDER BY p1.DateTime DESC
) p1
There is no need to "read" PingResults twice.
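Since the asker's application is written in Python, the "one lookup per device" idea can be tried out locally. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and because SQLite has no CROSS APPLY or TOP, a correlated subquery with LIMIT 1 plays that role. The table contents are made-up sample rows, and "Machine 9" is a hypothetical device with no pings.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Devices (ID INTEGER PRIMARY KEY, DeviceName TEXT);
    CREATE TABLE PingResults (ID INTEGER PRIMARY KEY, DeviceID INT,
                              Status INT, DateTime TEXT);
    INSERT INTO Devices VALUES (100, 'Machine 2'), (105, 'Machine 7'),
                               (999, 'Machine 9');
    INSERT INTO PingResults (DeviceID, Status, DateTime) VALUES
        (100, 0, '2018-02-20 09:40:00.000'),
        (100, 1, '2018-02-20 09:48:09.113'),
        (105, 1, '2018-02-20 09:41:00.000'),
        (105, 0, '2018-02-20 09:48:23.763');
""")

# For each device, fetch exactly one PingResults row: the newest one.
# The correlated subquery with LIMIT 1 is the SQLite analogue of
# CROSS APPLY (SELECT TOP 1 ... ORDER BY DateTime DESC).
latest = conn.execute("""
    SELECT c.ID, c.DeviceName, p.Status, p.DateTime AS LastUpdate
    FROM Devices AS c
    JOIN PingResults AS p
      ON p.ID = (SELECT p2.ID
                 FROM PingResults AS p2
                 WHERE p2.DeviceID = c.ID
                 ORDER BY p2.DateTime DESC, p2.ID DESC
                 LIMIT 1)
    ORDER BY c.ID
""").fetchall()

for row in latest:
    print(row)
```

As with CROSS APPLY, a device that has no rows in PingResults drops out of the result; on SQL Server, OUTER APPLY would keep such devices with NULL status.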
Another thing:
ON c.ID = p2.DeviceID
AND (p1.DateTime < p2.DateTime
OR p1.DateTime = p2.DateTime
AND p1.DeviceID < p2.DeviceID ---<<<<<<<<<<<<<<<<<<<<
)
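Really? After c.ID = p1.DeviceID and c.ID = p2.DeviceID? At that point both rows belong to the same device, so p1.DeviceID < p2.DeviceID can never be true.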
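The effect of that dead tie-breaker can be seen with a small experiment. This is a sketch only, again using Python's sqlite3 as a stand-in for SQL Server, with hypothetical rows in which two pings for the same device share a timestamp. Comparing on the row's own ID (p1.ID < p2.ID) is an assumed alternative here, not something the original query contains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PingResults (ID INTEGER PRIMARY KEY, DeviceID INT,
                              Status INT, DateTime TEXT);
    -- Rows 1 and 2 deliberately share the same timestamp (made-up data).
    INSERT INTO PingResults VALUES
        (1, 100, 1, '2018-02-20 09:48:09.113'),
        (2, 100, 0, '2018-02-20 09:48:09.113'),
        (3, 100, 1, '2018-02-20 09:40:00.000');
""")

# Anti-join "latest row" pattern with a pluggable tie-breaker predicate.
QUERY = """
    SELECT p1.ID
    FROM PingResults AS p1
    LEFT JOIN PingResults AS p2
           ON p1.DeviceID = p2.DeviceID
          AND (p1.DateTime < p2.DateTime
               OR (p1.DateTime = p2.DateTime AND {tiebreak}))
    WHERE p2.ID IS NULL
    ORDER BY p1.ID
"""

def latest_ids(tiebreak):
    """Return the IDs the anti-join keeps for the given tie-breaker."""
    return [row[0] for row in conn.execute(QUERY.format(tiebreak=tiebreak))]

# Tie-breaker from the question: always false, so both tied rows survive.
by_device_id = latest_ids("p1.DeviceID < p2.DeviceID")
# Tie-breaker on the row's own key: exactly one row survives.
by_row_id = latest_ids("p1.ID < p2.ID")

print(by_device_id, by_row_id)
```

With the DeviceID comparison, a device with tied timestamps appears twice in the output; a unique column such as the PingResults primary key breaks the tie.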
Answer 1 (score: 2)
For this query:
SELECT c.ID, c.DeviceName, c.GroupID, c.DeviceIP, p1.Status,
p1.DateTime AS LastUpdate, dg.GroupName
FROM Devices c INNER JOIN
PingResults p1
ON c.ID = p1.DeviceID INNER JOIN
DeviceGroups dg
ON c.GroupID = dg.ID LEFT OUTER JOIN
PingResults p2
ON c.ID = p2.DeviceID AND
(p1.DateTime < p2.DateTime OR
p1.DateTime = p2.DateTime AND p1.DeviceID < p2.DeviceID
)
WHERE p2.ID IS NULL;
You want to be sure that you have these indexes:
PingResults(deviceId, datetime)
DeviceGroups(id, groupname)
(minor optimization)
You may also find that writing this with row_number() speeds up the query:
SELECT c.ID, c.DeviceName, c.GroupID, c.DeviceIP, p1.Status,
p1.DateTime AS LastUpdate, dg.GroupName
FROM Devices c INNER JOIN
(select p1.*, row_number() over (partition by deviceid order by datetime desc) as seqnum -- desc so that seqnum = 1 is the most recent ping
from PingResults p1
) p1
ON c.ID = p1.DeviceID AND seqnum = 1 INNER JOIN
DeviceGroups dg
ON c.GroupID = dg.ID ;
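To check the row_number() approach before running it against the real database, here is a minimal sketch using Python's sqlite3 module (assuming SQLite 3.25 or newer, which added window functions). The index name IX_PingResults_Device_Time and the sample rows are illustrative assumptions. Note that picking the latest ping requires ordering the window by DateTime descending; with ascending order, seqnum = 1 would be the oldest row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Devices (ID INTEGER PRIMARY KEY, DeviceName TEXT);
    CREATE TABLE PingResults (ID INTEGER PRIMARY KEY, DeviceID INT,
                              Status INT, DateTime TEXT);
    -- Composite index matching the partition and order columns
    -- (hypothetical name).
    CREATE INDEX IX_PingResults_Device_Time
        ON PingResults (DeviceID, DateTime);
    INSERT INTO Devices VALUES (100, 'Machine 2'), (105, 'Machine 7');
    INSERT INTO PingResults (DeviceID, Status, DateTime) VALUES
        (100, 0, '2018-02-20 09:40:00.000'),
        (100, 1, '2018-02-20 09:48:09.113'),
        (105, 1, '2018-02-20 09:41:00.000'),
        (105, 0, '2018-02-20 09:48:23.763');
""")

# Number each device's pings from newest to oldest, then keep number 1.
rows = conn.execute("""
    SELECT c.ID, c.DeviceName, p.Status, p.DateTime AS LastUpdate
    FROM Devices AS c
    JOIN (SELECT DeviceID, Status, DateTime,
                 row_number() OVER (PARTITION BY DeviceID
                                    ORDER BY DateTime DESC) AS seqnum
          FROM PingResults) AS p
      ON p.DeviceID = c.ID AND p.seqnum = 1
    ORDER BY c.ID
""").fetchall()

for row in rows:
    print(row)
```

This reads PingResults once and returns one row per device; whether it beats the anti-join on SQL Server depends on the indexes and the actual plan, so it is worth measuring both.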
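Answer 2 (score: 1)
The biggest cost in your plan is the Hash Match (Inner Join).
To speed this up, index the two tables involved in that JOIN (Devices and DeviceGroups) on the columns they are joined on.
This should change the Hash Match into a nested loops join, which should be much faster than what you see in your plan.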
Answer 3 (score: 0)
In my opinion, you should use row_number:
;WITH CTE as
(
SELECT
c.ID,
c.DeviceName,
c.GroupID,
c.DeviceIP,
p1.Status,
p1.DateTime,
DeviceGroups.GroupName,
row_number() over (partition by c.ID order by p1.DateTime desc) rn
FROM
Devices AS c
INNER JOIN
PingResults AS p1
ON c.ID = p1.DeviceID
INNER JOIN
DeviceGroups
ON c.GroupID = DeviceGroups.ID
)
SELECT
ID,
DeviceName,
GroupID,
DeviceIP,
Status,
DateTime AS LastUpdate,
GroupName
FROM CTE
WHERE rn = 1