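I'm a bit rusty with SQL.
Suppose I have a table tblMachineLogs with the columns
MachineLogID
,MachineID
,LogTime (date+time)
.
The table is filled with logs from 10 machines, with MachineID values 1 to 10, and it has a lot of rows.
I want to select, for example, the last 5 log events, but for each machine.
Thanks in advance.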
Answer 0 (score: 6)
A window function can help you find the 5 most recent log events in each group (MachineID):
SELECT MachineLogID,
       MachineID,
       LogTime
FROM   (SELECT ROW_NUMBER() OVER (PARTITION BY MachineID ORDER BY LogTime DESC) AS rn,
               MachineLogID,
               MachineID,
               LogTime
        FROM   tblMachineLogs) a
WHERE  rn <= 5
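If two log rows for the same machine can share the same LogTime, which of them ends up inside the top 5 is not deterministic. A minimal sketch of one way to pin that down, assuming MachineLogID is unique per row (the question does not say so explicitly) and is acceptable as a tie-breaker:

```sql
-- Same query as above, with MachineLogID added as a tie-breaker (an assumption:
-- MachineLogID is unique), so equal LogTime values always rank in the same order.
SELECT MachineLogID,
       MachineID,
       LogTime
FROM   (SELECT ROW_NUMBER() OVER (PARTITION BY MachineID
                                  ORDER BY LogTime DESC, MachineLogID DESC) AS rn,
               MachineLogID,
               MachineID,
               LogTime
        FROM   tblMachineLogs) a
WHERE  rn <= 5
ORDER  BY MachineID, LogTime DESC, MachineLogID DESC
```

The outer ORDER BY is only there to make the output easier to read; the filter on rn is what limits each machine to 5 rows.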
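Answer 1 (score: 2)
A solution for SQL Server. I tested it on SQL Server 2008.
Imagine that MachineLogs has millions or billions of rows and there is an index on (MachineID, LogTime DESC). The ROW_NUMBER solution will scan the whole table (or only the index, but it will still be a full scan). If the index is on (MachineID, LogTime ASC), it will also do an extra, expensive sort.
On the other hand, if we have a small table Machines with 10 rows, one per MachineID, then it is possible to write a query that does 10 seeks on the index instead of scanning the whole big table.
I will create a big table MachineLogs with 1 million rows and a small table Machines with 10 rows, and test both solutions.
The Machines table will have 10 rows: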
CREATE TABLE [dbo].[Machines](
[ID] [int] NOT NULL,
CONSTRAINT [PK_Machines] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
INSERT INTO [dbo].[Machines]
([ID])
VALUES
(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)
;
The big table, with an index on ([MachineID] ASC, [LogTime] DESC):
CREATE TABLE [dbo].[MachineLogs](
[ID] [int] IDENTITY(1,1) NOT NULL,
[MachineID] [int] NOT NULL,
[LogTime] [datetime] NOT NULL,
CONSTRAINT [PK_MachineLogs] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_MachineID_LogTime] ON [dbo].[MachineLogs]
(
[MachineID] ASC,
[LogTime] DESC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
ALTER TABLE [dbo].[MachineLogs] WITH CHECK ADD CONSTRAINT [FK_MachineLogs_Machines] FOREIGN KEY([MachineID])
REFERENCES [dbo].[Machines] ([ID])
GO
ALTER TABLE [dbo].[MachineLogs] CHECK CONSTRAINT [FK_MachineLogs_Machines]
GO
Generate 1M rows:
WITH
CTE_Times
AS
(
-- generate 100,000 rows with random datetimes between 2001-01-01 and ~2004-03-01 (100,000,000 seconds)
SELECT TOP(100000)
DATEADD(second, 100000000 * (CAST(CRYPT_GEN_RANDOM(4) as int) / 4294967295.0 + 0.5), '20010101') AS LogTime
FROM
sys.all_objects AS X1
CROSS JOIN sys.all_objects AS X2
)
-- generate 1M rows
INSERT INTO dbo.MachineLogs
(MachineID
,LogTime)
SELECT
dbo.Machines.ID
,CTE_Times.LogTime
FROM
dbo.Machines
CROSS JOIN CTE_Times
;
The query with ROW_NUMBER:
WITH
CTE_rn
AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY MachineID ORDER BY LogTime DESC) AS rn
,ID
,MachineID
,LogTime
FROM MachineLogs
)
SELECT
ID
,MachineID
,LogTime
FROM CTE_rn
WHERE rn <= 5
;
The query with CROSS APPLY:
SELECT
CA.ID
,CA.MachineID
,CA.LogTime
FROM
Machines
CROSS APPLY
(
SELECT TOP(5)
MachineLogs.ID
,MachineLogs.MachineID
,MachineLogs.LogTime
FROM MachineLogs
WHERE
MachineLogs.MachineID = Machines.ID
ORDER BY LogTime DESC
) AS CA
;
You can see that the ROW_NUMBER solution does an index scan, while the CROSS APPLY solution does index seeks.
SET STATISTICS IO ON;
The ROW_NUMBER solution:
(50 row(s) affected)
Table 'MachineLogs'. Scan count 1, logical reads 2365, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
The CROSS APPLY solution:
(50 row(s) affected)
Table 'MachineLogs'. Scan count 10, logical reads 30, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Machines'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
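The CROSS APPLY approach needs a small driving set of machine IDs. If there is no Machines table, one possible variant is to supply the IDs inline with a table value constructor. This is only a sketch, assuming the IDs are known to be exactly 1 through 10 as in the question; the alias M is made up for illustration, and I have not measured this variant:

```sql
-- The inline VALUES list stands in for the Machines table (assumption: IDs are 1..10).
SELECT
    CA.ID
    ,CA.MachineID
    ,CA.LogTime
FROM
    (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) AS M(ID)
    CROSS APPLY
    (
        -- For each ID, take the 5 newest rows; with the index on
        -- (MachineID, LogTime DESC) this can be satisfied by a seek per ID.
        SELECT TOP(5)
            MachineLogs.ID
            ,MachineLogs.MachineID
            ,MachineLogs.LogTime
        FROM dbo.MachineLogs
        WHERE
            MachineLogs.MachineID = M.ID
        ORDER BY MachineLogs.LogTime DESC
    ) AS CA
;
```

A hard-coded list has to be kept in sync by hand when machines are added, which is the main reason to prefer a real Machines table.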
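Answer 2 (score: 1)
Write one query per machine that selects its top 5 rows ordered by log time descending (to get the last 5 rows), then union them all together. Below is an example for two machines; just fill in the missing 8.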
--drop table #tmp
SELECT *
into #tmp
FROM
(
select 1 as MachineLogID, 1 as MachineID , GETDATE() - 0.1 LogTime
UNION
select 2 as MachineLogID, 1 as MachineID , GETDATE()- 0.2 LogTime
UNION
select 3 as MachineLogID, 1 as MachineID , GETDATE()- 0.3 LogTime
UNION
select 4 as MachineLogID, 1 as MachineID , GETDATE()- 0.4 LogTime
UNION
select 5 as MachineLogID, 1 as MachineID , GETDATE()- 0.5 LogTime
UNION
select 6 as MachineLogID, 1 as MachineID , GETDATE() - 0.6 LogTime
UNION
select 7 as MachineLogID, 2 as MachineID , GETDATE()- 0.7 LogTime
UNION
select 8 as MachineLogID, 2 as MachineID , GETDATE() - 0.8 LogTime
UNION
select 9 as MachineLogID, 2 as MachineID , GETDATE() - 0.9 LogTime
UNION
select 10 as MachineLogID, 2 as MachineID , GETDATE() - 0.10 LogTime
UNION
select 11 as MachineLogID, 2 as MachineID , GETDATE() - 0.11 LogTime
UNION
select 12 as MachineLogID, 2 as MachineID , GETDATE() - 0.12 LogTime
) a
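SELECT *
FROM
(
-- ORDER BY is not allowed directly in a branch of a UNION,
-- so each TOP 5 ... ORDER BY is wrapped in its own derived table
SELECT *
FROM (SELECT top 5 *
      FROM #tmp
      where machineId = 1
      order by LogTime desc) m1
union all
SELECT *
FROM (SELECT top 5 *
      FROM #tmp
      where machineId = 2
      order by LogTime desc) m2
) a
order by a.machineId , a.LogTime desc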
Answer 3 (score: 0)
For simplicity, I would run a separate query for each machine.
If you are using MySQL:
SELECT MachineLogID, MachineID, LogTime FROM tblMachineLogs WHERE MachineID='str_machineid' ORDER BY LogTime DESC LIMIT 5;
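This returns the last 5 log entries for the machine whose ID is given by str_machineid. If the machine ID is a numeric column (and it should be), remove the quotes.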
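If running one query per machine is not convenient, MySQL 8.0 and later support window functions, so a single query can cover all machines. A sketch under that version assumption (it will not work on MySQL 5.x); the alias ranked is made up for illustration:

```sql
-- Requires MySQL 8.0+ (window functions). Numbers each machine's rows from
-- newest to oldest, then keeps the first 5 per machine.
SELECT MachineLogID, MachineID, LogTime
FROM (
    SELECT MachineLogID,
           MachineID,
           LogTime,
           ROW_NUMBER() OVER (PARTITION BY MachineID ORDER BY LogTime DESC) AS rn
    FROM tblMachineLogs
) AS ranked
WHERE rn <= 5
ORDER BY MachineID, LogTime DESC;
```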
Answer 4 (score: 0)
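-- TOP without ORDER BY returns arbitrary rows, so each branch sorts by LogTime first
Select * from (Select top 5 * from yourTable where machineId = 1 order by LogTime desc) m1
Union all
Select * from (Select top 5 * from yourTable where machineId = 2 order by LogTime desc) m2
Union all
.
.
.
.
Select * from (Select top 5 * from yourTable where machineId = 10 order by LogTime desc) m10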