I use SQL Server as a warehouse for analyzing log files. These log files carry a business hierarchy (workers, in this example):
Log Entry Id, Log Message
1 , Start Worker
2 , Do Cool Stuff
3 , Start Worker
4 , Do further cool stuff
5 , Start Worker
6 , This is a lot of working
7 , End worker
8 , End worker
9 , End worker
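I need to relate each log entry to its current worker. The rule is simple: once a "Start Worker" message is found, all following log entries are assigned to that worker (an "End worker" message closes the current worker and returns to the enclosing one). For the example hierarchy this means: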
Log Entry Id, Log Message , Worker
1 , Start Worker , 1 (we take the entry id as worker id)
2 , Do Cool Stuff , 1
3 , Start Worker , 3
4 , Do further cool stuff , 3
5 , Start Worker , 5
6 , This is a lot of working , 5
7 , End worker , 5
8 , End worker , 3
9 , End worker , 1
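Currently I use a stored procedure that iterates over all log entries with a cursor and, essentially, uses a stack to establish the relation between log entries and workers: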
CREATE PROCEDURE CalculateRelations
AS
BEGIN
    -- Walk the log in Id order so starts and ends reach the stack in sequence.
    DECLARE entries_cur CURSOR FOR
        SELECT Id, LogMessage
        FROM LogEntries
        ORDER BY Id;

    DECLARE @Id BIGINT;
    DECLARE @LogMessage VARCHAR(128);
    DECLARE @ParentWorker BIGINT;
    DECLARE @WorkerStack VARCHAR(MAX) = '';  -- stack of open worker Ids

    OPEN entries_cur;
    FETCH NEXT FROM entries_cur INTO @Id, @LogMessage;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Updates the stack and returns the worker that owns this entry.
        EXEC dbo.GetParentWorker @WorkerStack OUT, @Id, @LogMessage, @ParentWorker OUT;

        UPDATE LogEntries
        SET ParentWorker = @ParentWorker
        WHERE Id = @Id;

        FETCH NEXT FROM entries_cur INTO @Id, @LogMessage;
    END;

    CLOSE entries_cur;
    DEALLOCATE entries_cur;
END;
GO
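GetParentWorker is a stored procedure that uses the given VARCHAR variable WorkerStack as a stack. This means an Id is added (pushed) to the VARCHAR when a worker starts, the most recent Id is removed from the VARCHAR when a worker ends, and otherwise the most recent Id of the VARCHAR is read without modifying it.
Now I wonder whether this cursor construct can be replaced by an UPDATE statement. I am not that deep into SQL and SQL Server, but might this be achievable through dynamic variable assignment in the UPDATE, together with CASE and the use of a return value?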
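The body of GetParentWorker is not shown in the question. For readers who want to reproduce the cursor version, here is a minimal sketch of what such a stack procedure might look like. Everything in it is an assumption: the name GetParentWorker_Sketch, the comma-terminated stack format ('1,3,5,'), and the choice to return NULL for entries outside any worker. It also relies on a case-insensitive collation (the SQL Server default) so that 'End worker' in the sample matches 'End Worker'.

```sql
-- Hypothetical sketch; the real GetParentWorker is not shown in the question.
-- Assumed stack format: comma-terminated Ids, e.g. '1,3,5,' (top of stack = last Id).
CREATE PROCEDURE dbo.GetParentWorker_Sketch
    @WorkerStack  VARCHAR(MAX) OUTPUT,
    @Id           BIGINT,
    @LogMessage   VARCHAR(128),
    @ParentWorker BIGINT OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    -- Push: a starting worker becomes the current worker and owns its own row.
    IF @LogMessage = 'Start Worker'
        SET @WorkerStack = @WorkerStack + CAST(@Id AS VARCHAR(20)) + ',';

    -- Empty stack: the entry lies outside of any worker.
    IF @WorkerStack = ''
    BEGIN
        SET @ParentWorker = NULL;
        RETURN;
    END;

    -- Peek: read the last Id without modifying the stack.
    DECLARE @Body VARCHAR(MAX) = LEFT(@WorkerStack, LEN(@WorkerStack) - 1); -- drop trailing comma
    DECLARE @Pos  INT          = CHARINDEX(',', REVERSE(@Body));            -- 0 when one Id is left

    SET @ParentWorker = CAST(CASE WHEN @Pos = 0 THEN @Body
                                  ELSE RIGHT(@Body, @Pos - 1)
                             END AS BIGINT);

    -- Pop: the ending row still belongs to the worker it closes, so pop after the peek.
    IF @LogMessage = 'End Worker'
        SET @WorkerStack = CASE WHEN @Pos = 0 THEN ''
                                ELSE LEFT(@Body, LEN(@Body) - @Pos + 1)
                           END;
END;
```

The parameter order matches the positional EXEC call in CalculateRelations. Fed the nine sample rows in Id order, the sketch assigns workers 1, 1, 3, 3, 5, 5, 5, 3, 1.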
Answer 0 (score: 0)
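I think this is similar to Ian's answer, but I'll post a slightly different approach based on the indentation (nesting) level. You will almost certainly want to store the indentation level in the table, with suitable indexes, otherwise this will be slow on large tables.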
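I use a CTE to compute the indentation level: basically add one whenever we hit a start and subtract one whenever we hit an end, using a window function over the preceding rows, with a special case for an 'End Worker' on the current row so that the closing row keeps the level of the worker it ends. Outside of this toy solution, you would want to limit the preceding rows to those that have no worker assigned yet and that come after the last point where the level was zero.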
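To make the level computation concrete, here is a small self-contained sketch that runs it on the sample log. The table variable @AuditLog and the intermediate column RunningDepth are illustrative names, not part of the answer. The sample rows spell the end message as 'End Worker' so the result does not depend on collation. ROWS framing in a windowed SUM assumes SQL Server 2012 or later.

```sql
-- Illustrative sample data (table variable name is an assumption).
DECLARE @AuditLog TABLE (
    LogId      BIGINT       NOT NULL PRIMARY KEY,
    LogMessage VARCHAR(128) NOT NULL
);

INSERT INTO @AuditLog (LogId, LogMessage) VALUES
    (1, 'Start Worker'), (2, 'Do Cool Stuff'),
    (3, 'Start Worker'), (4, 'Do further cool stuff'),
    (5, 'Start Worker'), (6, 'This is a lot of working'),
    (7, 'End Worker'),   (8, 'End Worker'),
    (9, 'End Worker');

WITH Depth AS (
    SELECT
        LogId,
        LogMessage,
        -- Running count of open workers after this row: +1 per start, -1 per end.
        SUM(CASE LogMessage WHEN 'Start Worker' THEN 1
                            WHEN 'End Worker'   THEN -1
                            ELSE 0 END)
            OVER (ORDER BY LogId
                  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningDepth
    FROM @AuditLog
)
SELECT
    LogId,
    LogMessage,
    RunningDepth,
    -- An end row has already been subtracted; add 1 back so it keeps
    -- the level of the worker it closes.
    RunningDepth + CASE LogMessage WHEN 'End Worker' THEN 1 ELSE 0 END AS WorkerLevel
FROM Depth
ORDER BY LogId;
-- RunningDepth for LogId 1..9: 1, 1, 2, 2, 3, 3, 2, 1, 0
-- WorkerLevel  for LogId 1..9: 1, 1, 2, 2, 3, 3, 3, 2, 1
```

Without the correction term, the three end rows would land one level too shallow and would be matched to the wrong start row.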
Then we can simply find the previous 'Start Worker' row with the same level. These rows could be flagged during preprocessing and indexed for faster lookup.
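One way this lookup might look, as a sketch under stated assumptions: the level is already stored in a WorkerLevel column by a preprocessing step, and a filtered index (available since SQL Server 2008) covers only the start rows. The temp table #AuditLog, the index name IX_AuditLog_Start, and the precomputed levels in the sample rows are illustrative. Whether the optimizer actually picks the filtered index depends on the data and is not guaranteed.

```sql
-- Illustrative table with the nesting level already persisted (assumption).
CREATE TABLE #AuditLog (
    LogId       BIGINT       NOT NULL PRIMARY KEY,
    LogMessage  VARCHAR(128) NOT NULL,
    WorkerLevel INT          NOT NULL,
    WorkerId    BIGINT       NULL
);

-- Filtered index that contains only the start rows, keyed for the lookup below.
CREATE INDEX IX_AuditLog_Start
    ON #AuditLog (WorkerLevel, LogId)
    WHERE LogMessage = 'Start Worker';

INSERT INTO #AuditLog (LogId, LogMessage, WorkerLevel) VALUES
    (1, 'Start Worker', 1), (2, 'Do Cool Stuff', 1),
    (3, 'Start Worker', 2), (4, 'Do further cool stuff', 2),
    (5, 'Start Worker', 3), (6, 'This is a lot of working', 3),
    (7, 'End Worker', 3),   (8, 'End Worker', 2),
    (9, 'End Worker', 1);

-- For every row, take the latest start row at the same level
-- that is not later than the row itself.
UPDATE a
SET    WorkerId = (SELECT MAX(s.LogId)
                   FROM   #AuditLog AS s
                   WHERE  s.LogMessage  = 'Start Worker'
                     AND  s.WorkerLevel = a.WorkerLevel
                     AND  s.LogId      <= a.LogId)
FROM   #AuditLog AS a;

SELECT LogId, LogMessage, WorkerId
FROM   #AuditLog
ORDER BY LogId;
-- WorkerId for LogId 1..9: 1, 1, 3, 3, 5, 5, 5, 3, 1

DROP TABLE #AuditLog;
```

The correlated subquery performs one lookup per row, which is the kind of single-row access that a windowed MAX partitioned by level can avoid.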
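Update
Simplified the update statement by introducing a window-function CTE that computes the worker ID. This should reduce single-row lookups and improve the performance of the update. See SQL Fiddle.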
WITH
WorkerNestingLevel AS (
SELECT
AuditLog.LogId
, AuditLog.LogMessage
, SUM( CASE LogMessage WHEN 'Start Worker' THEN 1 WHEN 'End Worker' THEN -1 ELSE 0 END ) OVER (ORDER BY LogId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
+ CASE LogMessage WHEN 'End Worker' THEN 1 ELSE 0 END AS [WorkerLevel]
FROM
AuditLog
)
, WorkerBatch AS (
SELECT
WorkerNestingLevel.LogId
, MAX( CASE WorkerNestingLevel.LogMessage WHEN 'Start Worker' THEN WorkerNestingLevel.LogId ELSE NULL END) OVER (PARTITION BY WorkerNestingLevel.WorkerLevel ORDER BY WorkerNestingLevel.LogId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS WorkerId
FROM
WorkerNestingLevel
)
UPDATE
AuditLog
SET
WorkerId = WorkerBatch.WorkerId
FROM
AuditLog
JOIN
WorkerBatch ON (WorkerBatch.LogId = AuditLog.LogId);