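I have been reading a lot about gaps and islands, and about summarising them with CTEs or lots of subqueries. Most approaches seem to rely on clever maths with dates, which looks cool, but I don't think it will work for me.
We have many vehicle data loggers sending status updates on various schedules. I am looking for a faster, non-looping way to summarise certain statuses.
The data is usually processed at the end of a trip (ignition off). There is a LogTrips table that provides NodeId, StartTime and EndTime. At the moment I loop over the Log entries ordered by AssembledTime, looking for StatusText like %what. I do this several times, depending on the statuses the customer cares about. For example: (StatusText like '%seatbelt%' or StatusText like '%s/b%') and lm.Speed > 10, for someone driving without a seatbelt.
After some reading I can see that I can use ROW_NUMBER() to create a correctly ordered set of records, and pull out the statuses I want using CASE WHEN ... END: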
SELECT RowNumber = ROW_NUMBER() OVER(ORDER BY l.AssembledTime),
l.NodeId,
l.LogId,
l.AssembledTime,
lm.Speed,
lm.StatusText,
StatusSpeed = CASE WHEN lm.StatusText like '%speed%' THEN 1 ELSE 0 END,
StatusAccident = CASE WHEN lm.StatusText like '%accident%' THEN 1 ELSE 0 END, --impact?
StatusSeatbeltDriving = CASE WHEN (lm.StatusText like '%seatbelt%' or lm.StatusText like '%s/b%') and lm.Speed > 10 THEN 1 ELSE 0 END,
StatusSeatbeltIdle = CASE WHEN (lm.StatusText like '%seatbelt%' or lm.StatusText like '%s/b%') and lm.Speed = 0 THEN 1 ELSE 0 END,
Status4wd = CASE WHEN (lm.StatusText like '%4wd%' or lm.StatusText like '%4x4%') THEN 1 ELSE 0 END
FROM Ctrack6.dbo.Logs l
JOIN Ctrack6.dbo.LogMobiles lm on l.LogId = lm.LogId
WHERE l.NodeId = @NodeId
AND l.AssembledTime between @TripStart AND @TripEnd
This gives me a list of all the logs for the device's trip, in order:
RowNumber  NodeId  LogId      AssembledTime            Speed  StatusText    StatusSpeed  StatusAccident  StatusSeatbeltDriving  StatusSeatbeltIdle  Status4wd  IsProcessed
1          3099    308815155  2015-05-26 11:05:43.000  0      Start up      0            0               0                      0                   0          0
2          3099    308815156  2015-05-26 11:05:55.000  0      Driving       0            0               0                      0                   0          0
3          3099    308815157  2015-05-26 11:06:25.000  10     Driving       0            0               0                      0                   0          0
4          3099    308815158  2015-05-26 11:06:45.000  11     Driving       0            0               0                      0                   0          0
5          3099    308815344  2015-05-26 11:07:15.000  0      Driving       0            0               0                      0                   0          0
6          3099    308815345  2015-05-26 11:07:16.000  0      Seatbelt      0            0               0                      1                   0          0
7          3099    308815477  2015-05-26 11:07:19.000  0      Seatbelt      0            0               0                      1                   0          0
8          3099    308815479  2015-05-26 11:07:24.000  0      Seatbelt      0            0               0                      1                   0          0
9          3099    308815481  2015-05-26 11:07:29.000  0      Seatbelt      0            0               0                      1                   0          0
10         3099    308815482  2015-05-26 11:07:34.000  0      Seatbelt      0            0               0                      1                   0          0
11         3099    308815598  2015-05-26 11:07:39.000  0      Seatbelt      0            0               0                      1                   0          0
12         3099    308815599  2015-05-26 11:07:44.000  0      Seatbelt      0            0               0                      1                   0          0
13         3099    308815600  2015-05-26 11:07:49.000  0      Seatbelt      0            0               0                      1                   0          0
14         3099    308815601  2015-05-26 11:07:54.000  0      Seatbelt      0            0               0                      1                   0          0
15         3099    308815729  2015-05-26 11:08:00.000  0      Seatbelt      0            0               0                      1                   0          0
16         3099    308815730  2015-05-26 11:08:05.000  0      Seatbelt      0            0               0                      1                   0          0
17         3099    308815731  2015-05-26 11:08:10.000  0      Seatbelt      0            0               0                      1                   0          0
18         3099    308815732  2015-05-26 11:08:15.000  0      Seatbelt      0            0               0                      1                   0          0
19         3099    308816439  2015-05-26 11:08:45.000  0      Seatbelt      0            0               0                      1                   0          0
20         3099    308816440  2015-05-26 11:09:15.000  0      Seatbelt      0            0               0                      1                   0          0
21         3099    308816441  2015-05-26 11:09:45.000  0      Seatbelt      0            0               0                      1                   0          0
22         3099    308816442  2015-05-26 11:10:07.000  0      Ignition off  0            0               0                      0                   0          0
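The desired result would summarise rows 6-21.
If there were multiple islands, there would be multiple summary rows.
I just can't see what I can group by to make my islands.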
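Answer 0: (score: 0)
What do I group by?
The way I do this is to flag each change of state, then take a running total over that flag column. That gives every group its own incrementing, unique number, which you can group by.
Personally, I always load this kind of data into a temp table for processing rather than trying to use a bunch of inline subselects, for two reasons:
Also, in a temp table you can use a construct like this to "gobble up" the islands: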
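As a rough illustration of the flag-then-running-total idea, here is a minimal sketch. It assumes SQL Server 2012 or later (for LAG and a windowed SUM with ORDER BY); the table variable @t and the names RowNum, State, IsChange and IslandNo are made up for the example and are not from the question.

```sql
-- Hypothetical sample data: RowNum orders the rows, State is the value to group on.
DECLARE @t TABLE (RowNum int PRIMARY KEY, State varchar(20));
INSERT INTO @t (RowNum, State)
VALUES (1, 'Driving'), (2, 'Driving'), (3, 'Seatbelt'),
       (4, 'Seatbelt'), (5, 'Driving'), (6, 'Seatbelt');

WITH Flagged AS
(
    SELECT RowNum,
           State,
           -- 1 when the state differs from the previous row, or there is no previous row
           IsChange = CASE WHEN LAG(State) OVER (ORDER BY RowNum) = State THEN 0 ELSE 1 END
    FROM @t
),
Numbered AS
(
    SELECT RowNum,
           State,
           -- running total of the change flags: stays constant inside one island
           IslandNo = SUM(IsChange) OVER (ORDER BY RowNum ROWS UNBOUNDED PRECEDING)
    FROM Flagged
)
SELECT IslandNo,
       State,
       StartRow = MIN(RowNum),
       EndRow   = MAX(RowNum),
       LogCount = COUNT(*)
FROM Numbered
GROUP BY IslandNo, State
ORDER BY IslandNo;
```

With this sample data the query should return four islands: Driving rows 1-2, Seatbelt rows 3-4, Driving row 5 and Seatbelt row 6. The two CTEs are needed because one window function cannot be nested directly inside another.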
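SELECT 1  -- primes @@ROWCOUNT so the loop body runs at least once
WHILE @@ROWCOUNT <> 0
BEGIN
    -- Flag each unflagged row whose State matches the row just before it
    UPDATE TGT
    SET ChangeColumn = 1
    FROM YourTable TGT
    INNER JOIN YourTable PriorRow
        ON PriorRow.RowNum = TGT.RowNum - 1
    WHERE PriorRow.State = TGT.State
      AND TGT.ChangeColumn = 0
END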
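If you run this, it keeps looping until all of the state changes have been found and flagged.
Answer 1: (score: 0)
I found my solution on the following page: https://www.simple-talk.com/sql/t-sql-programming/the-sql-of-gaps-and-islands-in-sequences/
More specifically:
I added two table variables: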
declare @logs table
(
LogId int PRIMARY KEY,
RowNumberAll int,
RowNumberNode int,
NodeId int,
AssembledTime datetime,
Speed int,
StatusText varchar(200),
StatusSpeed bit,
StatusAccident bit,
StatusSeatbeltDriving bit,
StatusSeatbeltIdle bit,
Status4wd bit,
UNIQUE(NodeId, RowNumberNode),
UNIQUE(RowNumberAll)
)
declare @results table
(
EventType varchar(50),
NodeId int,
StartSeqNo int,
EndSeqNo int,
LogCount int,
UNIQUE(NodeId, StartSeqNo, EventType)
)
I added an extra column, RowNumberNode, to the query:
RowNumberNode = ROW_NUMBER() OVER(PARTITION BY NodeId ORDER BY l.AssembledTime),
I modified the example from the article slightly to work with my code. There is one block like this for each status I want summarised:
INSERT INTO @results (EventType, NodeId, StartSeqNo, EndSeqNo, LogCount)
SELECT 'Speed',
NodeId,
StartSeqNo=MIN(RowNumberNode),
EndSeqNo=MAX(RowNumberNode),
LogCount=MAX(RowNumberNode) - MIN(RowNumberNode) + 1
FROM
(
SELECT NodeId,
RowNumberNode,
rn=RowNumberNode-ROW_NUMBER() OVER (PARTITION BY NodeId ORDER BY RowNumberNode)
FROM @logs
WHERE StatusSpeed=1
) a
GROUP BY NodeId, rn
--HAVING MIN(RowNumberNode) - MAX(RowNumberNode) > 0
ORDER BY NodeId, StartSeqNo;
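To see why grouping by rn isolates each island, it may help to look at the intermediate values on a tiny data set. The sketch below is only an illustration: the table variable @demo, its eight rows and the FilteredRowNo column are invented for the example, and StatusSeatbeltIdle stands in for whichever status flag is being summarised.

```sql
-- Hypothetical miniature of @logs: one node, eight ordered rows, two runs of the flag.
DECLARE @demo TABLE (NodeId int, RowNumberNode int, StatusSeatbeltIdle bit);
INSERT INTO @demo (NodeId, RowNumberNode, StatusSeatbeltIdle)
VALUES (3099, 1, 0), (3099, 2, 1), (3099, 3, 1), (3099, 4, 1),
       (3099, 5, 0), (3099, 6, 0), (3099, 7, 1), (3099, 8, 1);

SELECT NodeId,
       RowNumberNode,
       -- position of the row among the flagged rows only
       FilteredRowNo = ROW_NUMBER() OVER (PARTITION BY NodeId ORDER BY RowNumberNode),
       -- the difference stays constant while flagged rows are consecutive
       rn = RowNumberNode - ROW_NUMBER() OVER (PARTITION BY NodeId ORDER BY RowNumberNode)
FROM @demo
WHERE StatusSeatbeltIdle = 1
ORDER BY NodeId, RowNumberNode;
```

Rows 2, 3 and 4 are numbered 1, 2 and 3 after filtering, so rn is 1 for all of them; rows 7 and 8 are numbered 4 and 5, so rn is 3. Every gap in the flagged rows increases the difference, which is why GROUP BY NodeId, rn yields one row per island.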