I have a table that records people entering and leaving through a door.
DECLARE @doorStatistics TABLE
( id INT IDENTITY,
[user] VARCHAR(250),
accessDate DATETIME,
accessType VARCHAR(5)
)
Sample records:
INSERT INTO @doorStatistics([user],accessDate,accessType) VALUES ('John Wayne','2009-09-01 07:02:43.000','IN')
INSERT INTO @doorStatistics([user],accessDate,accessType) VALUES ('Bruce Willis','2009-09-01 07:12:43.000','IN')
INSERT INTO @doorStatistics([user],accessDate,accessType) VALUES ('Bruce Willis','2009-09-01 07:22:43.000','OUT')
INSERT INTO @doorStatistics([user],accessDate,accessType) VALUES ('John Wayne','2009-09-01 07:32:43.000','OUT')
INSERT INTO @doorStatistics([user],accessDate,accessType) VALUES ('John Wayne','2009-09-01 07:37:43.000','IN')
INSERT INTO @doorStatistics([user],accessDate,accessType) VALUES ('Bruce Willis','2009-09-01 07:42:43.000','IN')
INSERT INTO @doorStatistics([user],accessDate,accessType) VALUES ('John Wayne','2009-09-01 07:48:43.000','OUT')
INSERT INTO @doorStatistics([user],accessDate,accessType) VALUES ('Bruce Willis','2009-09-01 07:52:43.000','OUT')
What I want is a query that gives the following result (based on the example above):
| user         | date       | inHour   | outHour  |
|--------------|------------|----------|----------|
| John Wayne   | 2009-09-01 | 07:02:43 | 07:32:43 |
| Bruce Willis | 2009-09-01 | 07:12:43 | 07:22:43 |
| John Wayne   | 2009-09-01 | 07:37:43 | 07:48:43 |
| Bruce Willis | 2009-09-01 | 07:42:43 | 07:52:43 |
The query I wrote is:
SELECT [user], accessDate AS [in date],
(SELECT MIN(accessDate)
FROM @doorStatistics ds2
WHERE accessType = 'OUT'
AND ds2.accessDate > ds.accessDate
AND ds.[user] = ds2.[user]) AS [out date]
FROM @doorStatistics ds
WHERE accessType = 'IN'
But it is not good, because when a user forgets to register his or her entry, it produces something like this:
| user | date | inHour | outHour |
|--------------|------------|----------|----------|
| John Wayne | 2009-09-02 | 07:02:43 | 07:48:43 |
| John Wayne | 2009-09-02 | 07:02:43 | 09:26:43 |
whereas it should be
| user | date | inHour | outHour |
|--------------|------------|----------|----------|
| John Wayne | 2009-09-02 | 07:02:43 | 07:48:43 |
| John Wayne | 2009-09-02 | NULL | 09:26:43 |
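The second reason the query is not good is performance. I have over 200,000 records, and running a SELECT for every row slows the query down.
A possible solution might be to join the two result sets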
SELECT * FROM @doorStatistics WHERE accessType = 'IN'
and
SELECT * FROM @doorStatistics WHERE accessType = 'OUT'
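but I don't know what join condition would pair up the right dates. Maybe some MAX or MIN function could go there, but I don't know.
I don't want to create a temporary table and use a cursor.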
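Answer 0 (score: 1)
When designing a database for temporal events that have a duration, it is better to put the "IN" time and the "OUT" time on the same row.
All the queries you need then become very easy.
See pages 48 and 154 of "Joe Celko's SQL Programming Style", which discuss temporal cohesion.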
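As a rough sketch of that design, each visit could be stored as one row, with NULL marking a scan that was never registered. The table name @doorVisits and the columns inTime and outTime are hypothetical; they do not appear in the question.

```sql
-- Hypothetical one-row-per-visit design; all names here are illustrative.
DECLARE @doorVisits TABLE
( id INT IDENTITY,
  [user] VARCHAR(250),
  inTime DATETIME NULL,   -- NULL when the entry was not registered
  outTime DATETIME NULL   -- NULL when the exit was not registered
);

INSERT INTO @doorVisits([user], inTime, outTime)
VALUES ('John Wayne', '2009-09-02T07:02:43', '2009-09-02T07:48:43');
INSERT INTO @doorVisits([user], inTime, outTime)
VALUES ('John Wayne', NULL, '2009-09-02T09:26:43');

-- The report is now a plain SELECT, with no self-join or correlated subquery.
SELECT [user],
       CONVERT(CHAR(10), COALESCE(inTime, outTime), 120) AS [date],
       CONVERT(CHAR(8), inTime, 108) AS inHour,
       CONVERT(CHAR(8), outTime, 108) AS outHour
FROM @doorVisits
ORDER BY [user], COALESCE(inTime, outTime);
```

The trade-off is that pairing an OUT scan with its IN scan has to happen when the row is written rather than when the report is run.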
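Answer 1 (score: 1)
To improve performance:
- rename the accessDate column to accessDateTime
- create a PERSISTED computed column accessDate from accessDateTime (as shown below). The index you need then only has to include the accessDate column, which you will use together with user.
accessDate column definition: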
accessDate AS CONVERT(SMALLDATETIME, CONVERT(CHAR(8), accessDateTime, 112), 112) PERSISTED
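Put together, those steps might look roughly like the script below. It assumes the data lives in a permanent table called dbo.doorStatistics (a table variable cannot be altered), and both that table name and the index name are hypothetical.

```sql
-- Assumption: a permanent table dbo.doorStatistics with the question's columns.
-- Step 1: rename the existing DATETIME column.
EXEC sp_rename 'dbo.doorStatistics.accessDate', 'accessDateTime', 'COLUMN';
GO

-- Step 2: add the date-only computed column, stored on disk (PERSISTED).
ALTER TABLE dbo.doorStatistics
  ADD accessDate AS CONVERT(SMALLDATETIME, CONVERT(CHAR(8), accessDateTime, 112), 112) PERSISTED;
GO

-- Step 3: index the columns used to match rows of the same user and day.
-- The index name is illustrative.
CREATE INDEX IX_doorStatistics_user_accessDate
  ON dbo.doorStatistics ([user], accessDate);
GO
```

GO is the batch separator used by SSMS and sqlcmd; it keeps each step in its own batch so that later statements see the renamed and added columns.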
Now, given that you have done that and you are on SQL Server 2005 or later, this rather long query should do the job:
WITH MatchIN (in_id, out_id)
AS (SELECT s.id, CASE WHEN COALESCE(y.id, s.id) = s.id THEN x.id ELSE NULL END
FROM @doorStatistics s
LEFT JOIN @doorStatistics x
ON x.id = (SELECT TOP 1 z.id
FROM @doorStatistics z
WHERE z."user" = s."user"
AND z.accessType = 'OUT'
AND z.accessDate = s.accessDate
AND z.accessDateTime >= s.accessDateTime
ORDER BY z.accessDateTime ASC
)
LEFT JOIN @doorStatistics y
ON y.id = (SELECT TOP 1 z.id
FROM @doorStatistics z
WHERE z."user" = s."user"
AND z.accessType = 'IN'
AND z.accessDate = s.accessDate
AND z.accessDateTime >= s.accessDateTime
AND z.accessDateTime <= x.accessDateTime
ORDER BY z.accessDateTime DESC
)
WHERE s.accessType = 'IN'
)
, MatchOUT (out_id, in_id)
AS (SELECT s.id, CASE WHEN COALESCE(y.id, s.id) = s.id THEN x.id ELSE NULL END
FROM @doorStatistics s
LEFT JOIN @doorStatistics x
ON x.id = (SELECT TOP 1 z.id
FROM @doorStatistics z
WHERE z."user" = s."user"
AND z.accessType = 'IN'
AND z.accessDate = s.accessDate
AND z.accessDateTime <= s.accessDateTime
ORDER BY z.accessDateTime DESC
)
LEFT JOIN @doorStatistics y
ON y.id = (SELECT TOP 1 z.id
FROM @doorStatistics z
WHERE z."user" = s."user"
AND z.accessType = 'OUT'
AND z.accessDate = s.accessDate
AND z.accessDateTime <= s.accessDateTime
AND z.accessDateTime >= x.accessDateTime
ORDER BY z.accessDateTime ASC
)
WHERE s.accessType = 'OUT'
)
SELECT COALESCE(i."user", o."user") AS "user",
COALESCE(i.accessDate, o.accessDate) AS "date",
CONVERT(CHAR(10), i.accessDateTime, 108) AS "inHour",
CONVERT(CHAR(10), o.accessDateTime, 108) AS "outHour"
FROM (SELECT in_id, out_id FROM MatchIN
UNION -- UNION (rather than UNION ALL) also eliminates duplicate pairs
SELECT in_id, out_id FROM MatchOUT
) x
LEFT JOIN @doorStatistics i
ON i.id = x.in_id
LEFT JOIN @doorStatistics o
ON o.id = x.out_id
ORDER BY "user", "date", "inHour"
To test how missing rows are handled, just comment out some of the INSERT statements for the test data.
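Answer 2 (score: 1)
For each IN record of a given user, you need to select the minimum OUT record while making sure there is no intervening IN record (which would correspond to someone coming IN twice without leaving the building). This requires some moderately tricky SQL (a NOT EXISTS clause, for example). So you will do a self-join on the table, plus a NOT EXISTS sub-query on the same table. Just make sure you explicitly alias every reference to the table.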
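One way that self-join could be written is sketched below, using the question's original column names. It is an interpretation rather than the answerer's own query: the NOT EXISTS rejects any record of the same user between the IN and the OUT, which covers both the "minimum OUT" and the "no intervening IN" conditions. The aliases i, o and m are arbitrary.

```sql
DECLARE @doorStatistics TABLE
( id INT IDENTITY,
  [user] VARCHAR(250),
  accessDate DATETIME,
  accessType VARCHAR(5)
);

-- Test data: John Wayne comes IN twice without an OUT in between.
INSERT INTO @doorStatistics([user],accessDate,accessType) VALUES ('John Wayne','2009-09-01T07:02:43','IN');
INSERT INTO @doorStatistics([user],accessDate,accessType) VALUES ('John Wayne','2009-09-01T07:37:43','IN');
INSERT INTO @doorStatistics([user],accessDate,accessType) VALUES ('John Wayne','2009-09-01T07:48:43','OUT');

SELECT i.[user],
       i.accessDate AS [in date],
       o.accessDate AS [out date]
FROM @doorStatistics AS i
LEFT JOIN @doorStatistics AS o
       ON o.[user] = i.[user]
      AND o.accessType = 'OUT'
      AND o.accessDate > i.accessDate
      -- no other record of this user strictly between the IN and the OUT
      AND NOT EXISTS (SELECT 1
                      FROM @doorStatistics AS m
                      WHERE m.[user] = i.[user]
                        AND m.accessDate > i.accessDate
                        AND m.accessDate < o.accessDate)
WHERE i.accessType = 'IN'
ORDER BY i.[user], i.accessDate;
```

With this data the 07:02:43 entry should come back with a NULL [out date], and the 07:37:43 entry should pair with 07:48:43. An OUT record that has no matching IN is not returned by this query; reporting those rows would need a second, mirrored query driven from the OUT records.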