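I'm trying to work with data in a table that looks somewhat incomplete to me, and I can't figure out how to approach the problem, how to even frame the question, or whether what I'm attempting is possible in SQL at all. Here is a hypothetical representation of the data I'm working with (entered as CSV because this text field does not support table formatting):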
Date,Time,Traveler,Source,Destination,Travel Status
9/20/2014,1:00pm,James,Station A,Station B,Scheduled
9/20/2014,1:10pm,James,Station A,Station B,Traveling
9/20/2014,1:40pm,James,,Station B,Arrived
9/20/2014,1:00pm,Ann,Station B,Station A,Scheduled
9/20/2014,1:10pm,Ann,Station B,Station A,Traveling
9/20/2014,1:40pm,Ann,,Station A,Arrived
9/20/2014,1:00pm,Karl,Station A,Station B,Scheduled
9/20/2014,1:10pm,Karl,Station A,Station B,Traveling
9/20/2014,1:40pm,Karl,,Station B,Arrived
9/20/2014,1:00pm,Joyce,Station B,Station A,Scheduled
9/20/2014,1:10pm,Joyce,Station B,Station A,Traveling
9/20/2014,1:40pm,Joyce,,Station A,Arrived
9/20/2014,1:00pm,Kelly,Station B,Station B,Scheduled
9/20/2014,1:10pm,Kelly,Station B,Station B,Traveling
9/20/2014,1:40pm,Kelly,,Station B,Arrived
9/20/2014,1:00pm,Sam,Station A,Station A,Scheduled
9/20/2014,1:10pm,Sam,Station A,Station A,Traveling
9/20/2014,1:40pm,Sam,,Station A,Arrived
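I'm trying to count how many arrivals of each "type" we have: how many A -> A arrivals, how many B -> B, how many A -> B, and how many B -> A.
If the data looked like this: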
Date,Time,Traveler,Source,Destination,Travel Status
9/20/2014,1:00pm,James,Station A,Station B,Scheduled
9/20/2014,1:10pm,James,Station A,Station B,Traveling
9/20/2014,1:40pm,James,Station A,Station B,Arrived
9/20/2014,1:00pm,Ann,Station B,Station A,Scheduled
9/20/2014,1:10pm,Ann,Station B,Station A,Traveling
9/20/2014,1:40pm,Ann,Station B,Station A,Arrived
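then this simple query would do it for each arrival type, e.g. for type A -> B:
SELECT COUNT(*) FROM TRAVEL_TBL WHERE
[Travel Status] = 'Arrived' AND Source = 'Station A'
AND Destination = 'Station B';
But the Source field is missing from the records marked "Arrived", so how can I write a query to get the counts? I suppose the only way is to go through each traveler's records in chronological order, keeping track of when the trip was scheduled and whether the traveler arrived, and incrementing the count on that basis. Can this be done in SQL, or would you have to write an application in Java, PHP, or some other host language to perform the logic?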
Answer 0 (score: 2)
One solution, compatible with MS SQL Server 2012+, is to use the LAG() function to access the previous row:
SELECT COUNT(*) AS "Count A-B"
FROM (
    SELECT
        Date, Time, Traveler,
        -- An 'Arrived' row has no Source, so take it from the traveler's previous row.
        -- Order by Time: Date is already a partition key and cannot order rows within it.
        CASE
            WHEN Source IS NULL THEN LAG(Source, 1) OVER (PARTITION BY Date, Traveler ORDER BY Time)
            ELSE Source
        END AS Source,
        Destination,
        [Travel Status]
    FROM TRAVEL_TBL
) derived_table
WHERE [Travel Status] = 'Arrived' AND Source = 'Station A' AND Destination = 'Station B';
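The query above counts a single route type. As a rough sketch of how the same LAG() idea could return all four counts at once, the snippet below loads the question's sample rows into an in-memory SQLite database and groups by the filled-in Source and Destination. Several things here are assumptions rather than part of the original answer: SQLite (3.25 or newer, for window functions) stands in for SQL Server, the empty Source is stored as NULL, dates and times are rewritten as ISO text so they sort chronologically, and the helper names `build_table` and `count_arrivals_by_route` are made up for illustration.

```python
import sqlite3

# Trips from the question's sample data: (traveler, source, destination).
TRIPS = [
    ("James", "Station A", "Station B"),
    ("Ann", "Station B", "Station A"),
    ("Karl", "Station A", "Station B"),
    ("Joyce", "Station B", "Station A"),
    ("Kelly", "Station B", "Station B"),
    ("Sam", "Station A", "Station A"),
]


def build_table(conn):
    """Create TRAVEL_TBL and load three status rows per trip."""
    conn.execute(
        "CREATE TABLE TRAVEL_TBL (Date TEXT, Time TEXT, Traveler TEXT, "
        "Source TEXT, Destination TEXT, [Travel Status] TEXT)"
    )
    for traveler, src, dst in TRIPS:
        conn.executemany(
            "INSERT INTO TRAVEL_TBL VALUES (?, ?, ?, ?, ?, ?)",
            [
                ("2014-09-20", "13:00", traveler, src, dst, "Scheduled"),
                ("2014-09-20", "13:10", traveler, src, dst, "Traveling"),
                # The 'Arrived' row has no Source; it is stored as NULL here.
                ("2014-09-20", "13:40", traveler, None, dst, "Arrived"),
            ],
        )


def count_arrivals_by_route(conn):
    """Return (source, destination, arrivals) rows, one per route type."""
    return conn.execute(
        """
        SELECT Source, Destination, COUNT(*) AS Arrivals
        FROM (
            SELECT Destination, [Travel Status],
                   -- Fall back to the same traveler's previous row when Source is NULL.
                   COALESCE(Source,
                            LAG(Source) OVER (PARTITION BY Date, Traveler
                                              ORDER BY Time)) AS Source
            FROM TRAVEL_TBL
        ) AS filled
        WHERE [Travel Status] = 'Arrived'
        GROUP BY Source, Destination
        ORDER BY Source, Destination
        """
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_table(conn)
    for source, destination, arrivals in count_arrivals_by_route(conn):
        print(source, "->", destination, ":", arrivals)
```

Grouping avoids running one query per route type; the inner SELECT is the same fill-from-previous-row step as in the query above, written with COALESCE instead of CASE.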
Or a more general version that uses ROW_NUMBER() in a CTE with a self-join (features that should be available in most major databases):
;WITH cte AS (
    SELECT
        Date, Time, Traveler,
        -- Number the rows so that each traveler's records are consecutive and chronological.
        ROW_NUMBER() OVER (ORDER BY Traveler, Date, Time) rn,
        Source,
        Destination,
        [Travel Status]
    FROM TRAVEL_TBL
)
SELECT COUNT(*) AS "Count A-B"
FROM (
    SELECT
        c.Date, c.Time, c.Traveler,
        CASE
            WHEN c.Source IS NULL THEN c2.Source
            ELSE c.Source
        END AS Source,
        c.Destination,
        c.[Travel Status]
    FROM cte c
    -- Join each row to the previous row of the same traveler only.
    LEFT JOIN cte c2 ON c.rn = c2.rn + 1 AND c.Traveler = c2.Traveler
) derived_table
WHERE [Travel Status] = 'Arrived' AND Source = 'Station A' AND Destination = 'Station B';
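Both queries above rely on window functions. For a database that lacks them, one possible alternative is a correlated subquery that looks up the traveler's most recent earlier row with a non-NULL Source. The sketch below tries this against an in-memory SQLite database; it is an illustration under assumptions, not part of the original answer: the `LIMIT 1` syntax is SQLite/MySQL/PostgreSQL style (SQL Server would use `TOP 1`), times are stored as 24-hour text so that `<` compares them chronologically, and `load_sample` and `count_route` are hypothetical helper names.

```python
import sqlite3


def load_sample(conn):
    """Load a reduced version of the question's data (three travelers)."""
    conn.execute(
        "CREATE TABLE TRAVEL_TBL (Date TEXT, Time TEXT, Traveler TEXT, "
        "Source TEXT, Destination TEXT, [Travel Status] TEXT)"
    )
    trips = [
        ("James", "Station A", "Station B"),
        ("Ann", "Station B", "Station A"),
        ("Karl", "Station A", "Station B"),
    ]
    for traveler, src, dst in trips:
        conn.executemany(
            "INSERT INTO TRAVEL_TBL VALUES (?, ?, ?, ?, ?, ?)",
            [
                ("2014-09-20", "13:00", traveler, src, dst, "Scheduled"),
                ("2014-09-20", "13:10", traveler, src, dst, "Traveling"),
                # Source is missing on arrival, stored as NULL.
                ("2014-09-20", "13:40", traveler, None, dst, "Arrived"),
            ],
        )


def count_route(conn, source, destination):
    """Count 'Arrived' rows for one route, recovering a missing Source."""
    row = conn.execute(
        """
        SELECT COUNT(*)
        FROM TRAVEL_TBL AS a
        WHERE a.[Travel Status] = 'Arrived'
          AND a.Destination = ?
          AND COALESCE(
                a.Source,
                -- Latest earlier row of the same traveler that has a Source.
                (SELECT p.Source
                 FROM TRAVEL_TBL AS p
                 WHERE p.Traveler = a.Traveler
                   AND p.Date = a.Date
                   AND p.Time < a.Time
                   AND p.Source IS NOT NULL
                 ORDER BY p.Time DESC
                 LIMIT 1)
              ) = ?
        """,
        (destination, source),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_sample(conn)
    print("A -> B arrivals:", count_route(conn, "Station A", "Station B"))
```

Unlike the row-number join, this lookup does not assume the row with the Source is immediately before the 'Arrived' row, at the cost of one subquery per arrival.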