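I have a table with a date column and a column that flags whether each date is the first in a series of "connected" (consecutive) dates. Example: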
╔═══════════╦════════════╦═══════╗
║ person_id ║ DATE ║ FIRST ║
╠═══════════╬════════════╬═══════╣
║ 1 ║ 2013-05-31 ║ 1 ║
║ 1 ║ 2013-06-01 ║ 0 ║
║ 1 ║ 2013-06-02 ║ 0 ║
║ 15 ║ 2013-07-08 ║ 1 ║
║ 15 ║ 2013-07-09 ║ 0 ║
║ 1 ║ 2013-07-30 ║ 1 ║
║ 1 ║ 2013-07-31 ║ 0 ║
║ 1 ║ 2013-08-01 ║ 0 ║
╚═══════════╩════════════╩═══════╝
I need a new table with columns for the start date and end date of each series. For example:
╔═══════════╦════════════╦════════════╗
║ person_id ║ START_DATE ║ END_DATE ║
╠═══════════╬════════════╬════════════╣
║ 1 ║ 2013-05-31 ║ 2013-06-02 ║
║ 15 ║ 2013-07-08 ║ 2013-07-09 ║
║ 1 ║ 2013-07-30 ║ 2013-08-01 ║
╚═══════════╩════════════╩════════════╝
Is this possible without a while loop? I tried a while loop, but it gets slow. The table has about 100,000 records.
The loop I tried looks like this:
IF EXISTS (SELECT * FROM sysobjects WHERE id = object_id('dbo.temp_table'))
drop table temp_table;
go
SELECT
[person_id],
[date],
[first],
0 AS Processed,
N = ROW_NUMBER() OVER (ORDER BY person_id, date)
INTO temp_table
FROM [person_dates]
ORDER BY person_id, date
go
declare @N int
declare @N2 int
declare @P_ID int
declare @DATE varchar(10)
declare @DATE2 varchar(10)
declare @start_date datetime
declare @end_date datetime
While (Select Count(*) From temp_table Where Processed = 0 AND first=1) > 0
Begin
Select TOP 1 @N=N,@P_ID=person_id, @DATE=date From temp_table Where Processed = 0 AND first=1 ORDER BY N
set @start_date = CAST(@DATE as datetime)
set @DATE2=@DATE
while (SELECT COUNT(*) FROM temp_table Where Processed = 0 AND first<>1 and
CAST(date as datetime) = dateadd(day,1,CAST(@DATE2 as datetime)) and person_id=@P_ID) > 0
Begin
Select @N2=N,@DATE2=date From temp_table Where Processed = 0 AND first<>1 and
CAST(date as datetime) = dateadd(day,1,CAST(@DATE2 as datetime)) and person_id=@P_ID ORDER BY N
Update temp_table Set Processed = 1 Where N = @N2
End
set @end_date=CAST(@DATE2 as datetime)
Update temp_table Set Processed = 1 Where N = @N
End
go
IF EXISTS (SELECT * FROM sysobjects WHERE id = object_id('dbo.temp_table'))
drop table temp_table;
go
Answer 0 (score: 1)
You can do this with a single SQL statement, using self-joins:
Select distinct s.person_id, s.Date startDate,
       e.Date endDate
From person_dates s
   Left Join person_dates n  -- find next first if one exists
       On n.person_id = s.person_id
          And n.First = 1
          And n.Date =
              (Select Min(date) from person_dates
               Where person_id = s.person_id
                  And First = 1
                  And date > s.Date)
   Join person_dates e  -- find last row before next first
       On e.person_id = s.person_id
          And e.Date =
              (Select Max(date) from person_dates
               where person_id = s.person_id
                  And date >= s.Date  -- >= so a one-day series is kept
                  And date < Coalesce(n.Date, DateAdd(day, 1, date)))
Where s.First = 1
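Answer 1 (score: 1)
Here is a simple observation. If you take a cumulative sum of the "first" column (per person, ordered by date), you get a column that identifies each group.
In some databases, you can compute a cumulative sum with window/analytic functions. In others, you need a correlated subquery: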
select person_id, min(date) as start_date, max(date) as end_date
from (select pd.*,
(select sum(first)
from person_dates pd2
where pd2.person_id = pd.person_id and
pd2.date <= pd.date
) as cumfirst
from person_dates pd
) pd
group by person_id, cumfirst;
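To see why the running sum labels each series, here is a minimal Python sketch (not part of the original answer) that applies the same idea to the sample rows from the question. The function name `series_ranges` and the tuple layout are assumptions made for illustration; dates are kept as ISO strings, which sort chronologically.

```python
from itertools import accumulate, groupby

# Sample rows from the question: (person_id, date, first)
rows = [
    (1, "2013-05-31", 1),
    (1, "2013-06-01", 0),
    (1, "2013-06-02", 0),
    (15, "2013-07-08", 1),
    (15, "2013-07-09", 0),
    (1, "2013-07-30", 1),
    (1, "2013-07-31", 0),
    (1, "2013-08-01", 0),
]


def series_ranges(rows):
    """Return (person_id, start_date, end_date) for each series (hypothetical helper)."""
    result = []
    # Order by person, then by date, like "partition by person_id order by date".
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for person_id, person_rows in groupby(ordered, key=lambda r: r[0]):
        person_rows = list(person_rows)
        # Running sum of the "first" flag: it stays constant inside a series
        # and increases by one at each row that starts a new series.
        cumfirst = list(accumulate(r[2] for r in person_rows))
        # Rows sharing the same running sum belong to the same series.
        for _, members in groupby(zip(cumfirst, person_rows), key=lambda p: p[0]):
            dates = [row[1] for _, row in members]
            result.append((person_id, min(dates), max(dates)))
    return result


ranges = series_ranges(rows)
for r in ranges:
    print(r)
```

For the sample data, person 1 gets the running sums 1, 1, 1, 2, 2, 2, so the two series for that person land in separate groups even though they share a person_id.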
Using the ANSI-standard cumulative sum syntax, you can write this as:
select person_id, min(date) as start_date, max(date) as end_date
from (select pd.*,
sum(first) over (partition by person_id order by date) as cumFirst
from person_dates pd
) pd
group by person_id, cumfirst;
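As a quick check, both queries can be run against the sample data from the question. The sketch below uses Python's built-in sqlite3 module purely as a test harness, which is an assumption of this example (the question targets SQL Server). It also assumes the bundled SQLite is version 3.25 or later, since that release added window functions. The `order by start_date` clause is added only to make the output order deterministic, and "first" is double-quoted as a precaution because it is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE person_dates (person_id INTEGER, date TEXT, "first" INTEGER)'
)
conn.executemany(
    "INSERT INTO person_dates VALUES (?, ?, ?)",
    [
        (1, "2013-05-31", 1),
        (1, "2013-06-01", 0),
        (1, "2013-06-02", 0),
        (15, "2013-07-08", 1),
        (15, "2013-07-09", 0),
        (1, "2013-07-30", 1),
        (1, "2013-07-31", 0),
        (1, "2013-08-01", 0),
    ],
)

# Correlated-subquery form of the cumulative sum.
subquery_sql = """
select person_id, min(date) as start_date, max(date) as end_date
from (select pd.*,
             (select sum("first")
              from person_dates pd2
              where pd2.person_id = pd.person_id and
                    pd2.date <= pd.date
             ) as cumfirst
      from person_dates pd
     ) pd
group by person_id, cumfirst
order by start_date
"""

# Window-function form of the cumulative sum (needs SQLite 3.25+).
window_sql = """
select person_id, min(date) as start_date, max(date) as end_date
from (select pd.*,
             sum("first") over (partition by person_id order by date) as cumfirst
      from person_dates pd
     ) pd
group by person_id, cumfirst
order by start_date
"""

subquery_result = conn.execute(subquery_sql).fetchall()
window_result = conn.execute(window_sql).fetchall()

for row in window_result:
    print(row)
```

Both forms should produce the same three rows as the desired output table in the question. On a table of about 100,000 rows, the window-function form usually reads the table once, while the correlated subquery may rescan rows for each person unless an index on (person_id, date) is available.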