Splitting rows with overlapping date ranges in SQL Server 2014

Time: 2017-06-06 14:40:37

Tags: sql sql-server sql-server-2014

I have a SQL problem. I have the following table:

declare @t table (START_DATE datetime,
                  END_DATE datetime, 
                  GROSS_SALES_PRICE decimal(10,2)
                 );

insert into @t 
values ('2014-08-06 00:00:00.000', '2014-10-06 23:59:59.000', 29.99),
       ('2014-09-06 00:00:00.000', '2014-09-09 23:59:59.000', 32.99),
       ('2014-09-10 00:00:00.000', '2014-09-30 23:59:59.000', 32.99),
       ('2014-10-07 00:00:00.000', '2049-12-31 23:59:59.000', 34.99)

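I want to split the overlapping date ranges. For example, the first row has START_DATE 2014-08-06 and END_DATE 2014-10-06. The date ranges of the second and third rows fall inside the first row's period.

So I want to split them as follows: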

declare @t2 table (START_DATE datetime,
                   END_DATE datetime, 
                   GROSS_SALES_PRICE decimal(10,2)
                  );

insert into @t2 
values ('2014-08-06 00:00:00.000', '2014-09-05 23:59:59.000', 29.99),
       ('2014-09-06 00:00:00.000', '2014-09-09 23:59:59.000', 32.99),
       ('2014-09-10 00:00:00.000', '2014-09-30 23:59:59.000', 32.99),
       ('2014-10-01 00:00:00.000', '2014-10-06 23:59:59.000', 29.99),
       ('2014-10-07 00:00:00.000', '2049-12-31 23:59:59.000', 34.99)

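So the second and third rows stay unchanged. The first row should get a new END_DATE, and we also get a new row. GROSS_SALES_PRICE should be kept for the inner periods. Thanks for any help. I am using SQL Server 2014.

5 Answers:

Answer 0: (score: 3)

A calendar/dates table would simplify this, but we can also generate a temporary dates table inside the query using a common table expression.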

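From there, we can treat this as a gaps-and-islands style problem. Using the dates table and outer apply() to get the latest values of start_date and gross_sales_price for each day, we can use two row_number()s to identify the groups to re-aggregate. The first is ordered by date; from it we subtract the second, which is partitioned by the latest start_date and ordered by date.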
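The grouping trick can be traced outside SQL. The following Python sketch is only an illustration and not part of the answer's query: `island_groups` and the sample `rows` are made-up names, each row stands for one day, and `key` plays the role of the latest start_date covering that day.

```python
from collections import defaultdict

def island_groups(rows):
    """rows: list of (day, key) sorted by day. Returns a list of (day, key, grp)."""
    seen = defaultdict(int)  # per-key counter, mimics row_number() partitioned by key
    out = []
    for overall, (day, key) in enumerate(rows, start=1):  # overall mimics row_number() over all days
        seen[key] += 1
        out.append((day, key, overall - seen[key]))
    return out

# Key "A" is interrupted by key "B" and then resumes, like the 29.99 row in the question.
rows = [(1, "A"), (2, "A"), (3, "B"), (4, "B"), (5, "A"), (6, "A")]
groups = island_groups(rows)

# Re-aggregate: min/max day per (key, grp), like GROUP BY with min() and max().
islands = {}
for day, key, grp in groups:
    lo, hi = islands.get((key, grp), (day, day))
    islands[(key, grp)] = (min(lo, day), max(hi, day))

for (key, grp), (lo, hi) in sorted(islands.items(), key=lambda kv: kv[1]):
    print(key, grp, lo, hi)
```

Within one uninterrupted run of a key both counters advance together, so their difference stays constant; after an interruption the difference changes. The difference alone is not unique across keys (here "B" and the second "A" run share the same value), which is why the grouping uses it together with a second column.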

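Then you can either dump the results of the common table expression src into a temporary table and use that for the inserts/deletes, or use merge with src directly. Note that the code below works on a temporary table #t rather than the table variable @t from the question.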

/* -- dates --*/
declare @fromdate datetime, @thrudate datetime;
select  @fromdate = min(start_date), @thrudate = max(end_date) from #t;
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, dates as (
  select top (datediff(day, @fromdate, @thrudate)+1) 
      [Date]=convert(datetime,dateadd(day,row_number() over(order by (select 1))-1,@fromdate))
    , [End_Date]=convert(datetime,dateadd(millisecond,-3,dateadd(day,row_number() over(order by (select 1)),@fromdate)))
  from n as deka cross join n as hecto cross join n as kilo
                 cross join n as tenK cross join n as hundredK
   order by [Date]
)
/* -- islands -- */
, cte as (
select 
    start_date = d.date
  , end_date   = d.end_date
  , x.gross_sales_price
  , grp = row_number() over (order by d.date)
        - row_number() over (partition by x.start_date order by d.date)
from dates d
  outer apply (
    select top 1 l.start_date, l.gross_sales_price
    from #t l
    where d.date >= l.start_date
      and d.date <= l.end_date
    order by l.start_date desc
    ) x
)
/* -- aggregated islands -- */
, src as (
select 
    start_date = min(start_date)
  , end_date   = max(end_date)  
  , gross_sales_price
from cte
group by gross_sales_price, grp
)
/* -- merge -- */
merge #t with (holdlock) as target
using src as source
  on target.start_date = source.start_date
 and target.end_date   = source.end_date
 and target.gross_sales_price = source.gross_sales_price
when not matched by target 
  then insert (start_date, end_date, gross_sales_price)
    values (start_date, end_date, gross_sales_price)
when not matched by source 
  then delete
output $action, inserted.*, deleted.*;
/* -- results -- */
select 
    start_date
  , end_date  
  , gross_sales_price
from #t 
order by start_date

rextester demo: http://rextester.com/MFXCQQ90933

merge output (you do not need to output this; it is shown only for the demo):

+---------+---------------------+---------------------+-------------------+---------------------+---------------------+-------------------+
| $action |     START_DATE      |      END_DATE       | GROSS_SALES_PRICE |     START_DATE      |      END_DATE       | GROSS_SALES_PRICE |
+---------+---------------------+---------------------+-------------------+---------------------+---------------------+-------------------+
| INSERT  | 2014-10-01 00:00:00 | 2014-10-06 23:59:59 | 29.99             | NULL                | NULL                | NULL              |
| INSERT  | 2014-08-06 00:00:00 | 2014-09-05 23:59:59 | 29.99             | NULL                | NULL                | NULL              |
| DELETE  | NULL                | NULL                | NULL              | 2014-08-06 00:00:00 | 2014-10-06 23:59:59 | 29.99             |
+---------+---------------------+---------------------+-------------------+---------------------+---------------------+-------------------+

Results:

+-------------------------+-------------------------+-------------------+
|       start_date        |        end_date         | gross_sales_price |
+-------------------------+-------------------------+-------------------+
| 2014-08-06 00:00:00.000 | 2014-09-05 23:59:59.997 | 29.99             |
| 2014-09-06 00:00:00.000 | 2014-09-09 23:59:59.997 | 32.99             |
| 2014-09-10 00:00:00.000 | 2014-09-30 23:59:59.997 | 32.99             |
| 2014-10-01 00:00:00.000 | 2014-10-06 23:59:59.997 | 29.99             |
| 2014-10-07 00:00:00.000 | 2049-12-31 23:59:59.997 | 34.99             |
+-------------------------+-------------------------+-------------------+

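Answer 1: (score: 1)

Besides using the datetime2 type instead of datetime, I recommend using [Closed; Open) intervals instead of [Closed; Closed]. In other words, use 2014-08-06 00:00:00.000, 2014-09-06 00:00:00.000 instead of 2014-08-06 00:00:00.000, 2014-09-05 23:59:59.000. Specifically, a time ending in 59.999 is rounded to 00.000 in the datetime type, but is not rounded in datetime2(3). You do not want to depend on such internal details of the data type.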
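To make the difference concrete, here is a small Python sketch (not SQL Server code; the helper names are hypothetical). It shows that with [Closed; Open) intervals adjacency is a plain equality, while with [Closed; Closed] intervals the check needs a step that has to match the precision used when the end dates were stored.

```python
from datetime import datetime, timedelta

def adjacent_half_open(a, b):
    """[start, end) intervals touch exactly when a's end equals b's start."""
    return a[1] == b[0]

def adjacent_closed(a, b, step):
    """[start, end] intervals touch when a's end plus one 'step' equals b's start.
    The correct step depends on the stored precision, which is the fragile part."""
    return a[1] + step == b[0]

half_open_a = (datetime(2014, 8, 6), datetime(2014, 9, 6))
half_open_b = (datetime(2014, 9, 6), datetime(2014, 9, 10))

closed_a = (datetime(2014, 8, 6), datetime(2014, 9, 5, 23, 59, 59))
closed_b = (datetime(2014, 9, 6), datetime(2014, 9, 9, 23, 59, 59))

print(adjacent_half_open(half_open_a, half_open_b))                    # True
print(adjacent_closed(closed_a, closed_b, timedelta(seconds=1)))        # True
print(adjacent_closed(closed_a, closed_b, timedelta(milliseconds=3)))   # False
```

The last call uses the wrong step for data stored with one-second ends, so two ranges that are logically adjacent are reported as not touching.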

Besides, [Closed; Open) intervals are much easier to work with in queries, as you will see below.

The main idea is to put all start and end dates (boundaries) into a single list, together with a flag indicating whether each one is the start or the end of an interval. When the running total of the flag drops to zero, all overlapping intervals have ended.

Sample data

I extended your sample data with a few cases of overlapping intervals.

declare @t table
    (START_DATE datetime2(0),
     END_DATE datetime2(0),
     GROSS_SALES_PRICE decimal(10,2)
    );

insert into @t
values
-- |------| 11
('2001-01-01 00:00:00', '2001-01-10 00:00:00', 11),

-- |------| 10
--     |------| 20
('2010-01-01 00:00:00', '2010-01-10 00:00:00', 10),
('2010-01-05 00:00:00', '2010-01-20 00:00:00', 20),

-- |----------| 30
--     |------| 40
('2010-02-01 00:00:00', '2010-02-20 00:00:00', 30),
('2010-02-05 00:00:00', '2010-02-20 00:00:00', 40),

-- |----------| 50
-- |----------| 60
('2010-03-01 00:00:00', '2010-03-20 00:00:00', 50),
('2010-03-01 00:00:00', '2010-03-20 00:00:00', 60),

-- |----------| 70
--   |------| 80
('2010-04-01 00:00:00', '2010-04-20 00:00:00', 70),
('2010-04-05 00:00:00', '2010-04-15 00:00:00', 80),

-- |-----------------------------| 29.99
--      |---------| 32.99
--                |---------| 32.99
--                               |----------| 34.99
('2014-08-06 00:00:00', '2014-10-07 00:00:00', 29.99),
('2014-09-06 00:00:00', '2014-09-10 00:00:00', 32.99),
('2014-09-10 00:00:00', '2014-10-01 00:00:00', 32.99),
('2014-10-07 00:00:00', '2050-01-01 00:00:00', 34.99);

Query

WITH
CTE_Boundaries
AS
(
    SELECT
        START_DATE AS dt
        ,+1 AS Flag
        ,GROSS_SALES_PRICE AS Price
    FROM @T

    UNION ALL

    SELECT
        END_DATE AS dt
        ,-1 AS Flag
        ,GROSS_SALES_PRICE AS Price
    FROM @T
)
,CTE_Intervals
AS
(
    SELECT
        dt
        ,Flag
        ,Price
        ,SUM(Flag) OVER (ORDER BY dt, Flag ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS SumFlag
        ,LEAD(dt) OVER (ORDER BY dt, Flag) AS NextDate
        ,LEAD(Price) OVER (ORDER BY dt, Flag) AS NextPrice
    FROM CTE_Boundaries
)
SELECT
    dt AS StartDate
    ,NextDate AS EndDate
    ,CASE WHEN Flag = 1 THEN Price ELSE NextPrice END AS Price
FROM CTE_Intervals
WHERE
    SumFlag > 0
    AND dt <> NextDate
ORDER BY StartDate
;
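The running-total idea behind this query can also be traced step by step outside SQL. The Python sketch below is an illustration only: `split_intervals` is a hypothetical name, dates are ISO strings so that plain comparison orders them, and the input is the question's data rewritten as [Closed; Open) intervals.

```python
def split_intervals(rows):
    """rows: iterable of (start, end, price) with half-open [start, end) ranges."""
    boundaries = []
    for start, end, price in rows:
        boundaries.append((start, +1, price))  # interval opens
        boundaries.append((end, -1, price))    # interval closes
    # Same ordering as ORDER BY dt, Flag: at equal dates, ends (-1) sort before starts (+1).
    boundaries.sort(key=lambda b: (b[0], b[1]))

    result = []
    running = 0  # running total of the flags, like SUM(Flag) OVER (...)
    # Pair each boundary with the next one, like LEAD(dt) and LEAD(Price).
    for (dt, flag, price), (next_dt, _, next_price) in zip(boundaries, boundaries[1:]):
        running += flag
        if running > 0 and dt != next_dt:
            result.append((dt, next_dt, price if flag == 1 else next_price))
    return result

rows = [
    ("2014-08-06", "2014-10-07", 29.99),
    ("2014-09-06", "2014-09-10", 32.99),
    ("2014-09-10", "2014-10-01", 32.99),
    ("2014-10-07", "2050-01-01", 34.99),
]
for piece in split_intervals(rows):
    print(piece)
```

The last boundary has no successor and is dropped by `zip`, which corresponds to the row whose NextDate is NULL being filtered out by the `dt <> NextDate` condition.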

Result (the last five rows are your sample data)

+---------------------+---------------------+-------+
|      StartDate      |       EndDate       | Price |
+---------------------+---------------------+-------+
| 2001-01-01 00:00:00 | 2001-01-10 00:00:00 | 11.00 |
| 2010-01-01 00:00:00 | 2010-01-05 00:00:00 | 10.00 |
| 2010-01-05 00:00:00 | 2010-01-10 00:00:00 | 20.00 |
| 2010-01-10 00:00:00 | 2010-01-20 00:00:00 | 20.00 |
| 2010-02-01 00:00:00 | 2010-02-05 00:00:00 | 30.00 |
| 2010-02-05 00:00:00 | 2010-02-20 00:00:00 | 40.00 |
| 2010-03-01 00:00:00 | 2010-03-20 00:00:00 | 60.00 |
| 2010-04-01 00:00:00 | 2010-04-05 00:00:00 | 70.00 |
| 2010-04-05 00:00:00 | 2010-04-15 00:00:00 | 80.00 |
| 2010-04-15 00:00:00 | 2010-04-20 00:00:00 | 70.00 |
| 2014-08-06 00:00:00 | 2014-09-06 00:00:00 | 29.99 |
| 2014-09-06 00:00:00 | 2014-09-10 00:00:00 | 32.99 |
| 2014-09-10 00:00:00 | 2014-10-01 00:00:00 | 32.99 |
| 2014-10-01 00:00:00 | 2014-10-07 00:00:00 | 29.99 |
| 2014-10-07 00:00:00 | 2050-01-01 00:00:00 | 34.99 |
+---------------------+---------------------+-------+

Intermediate result of CTE_Intervals

Examine these rows to understand how the query works.

+---------------------+------+-------+---------+---------------------+-----------+
|         dt          | Flag | Price | SumFlag |      NextDate       | NextPrice |
+---------------------+------+-------+---------+---------------------+-----------+
| 2001-01-01 00:00:00 |    1 | 11.00 |       1 | 2001-01-10 00:00:00 | 11.00     |
| 2001-01-10 00:00:00 |   -1 | 11.00 |       0 | 2010-01-01 00:00:00 | 10.00     |
| 2010-01-01 00:00:00 |    1 | 10.00 |       1 | 2010-01-05 00:00:00 | 20.00     |
| 2010-01-05 00:00:00 |    1 | 20.00 |       2 | 2010-01-10 00:00:00 | 10.00     |
| 2010-01-10 00:00:00 |   -1 | 10.00 |       1 | 2010-01-20 00:00:00 | 20.00     |
| 2010-01-20 00:00:00 |   -1 | 20.00 |       0 | 2010-02-01 00:00:00 | 30.00     |
| 2010-02-01 00:00:00 |    1 | 30.00 |       1 | 2010-02-05 00:00:00 | 40.00     |
| 2010-02-05 00:00:00 |    1 | 40.00 |       2 | 2010-02-20 00:00:00 | 30.00     |
| 2010-02-20 00:00:00 |   -1 | 30.00 |       1 | 2010-02-20 00:00:00 | 40.00     |
| 2010-02-20 00:00:00 |   -1 | 40.00 |       0 | 2010-03-01 00:00:00 | 50.00     |
| 2010-03-01 00:00:00 |    1 | 50.00 |       1 | 2010-03-01 00:00:00 | 60.00     |
| 2010-03-01 00:00:00 |    1 | 60.00 |       2 | 2010-03-20 00:00:00 | 50.00     |
| 2010-03-20 00:00:00 |   -1 | 50.00 |       1 | 2010-03-20 00:00:00 | 60.00     |
| 2010-03-20 00:00:00 |   -1 | 60.00 |       0 | 2010-04-01 00:00:00 | 70.00     |
| 2010-04-01 00:00:00 |    1 | 70.00 |       1 | 2010-04-05 00:00:00 | 80.00     |
| 2010-04-05 00:00:00 |    1 | 80.00 |       2 | 2010-04-15 00:00:00 | 80.00     |
| 2010-04-15 00:00:00 |   -1 | 80.00 |       1 | 2010-04-20 00:00:00 | 70.00     |
| 2010-04-20 00:00:00 |   -1 | 70.00 |       0 | 2014-08-06 00:00:00 | 29.99     |
| 2014-08-06 00:00:00 |    1 | 29.99 |       1 | 2014-09-06 00:00:00 | 32.99     |
| 2014-09-06 00:00:00 |    1 | 32.99 |       2 | 2014-09-10 00:00:00 | 32.99     |
| 2014-09-10 00:00:00 |   -1 | 32.99 |       1 | 2014-09-10 00:00:00 | 32.99     |
| 2014-09-10 00:00:00 |    1 | 32.99 |       2 | 2014-10-01 00:00:00 | 32.99     |
| 2014-10-01 00:00:00 |   -1 | 32.99 |       1 | 2014-10-07 00:00:00 | 29.99     |
| 2014-10-07 00:00:00 |   -1 | 29.99 |       0 | 2014-10-07 00:00:00 | 34.99     |
| 2014-10-07 00:00:00 |    1 | 34.99 |       1 | 2050-01-01 00:00:00 | 34.99     |
| 2050-01-01 00:00:00 |   -1 | 34.99 |       0 | NULL                | NULL      |
+---------------------+------+-------+---------+---------------------+-----------+

Answer 2: (score: 0)

How about using LEAD to look up the next row's value:

SELECT START_DATE, 
    CASE 
        WHEN LEAD(Start_Date) OVER (ORDER BY Start_Date) < END_DATE 
        THEN COALESCE(DATEADD(s, -1, LEAD(Start_Date) OVER (ORDER BY Start_Date)), END_Date)
        ELSE END_DATE END AS End_Date,
    GROSS_SALES_PRICE
 FROM @t
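To see what this LEAD-based clipping does to the question's data, here is a rough Python equivalent. It is an illustration only: `clip_to_next_start` is a hypothetical helper, and the look-ahead by index stands in for LEAD(Start_Date) OVER (ORDER BY Start_Date).

```python
from datetime import datetime, timedelta

def clip_to_next_start(rows):
    """rows: list of (start, end, price). Ends are cut one second before the next start."""
    rows = sorted(rows, key=lambda r: r[0])
    clipped = []
    for i, (start, end, price) in enumerate(rows):
        # LEAD returns NULL on the last row; None plays that role here.
        next_start = rows[i + 1][0] if i + 1 < len(rows) else None
        if next_start is not None and next_start < end:
            end = next_start - timedelta(seconds=1)
        clipped.append((start, end, price))
    return clipped

rows = [
    (datetime(2014, 8, 6), datetime(2014, 10, 6, 23, 59, 59), 29.99),
    (datetime(2014, 9, 6), datetime(2014, 9, 9, 23, 59, 59), 32.99),
    (datetime(2014, 9, 10), datetime(2014, 9, 30, 23, 59, 59), 32.99),
    (datetime(2014, 10, 7), datetime(2049, 12, 31, 23, 59, 59), 34.99),
]
for start, end, price in clip_to_next_start(rows):
    print(start, end, price)
```

The first row now ends at 2014-09-05 23:59:59, but the output still has only four rows: the remainder of the first range, 2014-10-01 through 2014-10-06, is not produced by clipping alone.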

Or with a common table expression:

;WITH CTE
 AS
 (
    SELECT Start_date,
           End_Date,
           LEAD(Start_Date) OVER (ORDER BY Start_Date) AS NextStartDate,
           GROSS_SALES_PRICE
    FROM @t
)
SELECT START_DATE,
        CASE WHEN NextStartDate < END_DATE 
            THEN Coalesce(DATEADD(s, -1, NextStartDate), End_Date) 
            ELSE End_date END As End_Date,
        GROSS_SALES_PRICE
FROM CTE
  

Updated to add the missing row:

;WITH CTE
 AS
 (
    SELECT Start_date,
           End_Date,
           LAG(END_Date) OVER (ORDER BY Start_Date) AS PreviousEndDate,
           LEAD(Start_Date) OVER (ORDER BY Start_Date) AS NextStartDate,
           GROSS_SALES_PRICE
    FROM @t
)
SELECT START_DATE,
        CASE WHEN NextStartDate < END_DATE 
            THEN Coalesce(DATEADD(s, -1, NextStartDate), End_Date) 
            ELSE End_date END As End_Date,
        GROSS_SALES_PRICE
FROM CTE
UNION ALL
SELECT DATEADD(s, 1, PreviousEndDate), DATEADD(s, -1, Start_Date), GROSS_SALES_PRICE
FROM CTE
WHERE DATEDIFF(s, PreviousEndDate,Start_Date) > 1
ORDER BY 1

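Answer 3: (score: 0)

Note: the following solution makes a few assumptions:

[1] It uses the LEAD function => SQL2012+

[2] All DATETIME columns are mandatory => NOT NULL

[3] All DATETIME values (across both columns) are unique.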

select y.*
from (
    select t.ID, x.DT AS NEW_START_DATE, DATEADD(MILLISECOND, -3, LEAD(x.DT) OVER(ORDER BY x.DT ASC)) AS NEW_END_DATE
    from @t as t
    outer apply (
        select t.START_DATE, 1
        union all
        select t.END_DATE, 2 
    ) as x(DT, [TYPE])
) as y
where y.NEW_END_DATE IS NOT NULL
order by y.NEW_START_DATE

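Answer 4: (score: 0)

This can be solved with simple joins and unions. An ID column would make it better, though; the common table expression is used only to add an ID.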

declare @t table(START_DATE datetime,END_DATE datetime, GROSS_SALES_PRICE 
 decimal(10,2));
insert into @t values
 ( '2014-08-06 00:00:00.000',   '2014-10-06 23:59:59.000',  29.99),
 ( '2014-09-06 00:00:00.000',   '2014-09-09 23:59:59.000',  32.99),
 ( '2014-09-10 00:00:00.000',   '2014-09-30 23:59:59.000',  32.99),
 ( '2014-10-07 00:00:00.000',   '2049-12-31 23:59:59.000',  34.99)

;with t_cte as
(select row_number() over( order by start_date,end_date,GROSS_SALES_PRICE) ID,*
from @t
)

select t1.start_date,min(t2.start_date),t1.GROSS_SALES_PRICE
from t_cte t1
join t_cte t2 on t1.END_DATE > t2.START_DATE and t1.END_DATE> t2.START_DATE and t1.id< t2.id
group by t1.START_DATE,t1.END_DATE,t1.GROSS_SALES_PRICE
union all
select min(t2.start_date),t1.end_date,t1.GROSS_SALES_PRICE
from t_cte t1
join t_cte t2 on t1.END_DATE > t2.START_DATE and t1.END_DATE> t2.START_DATE and t1.id< t2.id
group by t1.START_DATE,t1.END_DATE,t1.GROSS_SALES_PRICE
union all
select t1.start_date,t1.END_DATE,t1.GROSS_SALES_PRICE
from t_cte t1
left join t_cte t2 on t1.END_DATE > t2.START_DATE and t1.END_DATE> t2.START_DATE and t1.id< t2.id
where t2.id is null
order by 1,2,3