How to partition by range

Time: 2017-07-10 19:20:37

Tags: tsql sql-server-2016

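I am trying to split this table into 3 partitions and add a column showing which partition each row belongs to. The table keeps the history of a document by adding new rows and setting IsDeleted = 1 on the old ones. As you can see, each revision of a document deletes all rows of the previous version and recreates them with new line numbers.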

I'm not sure where to start, as I haven't used the PARTITION BY clause before. Any help is greatly appreciated.

Current table:

+----+----------------+------------+-----------+-------------------------+
| ID | DocumentNumber | LineNumber | IsDeleted |       CreatedDate       |
+----+----------------+------------+-----------+-------------------------+
|  1 | D001           |          1 |         1 | 2017-01-20 14:10:13.533 |
|  2 | D001           |          2 |         1 | 2017-01-20 14:10:13.533 |
|  3 | D001           |          3 |         1 | 2017-01-20 14:10:13.533 |
|  4 | D001           |          4 |         1 | 2017-01-20 14:10:13.533 |
|  5 | D001           |          1 |         1 | 2017-01-21 12:11:14.500 |
|  6 | D001           |          2 |         1 | 2017-01-21 12:11:14.500 |
|  7 | D001           |          1 |         0 | 2017-01-21 15:20:20.222 |
|  8 | D001           |          2 |         0 | 2017-01-21 15:21:21.111 |
+----+----------------+------------+-----------+-------------------------+

Expected result:

+----+----------------+------------+-----------+-------------------------+-----------------+
| ID | DocumentNumber | LineNumber | IsDeleted |       CreatedDate       | PartitionNumber |
+----+----------------+------------+-----------+-------------------------+-----------------+
|  1 | D001           |          1 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  2 | D001           |          2 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  3 | D001           |          3 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  4 | D001           |          4 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  5 | D001           |          1 |         1 | 2017-01-21 12:11:14.500 |               2 |
|  6 | D001           |          2 |         1 | 2017-01-21 12:11:14.500 |               2 |
|  7 | D001           |          1 |         0 | 2017-01-21 15:20:20.222 |               3 |
|  8 | D001           |          2 |         0 | 2017-01-21 15:21:21.111 |               3 |
+----+----------------+------------+-----------+-------------------------+-----------------+

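Update

Building on Jason's answer, I added a PARTITION BY clause so that the numbering restarts for each document in my table. I hope this helps someone in the future.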

SELECT ID,
       DocumentNumber,
       LineNumber,
       IsDeleted,
       CreatedDate,
       SUM(CASE WHEN LineNumber = 1 THEN 1 ELSE 0 END) 
       OVER (PARTITION BY DocumentNumber ORDER BY CreatedDate) 
       AS 'PartitionNumber'
FROM CurrentTable
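
As a rough illustration of the restart behaviour, here is a small sketch with two documents. The table variable @Docs and the D002 rows are invented for this example and are not part of the original data. It also assumes that ID increases in insertion order, so it is added as a tiebreaker and an explicit ROWS frame is used to keep the running total deterministic when several rows share the same CreatedDate.

```sql
-- Hypothetical sample data; D002 is invented for illustration only.
DECLARE @Docs TABLE (
    ID             int,
    DocumentNumber varchar(50),
    LineNumber     int,
    IsDeleted      bit,
    CreatedDate    datetime
);

INSERT INTO @Docs (ID, DocumentNumber, LineNumber, IsDeleted, CreatedDate)
VALUES (1, 'D001', 1, 1, '2017-01-20T14:10:13.533'),
       (2, 'D001', 2, 1, '2017-01-20T14:10:13.533'),
       (3, 'D001', 1, 0, '2017-01-21T15:20:20.223'),
       (4, 'D002', 1, 1, '2017-01-22T09:00:00.000'),
       (5, 'D002', 1, 0, '2017-01-23T09:00:00.000');

SELECT ID, DocumentNumber, LineNumber, IsDeleted, CreatedDate,
       -- Count the "LineNumber = 1" rows seen so far within each document.
       SUM(CASE WHEN LineNumber = 1 THEN 1 ELSE 0 END)
           OVER (PARTITION BY DocumentNumber
                 ORDER BY CreatedDate, ID      -- ID as tiebreaker is an assumption
                 ROWS UNBOUNDED PRECEDING) AS PartitionNumber
FROM @Docs
ORDER BY DocumentNumber, ID;
```

With this data, D001 should get the values 1, 1, 2 and D002 should start again at 1 and then move to 2.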

3 answers:

Answer 0: (score: 1)

I got the result you want by doing this:

SELECT ID,DocumentNumber,LineNumber,IsDeleted,CreatedDate, 
       SUM(CASE WHEN LineNumber = 1 THEN 1 ELSE 0 END) 
       OVER (ORDER BY ID,DocumentNumber,LineNumber,IsDeleted,CreatedDate) 
       AS 'PartitionNumber'
FROM CurrentTable
GROUP BY ID,DocumentNumber,LineNumber,IsDeleted,CreatedDate

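I used SUM with CASE to assign the value 1 to every row whose LineNumber is 1 and 0 to all other rows. I then used a window function to compute a running total.

Result: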

+----+----------------+------------+-----------+-------------------------+----------------+
| ID | DocumentNumber | LineNumber | IsDeleted |     CreatedDate         | PartitionNumber|
+----+----------------+------------+-----------+-------------------------+----------------+
| 1  |     D001       |     1      |    1      | 2017-01-20 14:10:13.533 |       1        |
| 2  |     D001       |     2      |    1      | 2017-01-20 14:10:13.533 |       1        |
| 3  |     D001       |     3      |    1      | 2017-01-20 14:10:13.533 |       1        |
| 4  |     D001       |     4      |    1      | 2017-01-20 14:10:13.533 |       1        |
| 5  |     D001       |     1      |    1      | 2017-01-21 12:11:14.500 |       2        |
| 6  |     D001       |     2      |    1      | 2017-01-21 12:11:14.500 |       2        |
| 7  |     D001       |     1      |    0      | 2017-01-21 15:20:20.223 |       3        |
| 8  |     D001       |     2      |    0      | 2017-01-21 15:21:21.110 |       3        |
+----+----------------+------------+-----------+-------------------------+----------------+
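
To make the running-total idea easier to follow, this minimal sketch shows the intermediate 0/1 flag next to the accumulated sum. It uses only the ID and LineNumber values from the question; the column name StartsRevision and the inline VALUES list are introduced here for illustration, and ordering by ID alone assumes that ID reflects insertion order.

```sql
SELECT v.ID,
       v.LineNumber,
       -- 1 marks the first line of a revision, 0 marks every other line.
       CASE WHEN v.LineNumber = 1 THEN 1 ELSE 0 END AS StartsRevision,
       -- Summing the flags row by row increases the number at each new revision.
       SUM(CASE WHEN v.LineNumber = 1 THEN 1 ELSE 0 END)
           OVER (ORDER BY v.ID ROWS UNBOUNDED PRECEDING) AS PartitionNumber
FROM (VALUES (1, 1), (2, 2), (3, 3), (4, 4),
             (5, 1), (6, 2),
             (7, 1), (8, 2)) AS v (ID, LineNumber)
ORDER BY v.ID;
```

The flag column reads 1, 0, 0, 0, 1, 0, 1, 0 and the running total reads 1, 1, 1, 1, 2, 2, 3, 3, which matches the expected PartitionNumber.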

Answer 1: (score: 1)

Is CreatedDate the same for every row within a partition? In partition 3, for example, it differs. If it were the same, you could use DENSE_RANK():

SELECT *,
       -- Rank the distinct CreatedDate values within each document
       DENSE_RANK() OVER (PARTITION BY DocumentNumber ORDER BY CreatedDate) AS PartitionNumber
FROM CurrentTable
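
To show the caveat raised in this answer, here is a small sketch that ranks by CreatedDate within each document, using the rows from the question. The table variable @Docs is a name chosen for this example. Because rows 7 and 8 have different timestamps, they should land in separate groups rather than sharing group 3.

```sql
-- Rows copied from the question; @Docs is a hypothetical name.
DECLARE @Docs TABLE (
    ID             int,
    DocumentNumber varchar(50),
    CreatedDate    datetime
);

INSERT INTO @Docs (ID, DocumentNumber, CreatedDate)
VALUES (1, 'D001', '2017-01-20T14:10:13.533'),
       (2, 'D001', '2017-01-20T14:10:13.533'),
       (3, 'D001', '2017-01-20T14:10:13.533'),
       (4, 'D001', '2017-01-20T14:10:13.533'),
       (5, 'D001', '2017-01-21T12:11:14.500'),
       (6, 'D001', '2017-01-21T12:11:14.500'),
       (7, 'D001', '2017-01-21T15:20:20.222'),
       (8, 'D001', '2017-01-21T15:21:21.111');

SELECT ID, DocumentNumber, CreatedDate,
       -- Each distinct CreatedDate within a document gets the next rank.
       DENSE_RANK() OVER (PARTITION BY DocumentNumber
                          ORDER BY CreatedDate) AS PartitionNumber
FROM @Docs
ORDER BY ID;
```

Rows 1 to 4 get 1 and rows 5 and 6 get 2, but row 7 gets 3 and row 8 gets 4, so this approach only fits data where every row of a revision carries an identical CreatedDate.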

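Answer 2: (score: 0)

I think I follow you. The query below gives what you want, but if there is more data it will produce more than 3 partitions, which I assume is what you expect.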

if object_id('tempdb.dbo.#test') is not null drop table #test
create table #test
(
    id int,
    linenumber int,
    isdeleted bit,
    createddate datetime,
    documentnumber varchar(50)
)

insert into #test
select 1 ,  1 ,         1 , '2017-01-20 14:10:13.533', 'D001'
union all select 2 ,  2 ,         1 , '2017-01-20 14:10:13.533', 'D001'
union all select 3 ,  3 ,         1 , '2017-01-20 14:10:13.533', 'D001'
union all select 4 ,  4 ,         1 , '2017-01-20 14:10:13.533', 'D001'
union all select 5 ,  1 ,         1 , '2017-01-21 12:11:14.500', 'D001'
union all select 6 ,  2 ,         1 , '2017-01-21 12:11:14.500', 'D001'
union all select 7 ,  1 ,         0 , '2017-01-21 15:20:20.222', 'D001'
union all select 8 ,  2 ,         0 , '2017-01-21 15:21:21.111', 'D001'


select 
    *, 
    DENSE_RANK() over (partition by documentNumber order by isdeleted desc, case when isdeleted=0 then getdate() else createddate end) as partitionValues
from #test