I am trying to split this table into 3 partitions and create a column that records which partition each row belongs to. The table keeps historical data about documents by adding new rows and setting IsDeleted = 1 on the old ones. As you can see, every revision of a document deletes all rows of the previous version and recreates them with new line numbers.
I am not sure where to start, since I have not worked with partitioning clauses before, so any help is greatly appreciated.
Current table:
+----+----------------+------------+-----------+-------------------------+
| ID | DocumentNumber | LineNumber | IsDeleted | CreatedDate             |
+----+----------------+------------+-----------+-------------------------+
|  1 | D001           |          1 |         1 | 2017-01-20 14:10:13.533 |
|  2 | D001           |          2 |         1 | 2017-01-20 14:10:13.533 |
|  3 | D001           |          3 |         1 | 2017-01-20 14:10:13.533 |
|  4 | D001           |          4 |         1 | 2017-01-20 14:10:13.533 |
|  5 | D001           |          1 |         1 | 2017-01-21 12:11:14.500 |
|  6 | D001           |          2 |         1 | 2017-01-21 12:11:14.500 |
|  7 | D001           |          1 |         0 | 2017-01-21 15:20:20.222 |
|  8 | D001           |          2 |         0 | 2017-01-21 15:21:21.111 |
+----+----------------+------------+-----------+-------------------------+
Expected result:
+----+----------------+------------+-----------+-------------------------+-----------------+
| ID | DocumentNumber | LineNumber | IsDeleted | CreatedDate             | PartitionNumber |
+----+----------------+------------+-----------+-------------------------+-----------------+
|  1 | D001           |          1 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  2 | D001           |          2 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  3 | D001           |          3 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  4 | D001           |          4 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  5 | D001           |          1 |         1 | 2017-01-21 12:11:14.500 |               2 |
|  6 | D001           |          2 |         1 | 2017-01-21 12:11:14.500 |               2 |
|  7 | D001           |          1 |         0 | 2017-01-21 15:20:20.222 |               3 |
|  8 | D001           |          2 |         0 | 2017-01-21 15:21:21.111 |               3 |
+----+----------------+------------+-----------+-------------------------+-----------------+
Update
Building on Jason's answer, I added a PARTITION BY clause so that the running count resets for each document in my table. I hope this helps someone in the future.
SELECT ID,
DocumentNumber,
LineNumber,
IsDeleted,
CreatedDate,
SUM(CASE WHEN LineNumber = 1 THEN 1 ELSE 0 END)
OVER (PARTITION BY DocumentNumber ORDER BY CreatedDate)
AS 'PartitionNumber'
FROM CurrentTable
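If you also want to store this value rather than compute it at query time, here is a minimal sketch of a one-time backfill, assuming the table really is named CurrentTable as above and that the PartitionNumber column does not exist yet (the updatable-CTE pattern is standard T-SQL; run the two statements as separate batches):

ALTER TABLE CurrentTable ADD PartitionNumber INT NULL;
GO

-- Write the computed running total into the new column through an
-- updatable CTE; SQL Server allows this because the CTE reads a
-- single base table.
WITH numbered AS (
    SELECT PartitionNumber,
           SUM(CASE WHEN LineNumber = 1 THEN 1 ELSE 0 END)
               OVER (PARTITION BY DocumentNumber ORDER BY CreatedDate) AS ComputedPartition
    FROM CurrentTable
)
UPDATE numbered
SET PartitionNumber = ComputedPartition;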
Answer 0 (score: 1)
Doing it like this gets you what you want:
SELECT ID,
       DocumentNumber,
       LineNumber,
       IsDeleted,
       CreatedDate,
       SUM(CASE WHEN LineNumber = 1 THEN 1 ELSE 0 END)
           OVER (ORDER BY ID, DocumentNumber, LineNumber, IsDeleted, CreatedDate)
           AS 'PartitionNumber'
FROM CurrentTable
GROUP BY ID, DocumentNumber, LineNumber, IsDeleted, CreatedDate
I use SUM and CASE to assign a value of 1 to every row whose LineNumber is 1 and 0 to every other row, then use a window function to compute a running total of those values. For the sample data the CASE expression yields 1, 0, 0, 0, 1, 0, 1, 0 (in ID order), so the running total is 1, 1, 1, 1, 2, 2, 3, 3, exactly the partition numbers you want.
Result:
+----+----------------+------------+-----------+-------------------------+-----------------+
| ID | DocumentNumber | LineNumber | IsDeleted | CreatedDate             | PartitionNumber |
+----+----------------+------------+-----------+-------------------------+-----------------+
|  1 | D001           |          1 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  2 | D001           |          2 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  3 | D001           |          3 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  4 | D001           |          4 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  5 | D001           |          1 |         1 | 2017-01-21 12:11:14.500 |               2 |
|  6 | D001           |          2 |         1 | 2017-01-21 12:11:14.500 |               2 |
|  7 | D001           |          1 |         0 | 2017-01-21 15:20:20.222 |               3 |
|  8 | D001           |          2 |         0 | 2017-01-21 15:21:21.111 |               3 |
+----+----------------+------------+-----------+-------------------------+-----------------+
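One detail worth knowing about this approach: with only an ORDER BY, SUM() OVER defaults to the frame RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which treats rows with equal sort keys as peers and gives them the same running total. Ordering by the unique ID avoids ties entirely, and spelling the frame out makes that explicit. A minimal equivalent variant, again against the question's CurrentTable:

SELECT ID, DocumentNumber, LineNumber, IsDeleted, CreatedDate,
       -- Explicit ROWS frame; with a unique ORDER BY key it behaves
       -- exactly like the default RANGE frame.
       SUM(CASE WHEN LineNumber = 1 THEN 1 ELSE 0 END)
           OVER (ORDER BY ID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
           AS PartitionNumber
FROM CurrentTable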
Answer 1 (score: 1)
Is the CreatedDate the same within each partition? In partition 3 it is not. If it were the same, you could use DENSE_RANK():
SELECT *,
       DENSE_RANK() OVER (PARTITION BY DocumentNumber ORDER BY CreatedDate) AS PartitionNumber
FROM CurrentTable
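On the sample data this assigns partitions 1 and 2 correctly, but because rows 7 and 8 were created at different times they receive ranks 3 and 4 instead of sharing partition 3, which is exactly the caveat above. A minimal illustration, assuming the question's CurrentTable:

SELECT ID, CreatedDate,
       DENSE_RANK() OVER (PARTITION BY DocumentNumber ORDER BY CreatedDate) AS PartitionNumber
FROM CurrentTable
-- IDs 1-4 get 1, IDs 5-6 get 2, ID 7 gets 3, and ID 8 gets 4 (not the
-- desired 3), because 15:20:20.222 and 15:21:21.111 are distinct keys.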
Answer 2 (score: 0)
I think I follow you. The query below gives what you want, although with more data it will produce more than 3 partitions, which I assume is the intent.
if object_id('tempdb.dbo.#test') is not null drop table #test
create table #test
(
id int,
linenumber int,
isdeleted bit,
createddate datetime,
documentnumber varchar(50)
)
insert into #test
select 1 , 1 , 1 , '2017-01-20 14:10:13.533', 'D001'
union all select 2 , 2 , 1 , '2017-01-20 14:10:13.533', 'D001'
union all select 3 , 3 , 1 , '2017-01-20 14:10:13.533', 'D001'
union all select 4 , 4 , 1 , '2017-01-20 14:10:13.533', 'D001'
union all select 5 , 1 , 1 , '2017-01-21 12:11:14.500', 'D001'
union all select 6 , 2 , 1 , '2017-01-21 12:11:14.500', 'D001'
union all select 7 , 1 , 0 , '2017-01-21 15:20:20.222', 'D001'
union all select 8 , 2 , 0 , '2017-01-21 15:21:21.111', 'D001'
-- isdeleted desc sorts the deleted (1) rows first by revision date;
-- the CASE collapses every active (0) row onto one key so they all
-- share a single dense rank
select
    *,
    DENSE_RANK() over (partition by documentnumber
                       order by isdeleted desc,
                                case when isdeleted = 0 then getdate() else createddate end) as partitionValues
from #test
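The CASE in the ORDER BY is the interesting part: GETDATE() is evaluated once per query, so every IsDeleted = 0 row maps to the same sort key, ties, and shares one dense rank. If you would rather not rely on that, a fixed sentinel date gives the same collapsing behaviour; a small variant against the same #test table (the sentinel value is arbitrary, any date later than all real CreatedDate values works):

select
    *,
    -- deleted rows still sort first by revision date; all active rows
    -- share the sentinel key and therefore one rank
    DENSE_RANK() over (partition by documentnumber
                       order by isdeleted desc,
                                case when isdeleted = 0
                                     then cast('9999-12-31' as datetime)
                                     else createddate end) as partitionValues
from #test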