T-SQL-对运行总计进行分区

时间:2018-10-15 02:21:20

标签: sql sql-server tsql window-functions rank

我编写了一个查询,返回以mb为单位的单个记录的大小。这些记录包含Blob数据。

我想将记录分成50mb个批次。

SELECT SourceId, Title, Description,
           SUM(DATALENGTH(VersionData) * 0.000001) OVER (PARTITION BY DATALENGTH(SourceId) ORDER BY SourceId) AS RunningTotal,
           RANK() OVER(ORDER BY SourceId) AS RowNo
FROM TargetContentVersion WITH(NOLOCK)

此查询当前返回的数据如下所示,其中RunningTotal是记录总数(以mb为单位):

SourceId            Title                                           RunningTotalRowNo
00Pf4000006gna3EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_37_32).pdf    5.242880    1
00Pf4000006gna8EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_37_38).doc    6.291456    2
00Pf4000006gnacEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_38_44).pdf    7.340032    3
00Pf4000006gnaDEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_41).doc    12.582912   4
00Pf4000006gnahEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_38_47).pdf    17.825792   5
00Pf4000006gnaIEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_46).doc    23.068672   6
00Pf4000006gnamEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_38_54).pdf    33.554432   7
00Pf4000006gnaNEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_52).txt    34.603008   8
00Pf4000006gnarEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_20).doc    35.651584   9
00Pf4000006gnaSEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_55).txt    40.894464   10
00Pf4000006gnawEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_24).doc    46.137344   11
00Pf4000006gnaXEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_38_0).txt 51.380224   12
00Pf4000006gnb1EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_30).doc    61.865984   13
00Pf4000006gnb6EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_50).txt    62.914560   14
00Pf4000006gnbaEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_29).doc    68.157440   15
00Pf4000006gnbBEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_39_58).txt    78.643200   16
00Pf4000006gnbfEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_34).doc    89.128960   17
00Pf4000006gnbGEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_7).pdf 90.177536   18
00Pf4000006gnbkEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_43).txt    91.226112   19
00Pf4000006gnbLEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_12).pdf    96.468992   20
00Pf4000006gnbpEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_46).txt    101.711872  21
00Pf4000006gnbQEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_17).pdf    112.197632  22
00Pf4000006gnbuEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_52).txt    122.683392  23
00Pf4000006gnbVEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_26).doc    123.731968  24
00Pf4000006gnbzEAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_0).pdf 124.780544  25
00Pf4000006gnc4EAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_5).pdf 130.023424  26
00Pf4000006gnc9EAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_11).pdf    140.509184  27
00Pf4000006gncdEAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_56).txt    145.752064  28
00Pf4000006gncEEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_30).doc    146.800640  29
00Pf4000006gnciEAA  001f400000ZP5yWAAT_3 Oct 2018 (14_42_3).txt 157.286400  30
00Pf4000006gncJEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_33).doc    162.529280  31
00Pf4000006gncKEAQ  001f400000ZP5ycAAD_3 Oct 2018 (14_48_11).txt    173.015040  32
00Pf4000006gncnEAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_12).pdf    174.063616  33
00Pf4000006gncsEAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_15).pdf    179.306496  34
00Pf4000006gncTEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_44).doc    189.792256  35
00Pf4000006gncxEAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_30).pdf    200.278016  36
00Pf4000006gncYEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_53).txt    201.326592  37
00Pf4000006gnd2EAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_46).doc    202.375168  38
00Pf4000006gnd7EAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_49).doc    207.618048  39
00Pf4000006gndbEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_23).pdf    212.860928  40
00Pf4000006gndCEAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_42_54).doc    223.346688  41
00Pf4000006gndgEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_30).pdf    233.832448  42
00Pf4000006gnDhEAI  Snake_River_(5mb).jpg   239.077777  43
00Pf4000006gndHEAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_43_3).txt 240.126353  44
00Pf4000006gndlEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_39).doc    241.174929  45
00Pf4000006gndMEAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_43_6).txt 246.417809  46
00Pf4000006gndqEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_41).doc    251.660689  47
00Pf4000006gnDrEAI  Pizigani_1367_Chart_10MB.jpg    261.835395  48
00Pf4000006gndREAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_43_11).txt    272.321155  49
00Pf4000006gndvEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_47).doc    282.806915  50
00Pf4000006gnDwEAI  Spinner_Dolphin_Indian_Ocean_07-2017.jpg    284.109019  51
00Pf4000006gndWEAQ  001f400000ZP5yYAAT_3 Oct 2018 (14_43_20).pdf    285.157595  52
00Pf4000006gnDXEAY  440 Kb.jpg  285.609143  53
00Pf4000006gne0EAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_59).txt    286.657719  54
00Pf4000006gne5EAA  001f400000ZP5yYAAT_3 Oct 2018 (14_44_2).txt 291.900599  55
00Pf4000006gneaEAA  001f400000ZP5yZAAT_3 Oct 2018 (14_44_59).txt    302.386359  56
00Pf4000006gneAEAQ  001f400000ZP5yYAAT_3 Oct 2018 (14_44_7).txt 312.872119  57
00Pf4000006gneeEAA  001f400000ZP5yZAAT_3 Oct 2018 (14_44_40).doc    323.357879  58

我希望结果看起来像这样,将它们分成50mb批:

  SourceId          Title                                           RunningTotalRowNo Batch
00Pf4000006gna3EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_37_32).pdf    5.242880    1     1
00Pf4000006gna8EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_37_38).doc    6.291456    2     1
00Pf4000006gnacEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_38_44).pdf    7.340032    3     1
00Pf4000006gnaDEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_41).doc    12.582912   4     1
00Pf4000006gnahEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_38_47).pdf    17.825792   5     1
00Pf4000006gnaIEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_46).doc    23.068672   6     1
00Pf4000006gnamEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_38_54).pdf    33.554432   7     1
00Pf4000006gnaNEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_52).txt    34.603008   8     1
00Pf4000006gnarEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_20).doc    35.651584   9     1
00Pf4000006gnaSEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_55).txt    40.894464   10    1
00Pf4000006gnawEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_24).doc    46.137344   11    1
00Pf4000006gnaXEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_38_0).txt     51.380224   12    1
00Pf4000006gnb1EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_30).doc    61.865984   13    2
00Pf4000006gnb6EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_50).txt    62.914560   14    2
00Pf4000006gnbaEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_29).doc    68.157440   15    2
00Pf4000006gnbBEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_39_58).txt    78.643200   16    2
00Pf4000006gnbfEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_34).doc    89.128960   17    2
00Pf4000006gnbGEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_7).pdf     90.177536   18    2
00Pf4000006gnbkEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_43).txt    91.226112   19    2
00Pf4000006gnbLEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_12).pdf    96.468992   20    2
00Pf4000006gnbpEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_46).txt    101.711872  21    3
00Pf4000006gnbQEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_17).pdf    112.197632  22    3
00Pf4000006gnbuEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_52).txt    122.683392  23    3
00Pf4000006gnbVEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_26).doc    123.731968  24    3
00Pf4000006gnbzEAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_0).pdf     124.780544  25    3
00Pf4000006gnc4EAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_5).pdf    130.023424   26    3
00Pf4000006gnc9EAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_11).pdf    140.509184  27    3
00Pf4000006gncdEAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_56).txt    145.752064  28    3
00Pf4000006gncEEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_30).doc    146.800640  29    3
00Pf4000006gnciEAA  001f400000ZP5yWAAT_3 Oct 2018 (14_42_3).txt    157.286400   30    4
00Pf4000006gncJEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_33).doc    162.529280  31    4
00Pf4000006gncKEAQ  001f400000ZP5ycAAD_3 Oct 2018 (14_48_11).txt    173.015040  32    4
00Pf4000006gncnEAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_12).pdf    174.063616  33    4
00Pf4000006gncsEAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_15).pdf    179.306496  34    4
00Pf4000006gncTEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_44).doc    189.792256  35    4
00Pf4000006gncxEAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_30).pdf    200.278016  36    5
00Pf4000006gncYEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_53).txt    201.326592  37    5
00Pf4000006gnd2EAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_46).doc    202.375168  38    5
00Pf4000006gnd7EAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_49).doc    207.618048  39    5
00Pf4000006gndbEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_23).pdf    212.860928  40    5
00Pf4000006gndCEAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_42_54).doc    223.346688  41    5
00Pf4000006gndgEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_30).pdf    233.832448  42    5
00Pf4000006gnDhEAI  Snake_River_(5mb).jpg                          239.077777   43    5
00Pf4000006gndHEAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_43_3).txt    240.126353   44    5
00Pf4000006gndlEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_39).doc    241.174929  45    5
00Pf4000006gndMEAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_43_6).txt   246.417809    46    5
00Pf4000006gndqEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_41).doc    251.660689  47    6
00Pf4000006gnDrEAI  Pizigani_1367_Chart_10MB.jpg                   261.835395   48    6
00Pf4000006gndREAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_43_11).txt    272.321155  49    6
00Pf4000006gndvEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_47).doc    282.806915  50    6
00Pf4000006gnDwEAI  Spinner_Dolphin_Indian_Ocean_07-2017.jpg       284.109019   51    6
00Pf4000006gndWEAQ  001f400000ZP5yYAAT_3 Oct 2018 (14_43_20).pdf    285.157595  52    6
00Pf4000006gnDXEAY  440 Kb.jpg  285.609143  53
00Pf4000006gne0EAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_59).txt    286.657719  54    6
00Pf4000006gne5EAA  001f400000ZP5yYAAT_3 Oct 2018 (14_44_2).txt    291.900599   55    6
00Pf4000006gneaEAA  001f400000ZP5yZAAT_3 Oct 2018 (14_44_59).txt    302.386359  56    7
00Pf4000006gneAEAQ  001f400000ZP5yYAAT_3 Oct 2018 (14_44_7).txt    312.872119   57    7
00Pf4000006gneeEAA  001f400000ZP5yZAAT_3 Oct 2018 (14_44_40).doc   323.357879   58    7

非常感谢您的帮助。

3 个答案:

答案 0 :(得分:0)

您可以使用整数除法:

SELECT ( CAST ( SUM(Datalength(versiondata) * 0.000001) 
                 OVER ( 
                   partition BY Datalength(sourceid) 
                   ORDER BY sourceid) AS INT) / 50 ) + 1 AS Batch 
FROM   TargetContentVersion 

以下是一个简单的示例,演示了其工作原理:

CREATE TABLE #t (id INT IDENTITY(1,1), size NUMERIC(8,6))
GO

INSERT INTO #t 
SELECT RAND() * 20
GO 20 -- Create 20 sample rows with random sizes between 0 and 20

SELECT id, SUM(size) OVER (ORDER BY id) AS RunningTotal, 
    (CAST(SUM(size) OVER (ORDER BY id) AS INT) / 50) + 1 AS Batch
FROM #t

id  RunningTotal    Batch
1   2.303367        1
2   4.049776        1
3   19.177784       1
4   28.637981       1
5   29.675840       1
6   32.781603       1
7   33.859586       1
8   36.633733       1
9   39.413363       1
10  58.004502       2
11  70.363837       2
12  82.897268       2
13  83.946657       2
14  85.623044       2
15  87.432670       2
16  103.304830      3
17  103.709745      3
18  122.165664      3
19  126.554616      3
20  128.019929      3

答案 1 :(得分:0)

我已经解决了。 下面的脚本供那些感兴趣的人使用。

 WITH cte1 AS (
SELECT SourceId, Title, DATALENGTH(VersionData) * 0.000001 AS RecordSize,
           CAST(SUM(DATALENGTH(VersionData) * 0.000001) OVER (PARTITION BY 
DATALENGTH(SourceId) ORDER BY SourceId) AS INT) AS RunningTotal,       
           RANK() OVER(ORDER BY SourceId) AS RowNo
FROM TargetContentVersion WITH(NOLOCK)
)
SELECT SourceId, Title, RecordSize, RunningTotal, 
    RowNo, SUM(RunningTotal) OVER (PARTITION BY SourceId ORDER BY SourceId) / 50 AS 
Batch
FROM cte1

答案 2 :(得分:0)

另一种选择是使用dense_rank

WITH CTE AS
(
    SELECT SourceId, Title, Description,
           SUM(DATALENGTH(VersionData) * 0.000001) OVER (PARTITION BY DATALENGTH(SourceId) ORDER BY SourceId) AS RunningTotal,
           RANK() OVER(ORDER BY SourceId) AS RowNo
    FROM TargetContentVersion WITH(NOLOCK)
)

SELECT  SourceId, Title, Description, RunningTotal, RowNo
        DENSE_RANK() OVER(PARTITION BY SourceId ORDER BY CAST(RunningTotal as int) / 50) As Batch
from #CTE

请注意将RunningTotal强制转换为整数。