SQL Group By Customized Categories (Number)

Time: 2017-03-08 13:51:42

Tags: sql sql-server reporting-services

I am having trouble writing a query.

I have a table containing files and their sizes in bytes. It looks like this:

FileUrl | FileSize
------------------
xyz.docx | 2794496
qwe.ppt | 655360
asd.pdf | 1388782
...
...

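For size groups that I will define myself, I want to find the number of files, the percentage of the total file count, and the percentage of the total file size. So it should look like this:

Size Category | Number of Files | % of Total File Count | % of Total File Size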
------------------------------------------------------------------------------
0-1 MB        | 235             | 80%                   | 20%
1-10 MB       | 57              | 20%                   | 80%
10-50 MB
...
...

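What is the best way to create such groups and then find these percentages? I could not come up with a solution, and my online searches did not help at all.

Thanks in advance.

7 answers:

Answer 0: (score: 2)

Use CASE and a CTE: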

with CTE as
(
select case 
           when FileSize < 1024 * 1024 then '0 - 1 MB'   -- FileSize is stored in bytes
           when ...
           else '50MB+'
       end as FileGroup,
       Filedata.*
from Filedata
)
select FileGroup, 
       count(*) as NumberOfFiles, 
       -- multiply by 1.0 so the ratios are not truncated by integer division
       1.0 * count(*) / (select count(*) from CTE) as PCTotalCount, 
       sum(1.0 * FileSize) / (select sum(1.0 * FileSize) from CTE) as PCTotalSize
from CTE
group by FileGroup
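
For reference, here is a minimal self-contained sketch of the same CASE-plus-CTE idea that can be pasted into a query window. The table variable @Files, the column aliases, and the 1 MiB / 10 MiB / 50 MiB boundaries are assumptions chosen for illustration, not something the question specifies. It also swaps the scalar subqueries for windowed totals, which is a stylistic choice rather than a requirement.

```sql
-- Sample rows copied from the question; @Files is a hypothetical table variable.
DECLARE @Files TABLE (FileUrl varchar(100), FileSize bigint);  -- FileSize in bytes
INSERT INTO @Files VALUES
    ('xyz.docx', 2794496),
    ('qwe.ppt',   655360),
    ('asd.pdf',  1388782);

WITH Categorized AS (
    SELECT CASE   -- branches are evaluated top to bottom, so only upper bounds are needed
               WHEN FileSize < 1048576  THEN '0-1 MB'    -- assumed boundary: 1 MiB
               WHEN FileSize < 10485760 THEN '1-10 MB'   -- assumed boundary: 10 MiB
               WHEN FileSize < 52428800 THEN '10-50 MB'  -- assumed boundary: 50 MiB
               ELSE '50 MB+'
           END AS SizeCategory,
           FileSize
    FROM @Files
)
SELECT SizeCategory,
       COUNT(*) AS NumberOfFiles,
       -- 100.0 forces decimal arithmetic; SUM(...) OVER () is the total across all groups
       100.0 * COUNT(*) / SUM(COUNT(*)) OVER () AS PctOfFileCount,
       100.0 * SUM(FileSize) / SUM(SUM(FileSize)) OVER () AS PctOfFileSize
FROM Categorized
GROUP BY SizeCategory
ORDER BY MIN(FileSize);
```

With the three sample rows, one file falls into '0-1 MB' and two fall into '1-10 MB'.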

Answer 1: (score: 2)

Here is one way, using apply and window functions:

select v.sizecategory, count(*) as numfiles,
       (count(*) / sum(1.0 * count(*)) over ()) as ratio_files,
       (sum(filesize * 1.0) / sum(sum(filesize * 1.0)) over ()) as ratio_sizes
from t outer apply
     (values (case when t.filesize < 1000000 then '0-1 MByte'
                   when t.filesize < 10000000 then '1-10 MByte'
                   when t.filesize < 50000000 then '10-50 MByte'
                   . . .
              end)
     ) v(sizecategory) 
group by v.sizecategory
order by min(t.filesize);

Answer 2: (score: 1)

Or you can do it like this:

SELECT
    c.SizeCategory,
    COUNT(*) AS [Number of Files],
    1.0 * COUNT(*) / SUM(COUNT(*)) OVER () AS [% of Total File Count],
    SUM(1.0 * f.FileSize) / SUM(SUM(1.0 * f.FileSize)) OVER () AS [% of Total File Size]
FROM YourTable AS f   -- placeholder table name
CROSS APPLY (
    SELECT CASE   -- FileSize is in bytes; branches are evaluated in order
               WHEN f.FileSize < 1048576  THEN '0-1 MB'
               WHEN f.FileSize < 10485760 THEN '1-10 MB'
               WHEN f.FileSize < 52428800 THEN '10-50 MB'
               -- continue with this...
           END AS SizeCategory
) AS c   -- SQL Server cannot group by a SELECT alias, so the CASE lives in an APPLY
GROUP BY c.SizeCategory

Answer 3: (score: 1)

What you want is a histogram. This may help:

-- Convert bytes to MB, then round down to the start of each 10 MB bucket
SELECT FLOOR(filesize / 1000000.0 / 10) * 10 AS SizeCategory, 
       COUNT(*) AS [NumFiles]
FROM TableName
GROUP BY FLOOR(filesize / 1000000.0 / 10) * 10 
ORDER BY 1

This gives you equal-width intervals.

Answer 4: (score: 1)

Create a groups table with ranges like the following:

RangeLow | RangeHigh | Number of Files | PercentCount | % of Total
------------------------------------------------------------------
0        | 1024000   | 235             | 80%          | 20%
1024000  | 2048000   | 57              | 20%          | 80%

Then your query could look like this:

Select FileUrl, FileSize, (Select PercentCount  From GroupTable Where FileSize >= RangeLow AND FileSize < RangeHigh )  From FilesTable 
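
As a rough sketch of how a range table could also drive the counts and percentages directly, rather than storing them, the query below joins the files to the ranges and aggregates. The names @SizeGroups, Label, and @Files, as well as the decimal-megabyte boundaries, are hypothetical and only illustrate the idea.

```sql
-- Hypothetical range table; labels and boundaries are illustrative only.
DECLARE @SizeGroups TABLE (Label varchar(20), RangeLow bigint, RangeHigh bigint);
INSERT INTO @SizeGroups VALUES
    ('0-1 MB',          0,  1000000),
    ('1-10 MB',   1000000, 10000000),
    ('10-50 MB', 10000000, 50000000);

-- Sample rows copied from the question.
DECLARE @Files TABLE (FileUrl varchar(100), FileSize bigint);
INSERT INTO @Files VALUES
    ('xyz.docx', 2794496),
    ('qwe.ppt',   655360),
    ('asd.pdf',  1388782);

SELECT g.Label,
       COUNT(f.FileUrl) AS NumberOfFiles,   -- 0 for ranges that contain no files
       -- NULLIF guards against division by zero when no file matches any range
       100.0 * COUNT(f.FileUrl)
             / NULLIF(SUM(COUNT(f.FileUrl)) OVER (), 0) AS PctOfFileCount,
       100.0 * COALESCE(SUM(f.FileSize), 0)
             / NULLIF(SUM(SUM(f.FileSize)) OVER (), 0) AS PctOfFileSize
FROM @SizeGroups AS g
LEFT JOIN @Files AS f
       ON f.FileSize >= g.RangeLow AND f.FileSize < g.RangeHigh
GROUP BY g.Label, g.RangeLow
ORDER BY g.RangeLow;
```

Files larger than the highest RangeHigh match no row and are left out of the totals in this sketch, so the last range should be wide enough to cover every file.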

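Answer 5: (score: 1)

If you cannot change the dataset SQL, you can classify the file sizes with an expression in the SSRS report. Use an expression like the following in the row header cell and in the row group's Group on expression: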

=Switch(Fields!FILESIZE.Value < 1048576, "0-1 MB",
Fields!FILESIZE.Value < 10485760, "1-10 MB",
Fields!FILESIZE.Value < 52428800, "10-50 MB",
True, "50+ MB")

Then you can also calculate the totals in the report. Use the second argument of the Count() and Sum() functions to define the scope:

=Count(Fields!FILESIZE.Value)
=Count(Fields!FILESIZE.Value, "RowGroup") / Count(Fields!FILESIZE.Value, "DataSet1")
=Sum(Fields!FILESIZE.Value, "RowGroup") / Sum(Fields!FILESIZE.Value, "DataSet1")

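Answer 6: (score: 1)

There are many perfectly valid answers here. However, I prefer to maintain a generic Tier Table, which can serve multiple masters, moves the logic out of the code, and offers more flexibility.

Example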

Declare @Tier table (Tier_Grp varchar(50),Tier_Seq int,Tier_Dsc varchar(100),Tier_R1 float,Tier_R2 float)
Insert Into @Tier values
('File Size',1  ,'0-1 MB'   ,0    ,1e+6 ),
('File Size',2  ,'1-10 MB'  ,1e+6 ,1e+7 ),
('File Size',3  ,'10-50 MB' ,1e+7 ,5e+7 ),
('File Size',4  ,'50-100 MB',5e+7 ,1e+8 ),
('File Size',5  ,'100 MB +' ,1e+8 ,9e+11 ),
('File Size',999,'Total'    ,0    ,9e+12 )  --< Optional

Declare @YourTable table (FileUrl varchar(50),FileSize int)
Insert Into @YourTable values
('xyz.docx',2794496),
('qwe.ppt',655360),
('asd.pdf',1388782)

;with cte as (
    Select A.* 
          ,Cnt   = count(FileSize)+0.0
          ,Total = isnull(sum(cast(FileSize as float)),0)
     From  @Tier A
     Left Join  @YourTable B on Tier_Grp='File Size' and B.FileSize >=Tier_R1 and B.FileSize<Tier_R2
     Group By Tier_Grp
             ,Tier_Seq
             ,Tier_Dsc
             ,Tier_R1
             ,Tier_R2
)
Select Tier_Grp
      ,Tier_Dsc
      ,[Number of Files]  = cast(Cnt as int)
      ,[Number of Bytes]  = Total 
      ,[Percent of Files] = format(Cnt/max(Cnt) over (),'0.0%')
      ,[Percent of Size]  = format(Total/max(Total) over (),'0.0%')
 From cte
 Order by Tier_Seq

Returns

[Image: query result]