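I'm using SQL Server 2016 and am having trouble grouping by multiple columns and finding an average while omitting duplicate rows. I have a transaction table defined as: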
CREATE TABLE [dbo].[CUST_TRANSACTION](
[EXTRACT_DATE] [date] NULL,
[CUSTOMER_ID] [bigint] NULL,
[TRANS_NUMBER] [bigint] NULL,
[CATEGORY] [smallint] NULL,
[RANKING] [smallint] NULL )
Here is some sample data:
EXTRACT_DATE CUSTOMER_ID TRANS_NUMBER CATEGORY RANKING
10/31/2017 10001 1000101 4 100
10/31/2017 10001 1000102 4 100
10/31/2017 10002 1000201 4 200
10/31/2017 10001 1000103 5 100
10/31/2017 10003 1000301 5 300
10/31/2017 10003 1000302 5 300
10/31/2017 10004 1000401 7 500
10/31/2017 10001 1000104 8 100
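The combination of CUSTOMER_ID and TRANS_NUMBER is always unique, but a CUSTOMER_ID can have one to many TRANS_NUMBERs, and a CUSTOMER_ID can appear in one to many categories. From the data I have looked at, the RANKING for a CUSTOMER_ID appears to be the same for a given EXTRACT_DATE. I couldn't find any NULLs in RANKING, but I did find zeros, so I need to exclude zeros from the average.
The request is to produce a report broken down by category (1 - 15) that finds the average ranking in each category, counting each CUSTOMER_ID only once, along with the maximum ranking for that category. This is for a given EXTRACT_DATE.
So I ran the following: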
Select CATEGORY, MAX(RANKING) "Max Ranking", AVG(RANKING) "Average Ranking"
from CUST_TRANSACTION
where EXTRACT_DATE = Convert(datetime, '2017-10-31' )
and RANKING > 1
group by CATEGORY
order by CATEGORY
which produces this output:
CATEGORY Max Ranking Average Ranking
4 200 133
5 300 233
7 500 500
8 100 100
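But the average for category 4 should be 150, since CUSTOMER_ID 10001 has two entries there (the query averages (100 + 100 + 200) / 3 = 133 instead of (100 + 200) / 2 = 150), and category 5 should be 200, since CUSTOMER_ID 10003 has two entries.
When I try grouping by CATEGORY, CUSTOMER_ID, the output has a row for every combination of category and customer, which is of course what GROUP BY is supposed to do. So I'm not sure whether I need a subselect, or does anyone have other ideas?
Thanks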
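Answer 0 (score: 1)
It looks like you don't care about the TRANS_NUMBER mapping, so you can drop it and select the distinct remaining values in a derived table: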
Select CATEGORY, MAX(RANKING) "Max Ranking", AVG(RANKING) "Average Ranking"
from ( select distinct [EXTRACT_DATE] ,
[CUSTOMER_ID] ,
[CATEGORY] ,
[RANKING] from CUST_TRANSACTION )CUST_TRANSACTION
where EXTRACT_DATE = Convert(datetime, '2017-10-31' )
and RANKING > 1
group by CATEGORY
order by CATEGORY
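To check the DISTINCT approach against the sample rows from the question, here is a minimal self-contained sketch. The table variable `@t` and the alias `d` are made up for illustration, and it filters on `RANKING > 0` on the assumption that only zeros need to be excluded:

```sql
-- Hypothetical table variable standing in for CUST_TRANSACTION.
DECLARE @t TABLE (
    EXTRACT_DATE date, CUSTOMER_ID bigint, TRANS_NUMBER bigint,
    CATEGORY smallint, RANKING smallint
);

-- The eight sample rows from the question.
INSERT INTO @t VALUES
    ('2017-10-31', 10001, 1000101, 4, 100),
    ('2017-10-31', 10001, 1000102, 4, 100),
    ('2017-10-31', 10002, 1000201, 4, 200),
    ('2017-10-31', 10001, 1000103, 5, 100),
    ('2017-10-31', 10003, 1000301, 5, 300),
    ('2017-10-31', 10003, 1000302, 5, 300),
    ('2017-10-31', 10004, 1000401, 7, 500),
    ('2017-10-31', 10001, 1000104, 8, 100);

SELECT CATEGORY,
       MAX(RANKING) AS [Max Ranking],
       AVG(RANKING) AS [Average Ranking]
FROM (
    -- Dropping TRANS_NUMBER lets DISTINCT collapse each customer
    -- to one row per category (assuming one ranking per customer per date).
    SELECT DISTINCT EXTRACT_DATE, CUSTOMER_ID, CATEGORY, RANKING
    FROM @t
    WHERE EXTRACT_DATE = '2017-10-31'
      AND RANKING > 0
) AS d
GROUP BY CATEGORY
ORDER BY CATEGORY;
-- With these rows: (4, 200, 150), (5, 300, 200), (7, 500, 500), (8, 100, 100)
```

Note that this only deduplicates correctly if a customer really has a single RANKING per EXTRACT_DATE; if that ever stops holding, the customer would be counted once per distinct ranking.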
Answer 1 (score: 0)
You can use a common table expression (CTE) to filter out duplicate customer IDs within a category. Something like this:
;with cte as (
select CATEGORY, RANKING, EXTRACT_DATE,
ROW_NUMBER() over(partition by category, customer_id order by customer_id) rn
from CUST_TRANSACTION
)
Select CATEGORY, MAX(RANKING) "Max Ranking", AVG(RANKING) "Average Ranking"
from cte --CUST_TRANSACTION
where EXTRACT_DATE = Convert(datetime, '2017-10-31' )
and RANKING > 1
and rn = 1
group by CATEGORY
order by CATEGORY
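One thing to watch, assuming the table holds more than one EXTRACT_DATE: if rows are numbered across all dates and the date filter is applied afterwards, a customer whose rn = 1 row belongs to a different date is dropped from the report. A sketch of a variant that filters before numbering might look like this (the `@QUERY_DATE` variable and the `per_customer` name are illustrative, and `RANKING > 0` assumes only zeros should be excluded):

```sql
DECLARE @QUERY_DATE date = '2017-10-31';

WITH per_customer AS (
    SELECT CATEGORY,
           RANKING,
           -- Number each customer's rows within a category; TRANS_NUMBER
           -- gives a deterministic order, though any row would do if the
           -- ranking is the same for all of a customer's rows on this date.
           ROW_NUMBER() OVER (PARTITION BY CATEGORY, CUSTOMER_ID
                              ORDER BY TRANS_NUMBER) AS rn
    FROM dbo.CUST_TRANSACTION
    WHERE EXTRACT_DATE = @QUERY_DATE   -- filter first, then number
      AND RANKING > 0
)
SELECT CATEGORY,
       MAX(RANKING) AS [Max Ranking],
       AVG(RANKING) AS [Average Ranking]
FROM per_customer
WHERE rn = 1                            -- keep one row per customer per category
GROUP BY CATEGORY
ORDER BY CATEGORY;
```

Adding EXTRACT_DATE to the PARTITION BY list instead would also work and keeps the date filter in the outer query.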
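Answer 2 (score: 0)
Because the overall average and the overall max have different requirements, you can't use a single column to get both. A subselect that collapses the data to one row per customer per category can provide one column to feed the average and another to feed the max.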
DECLARE @QUERY_DATE DATE = '2017-10-31';
Select
CATEGORY
, MAX(RANKING_detail_max) "Max Ranking"
, AVG(RANKING_detail_avg) "Average Ranking"
from (
-- one row per customer per category
select CATEGORY
, CUSTOMER_ID
-- AVG rather than SUM: SUM would count a customer's ranking once per transaction
, AVG(RANKING) RANKING_detail_avg
, MAX(RANKING) RANKING_detail_max
from CUST_TRANSACTION
where EXTRACT_DATE = @QUERY_DATE
and RANKING > 0
group by CATEGORY, CUSTOMER_ID
) per_customer
group by CATEGORY
order by CATEGORY
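A side note on the averages: RANKING is a smallint, and in SQL Server AVG over an integer type returns an int, so the fractional part is truncated (that is why the original query shows 133 rather than 133.33). If fractional averages matter for the report, casting before averaging is one option. A small sketch using inline values, where the alias `v(r)` is just for illustration:

```sql
-- Three smallint values: the category 4 rankings before deduplication.
SELECT AVG(r)                           AS int_avg,  -- integer average, truncated
       AVG(CAST(r AS decimal(10, 2)))   AS dec_avg   -- decimal average keeps the fraction
FROM (VALUES (CAST(100 AS smallint)),
             (CAST(100 AS smallint)),
             (CAST(200 AS smallint))) AS v(r);
-- int_avg is 133; dec_avg is about 133.33
```

With the sample data the deduplicated averages happen to be whole numbers, so this only shows up once the real data is involved.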