Get, for each user, the row whose column value has the highest count

Time: 2015-06-26 10:40:26

Tags: sql sql-server

My column structure:

Column0   Column1
aaa        abc
aaa        abc
aaa        xyx
aaa        NA
bbb        fgh
bbb        NA
bbb        NA
bbb        NA
ccc        NA
ccc        NA
ccc        NA
ccc        NA

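What I want is, for each distinct 'Column0' value, the 'Column1' value with the highest count, unless that value is NA, in which case I want the value with the second-highest count. If all 'Column1' values for a given 'Column0' value are NA, the result can be NA.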

So the expected result is:

Column0   Column1
aaa       abc
bbb       fgh
ccc       NA

4 answers:

Answer 0 (score: 2)

You can use two CTEs and the ranking function ROW_NUMBER:

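-- CTE1: attach to every row the number of times its (Column0, Column1) pair occurs
WITH CTE1 AS
(
    SELECT Column0, Column1, Cnt = COUNT(*) OVER (PARTITION BY Column0, Column1)
    FROM dbo.TableName
)
-- CTE2: rank rows within each Column0, non-NA values first, then by count descending
, CTE2 AS
(
    SELECT Column0, Column1,
           RN = ROW_NUMBER() OVER (PARTITION BY Column0
                                   ORDER BY CASE WHEN Column1 = 'NA' THEN 1 ELSE 0 END ASC
                                          , Cnt DESC)
    FROM CTE1
)
-- Keep only the top-ranked row per Column0
SELECT Column0, Column1
FROM CTE2
WHERE RN = 1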
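
ROW_NUMBER numbers tied rows in an unspecified order, so if two non-NA values of Column1 occur equally often for the same Column0, either one may come back. A minimal sketch, assuming an alphabetical tie-break is acceptable (that choice is not part of the original answer), adds Column1 as a last sort key. The table variable @t stands in for dbo.TableName and is illustrative only.

```sql
-- Sample data from the question, held in a table variable (the name @t is illustrative)
DECLARE @t TABLE (Column0 char(3), Column1 varchar(3));
INSERT INTO @t VALUES
 ('aaa','abc'),('aaa','abc'),('aaa','xyx'),('aaa','NA')
,('bbb','fgh'),('bbb','NA'),('bbb','NA'),('bbb','NA')
,('ccc','NA'),('ccc','NA'),('ccc','NA'),('ccc','NA');

WITH CTE1 AS
(
    SELECT Column0, Column1, Cnt = COUNT(*) OVER (PARTITION BY Column0, Column1)
    FROM @t
)
, CTE2 AS
(
    SELECT Column0, Column1,
           RN = ROW_NUMBER() OVER (PARTITION BY Column0
                                   ORDER BY CASE WHEN Column1 = 'NA' THEN 1 ELSE 0 END ASC
                                          , Cnt DESC
                                          , Column1 ASC)  -- assumed tie-breaker: alphabetical
    FROM CTE1
)
SELECT Column0, Column1
FROM CTE2
WHERE RN = 1;
```

With the sample data no ties occur among non-NA values, so the result is unchanged; the extra key only matters when counts are equal.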


Answer 1 (score: 2)

This will give the correct result:

DECLARE @t table(Column0  char(3), Column1 varchar(3))
INSERT @t values
('aaa','abc'),('aaa','abc'),('aaa','xyx'),('aaa','NA')
,('bbb','fgh'),('bbb','NA'),('bbb','NA'),('bbb','NA')
,('ccc','NA'),('ccc','NA'),('ccc','NA'),('ccc','NA')

;WITH CTE as
(
  SELECT
    column0, 
    column1, 
    -- CASE yields NULL for 'NA' rows, which COUNT ignores, so every 'NA' group gets cnt = 0
    count(case when column1 <> 'NA' THEN 1 end) over (partition by column0, column1) cnt
  FROM @t
), CTE2 as
(
  SELECT 
    column0, 
    column1, 
    row_number() over (partition by column0 order by cnt desc) rn
  FROM CTE
)
SELECT column0, column1
FROM CTE2
WHERE rn = 1

Result:

column0  column1
aaa      abc
bbb      fgh
ccc      NA
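
To see why this works, it can help to look at the intermediate cnt values. The sketch below reuses the same sample data and lists each distinct (column0, column1) pair with its cnt. The CASE expression returns NULL for 'NA' rows and COUNT skips NULLs, so 'NA' always ends up with 0 and can only be picked when no other value exists for that column0.

```sql
DECLARE @t table(Column0 char(3), Column1 varchar(3));
INSERT @t values
('aaa','abc'),('aaa','abc'),('aaa','xyx'),('aaa','NA')
,('bbb','fgh'),('bbb','NA'),('bbb','NA'),('bbb','NA')
,('ccc','NA'),('ccc','NA'),('ccc','NA'),('ccc','NA');

WITH CTE as
(
  SELECT
    column0,
    column1,
    count(case when column1 <> 'NA' THEN 1 end) over (partition by column0, column1) cnt
  FROM @t
)
SELECT DISTINCT column0, column1, cnt
FROM CTE
ORDER BY column0, cnt DESC;
-- column0  column1  cnt
-- aaa      abc      2
-- aaa      xyx      1
-- aaa      NA       0
-- bbb      fgh      1
-- bbb      NA       0
-- ccc      NA       0
```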

Answer 2 (score: 2)

How about something like this?

select T1.Column0,
       isnull((
              select top(1) T2.Column1
              from dbo.YourTable as T2
              where T1.Column0 = T2.Column0 and
                    T2.Column1 <> 'NA'
              group by T2.Column1
              order by count(*) desc
              ), 'NA') as Column1
from dbo.YourTable as T1
group by T1.Column0


With an index

create index IX_YourTable_Column0 on YourTable(Column0, Column1)

you get a nice query plan.


A version that handles NULL values in Column0:

select T1.Column0,
       isnull((
              select top(1) T2.Column1
              from dbo.YourTable as T2
              where exists(select T1.Column0 intersect select T2.Column0) and
                    T2.Column1 <> 'NA'
              group by T2.Column1
              order by count(*) desc
              ), 'NA') as Column1
from dbo.YourTable as T1
group by T1.Column0

The query plan for this version is the same as for the one above.
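
The reason the INTERSECT form copes with NULLs is that INTERSECT treats two NULLs as the same value, whereas the = operator evaluates to UNKNOWN when either side is NULL. A small sketch of the difference, assuming the default ANSI_NULLS ON setting (the variable names @a and @b are illustrative):

```sql
-- Two NULL values of the same type as Column0
DECLARE @a char(3) = NULL, @b char(3) = NULL;

SELECT
    CASE WHEN @a = @b
         THEN 'match' ELSE 'no match' END AS with_equals,
    CASE WHEN EXISTS (SELECT @a INTERSECT SELECT @b)
         THEN 'match' ELSE 'no match' END AS with_intersect;
-- with_equals  with_intersect
-- no match     match
```

So with the plain equality join, a group whose Column0 is NULL never finds its own rows in the subquery and falls back to 'NA', while the INTERSECT version matches them.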

Answer 3 (score: 0)

You can use row_number() with aggregation:

select column0, column1
from (select column0, column1,
             row_number() over (partition by column0
                                order by count(*) desc
                               ) as seqnum
      from [table]
      group by column0, column1
     ) t
where seqnum = 1;

If you want to allow duplicates in the case of ties, use rank() or dense_rank() instead of row_number().
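
As written, the query above ranks 'NA' like any other value, so for the question's sample data it would return NA for bbb (three NA rows against one fgh). A minimal sketch of one way to adapt it, borrowing the CASE ordering used in Answer 0 so that non-NA values sort first; the table variable @t is illustrative and stands in for [table]:

```sql
-- Sample data from the question (the name @t is illustrative)
declare @t table (column0 char(3), column1 varchar(3));
insert into @t values
 ('aaa','abc'),('aaa','abc'),('aaa','xyx'),('aaa','NA')
,('bbb','fgh'),('bbb','NA'),('bbb','NA'),('bbb','NA')
,('ccc','NA'),('ccc','NA'),('ccc','NA'),('ccc','NA');

select column0, column1
from (select column0, column1,
             row_number() over (partition by column0
                                -- non-NA groups first, then the most frequent group
                                order by case when column1 = 'NA' then 1 else 0 end,
                                         count(*) desc
                               ) as seqnum
      from @t
      group by column0, column1
     ) t
where seqnum = 1;
-- column0  column1
-- aaa      abc
-- bbb      fgh
-- ccc      NA
```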