My table has the following columns:
Column0 Column1
aaa abc
aaa abc
aaa xyx
aaa NA
bbb fgh
bbb NA
bbb NA
bbb NA
ccc NA
ccc NA
ccc NA
ccc NA
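For each distinct 'Column0' value, I want the 'Column1' value that occurs most often, unless that value is NA, in which case I want the next most frequent value. If every 'Column1' value for a given 'Column0' value is NA, the result can be NA.
So the expected result is: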
Column0 Column1
aaa abc
bbb fgh
ccc NA
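The rule can be stated outside SQL as a small Python sketch. This is only an illustration of the requirement, not one of the answers; the function name `most_frequent_per_group` and the `missing` parameter are made up for the example, and ties are broken by first occurrence.

```python
from collections import Counter, defaultdict

# Sample rows copied from the question: (Column0, Column1).
ROWS = [
    ("aaa", "abc"), ("aaa", "abc"), ("aaa", "xyx"), ("aaa", "NA"),
    ("bbb", "fgh"), ("bbb", "NA"), ("bbb", "NA"), ("bbb", "NA"),
    ("ccc", "NA"), ("ccc", "NA"), ("ccc", "NA"), ("ccc", "NA"),
]

def most_frequent_per_group(rows, missing="NA"):
    """Return {Column0: most frequent non-missing Column1}, or `missing` if none."""
    counts = defaultdict(Counter)
    for key, value in rows:
        counts[key]  # touch the key so all-missing groups still appear
        if value != missing:
            counts[key][value] += 1
    return {
        key: counter.most_common(1)[0][0] if counter else missing
        for key, counter in counts.items()
    }

result = most_frequent_per_group(ROWS)
print(result)
```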
Answer 0: (score: 2)
You can use two CTEs and the ranking function ROW_NUMBER:
WITH CTE1 AS
(
SELECT Column0, Column1, Cnt = COUNT(*) OVER (PARTITION BY Column0, Column1)
FROM dbo.TableName
)
, CTE2 AS
(
SELECT Column0, Column1,
RN = ROW_NUMBER() OVER (PARTITION BY Column0
ORDER BY CASE WHEN Column1 = 'NA' THEN 1 ELSE 0 END ASC
, Cnt DESC)
FROM CTE1
)
SELECT Column0, Column1
FROM CTE2
WHERE RN = 1
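To try this query without a SQL Server instance, here is a rough sketch that runs an equivalent statement against an in-memory SQLite database from Python. SQLite is only a stand-in: the `Cnt = ...` alias syntax is rewritten as `... AS Cnt`, the `dbo.` schema prefix is dropped, and an `ORDER BY Column0` is added so the output order is stable. Window functions require SQLite 3.25 or later.

```python
import sqlite3

# Sample rows copied from the question: (Column0, Column1).
ROWS = [
    ("aaa", "abc"), ("aaa", "abc"), ("aaa", "xyx"), ("aaa", "NA"),
    ("bbb", "fgh"), ("bbb", "NA"), ("bbb", "NA"), ("bbb", "NA"),
    ("ccc", "NA"), ("ccc", "NA"), ("ccc", "NA"), ("ccc", "NA"),
]

# SQLite rendering of the two-CTE query: count each (Column0, Column1) pair,
# then rank within Column0 with 'NA' sorted last and higher counts first.
QUERY = """
WITH CTE1 AS (
    SELECT Column0, Column1,
           COUNT(*) OVER (PARTITION BY Column0, Column1) AS Cnt
    FROM TableName
), CTE2 AS (
    SELECT Column0, Column1,
           ROW_NUMBER() OVER (
               PARTITION BY Column0
               ORDER BY CASE WHEN Column1 = 'NA' THEN 1 ELSE 0 END ASC,
                        Cnt DESC
           ) AS RN
    FROM CTE1
)
SELECT Column0, Column1
FROM CTE2
WHERE RN = 1
ORDER BY Column0
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (Column0 TEXT, Column1 TEXT)")
conn.executemany("INSERT INTO TableName VALUES (?, ?)", ROWS)
result = conn.execute(QUERY).fetchall()
print(result)
```

The CASE expression is what pushes 'NA' behind every real value, so 'NA' is returned only when a group contains nothing else.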
Answer 1: (score: 2)
This gives the correct result:
DECLARE @t table(Column0 char(3), Column1 varchar(3))
INSERT @t values
('aaa','abc'),('aaa','abc'),('aaa','xyx'),('aaa','NA')
,('bbb','fgh'),('bbb','NA'),('bbb','NA'),('bbb','NA')
,('ccc','NA'),('ccc','NA'),('ccc','NA'),('ccc','NA')
;WITH CTE as
(
SELECT
column0,
column1,
count(case when column1 <> 'NA' THEN 1 end) over (partition by column0, column1) cnt
FROM @t
), CTE2 as
(
SELECT
column0,
column1,
row_number() over (partition by column0 order by cnt desc) rn
FROM CTE
)
SELECT column0, column1
FROM CTE2
WHERE rn = 1
Result:
column0 column1
aaa abc
bbb fgh
ccc NA
Answer 2: (score: 2)
How about something like this?
select T1.Column0,
isnull((
select top(1) T2.Column1
from dbo.YourTable as T2
where T1.Column0 = T2.Column0 and
T2.Column1 <> 'NA'
group by T2.Column1
order by count(*) desc
), 'NA') as Column1
from dbo.YourTable as T1
group by T1.Column0
With this index:
create index IX_YourTable_Column0 on YourTable(Column0, Column1)
you get a nice query plan.
A version that handles NULL values in Column0:
select T1.Column0,
isnull((
select top(1) T2.Column1
from dbo.YourTable as T2
where exists(select T1.Column0 intersect select T2.Column0) and
T2.Column1 <> 'NA'
group by T2.Column1
order by count(*) desc
), 'NA') as Column1
from dbo.YourTable as T1
group by T1.Column0
The query plan for this version is the same as for the one above.
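The `exists(select ... intersect select ...)` predicate works as a NULL-safe equality test: a plain `=` yields NULL (not true) when either side is NULL, whereas INTERSECT treats two NULLs as matching. The sketch below shows the difference using SQLite from Python as a stand-in for SQL Server; both engines treat NULLs as equal in set operators, but the demonstration itself only exercises SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Plain equality: NULL = NULL evaluates to NULL, which a WHERE clause rejects.
plain_equal = conn.execute("SELECT NULL = NULL").fetchone()[0]

# INTERSECT-based comparison: two NULLs count as the same row, so EXISTS is true.
null_safe_equal = conn.execute(
    "SELECT EXISTS (SELECT NULL INTERSECT SELECT NULL)"
).fetchone()[0]

# A NULL compared with a real value still does not match.
null_vs_value = conn.execute(
    "SELECT EXISTS (SELECT 'aaa' INTERSECT SELECT NULL)"
).fetchone()[0]

print(plain_equal, null_safe_equal, null_vs_value)
```

This is why rows whose Column0 is NULL still find their own group in the correlated subquery.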
Answer 3: (score: 0)
You can use row_number() with aggregation:
select column0, column1
from (select column0, column1,
row_number() over (partition by column0
order by count(*) desc
) as seqnum
from [table]
group by column0, column1
) t
where seqnum = 1;
If you want to return all tied values when there is a tie, use rank() or dense_rank() instead of row_number().
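Note that ordering by count(*) alone ranks 'NA' like any other value, so on the question's sample data group bbb would come back as NA (three NA rows against one fgh row). One possible adjustment, borrowed from the CASE ordering in Answer 0 rather than taken from this answer, is to sort 'NA' last before sorting by count. The sketch below compares both orderings using SQLite from Python as a stand-in for SQL Server; the table name `t` and the helper `run` are made up for the example, and the aggregation is done in a separate CTE to keep the SQLite statement simple.

```python
import sqlite3

# Sample rows copied from the question: (Column0, Column1).
ROWS = [
    ("aaa", "abc"), ("aaa", "abc"), ("aaa", "xyx"), ("aaa", "NA"),
    ("bbb", "fgh"), ("bbb", "NA"), ("bbb", "NA"), ("bbb", "NA"),
    ("ccc", "NA"), ("ccc", "NA"), ("ccc", "NA"), ("ccc", "NA"),
]

# Aggregate first, then rank the aggregated rows within each Column0 group.
QUERY_TEMPLATE = """
WITH counted AS (
    SELECT Column0, Column1, COUNT(*) AS cnt
    FROM t
    GROUP BY Column0, Column1
), ranked AS (
    SELECT Column0, Column1,
           ROW_NUMBER() OVER (PARTITION BY Column0 ORDER BY {order_by}) AS seqnum
    FROM counted
)
SELECT Column0, Column1
FROM ranked
WHERE seqnum = 1
ORDER BY Column0
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (Column0 TEXT, Column1 TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)

def run(order_by):
    """Run the ranking query with the given ORDER BY expression list."""
    return conn.execute(QUERY_TEMPLATE.format(order_by=order_by)).fetchall()

# Ordering used in the answer: highest count wins, 'NA' included.
by_count_only = run("cnt DESC")
# Adjusted ordering: real values first, then highest count.
na_last = run("CASE WHEN Column1 = 'NA' THEN 1 ELSE 0 END, cnt DESC")

print(by_count_only)
print(na_last)
```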