Please help me with this one. I'm stuck and can't figure out how to write my query. I'm working with SQL Server 2014.
Table A (approx. 65k rows), CEID = primary key
CEID State Checksum
1 2 666
2 2 666
3 2 666
4 2 333
5 2 333
6 9 333
7 9 111
8 9 111
9 9 741
10 2 656
Desired output
CEID State Checksum
3 2 666
6 9 333
8 9 111
9 9 741
10 2 656
I want to keep the row with the highest CEID if State is the same for all rows sharing a duplicate Checksum. If State differs but Checksum is equal, I want to keep the row with the highest CEID among those with State = 9. Unique rows like CEID 9 and 10 should be included in the result regardless of State.
This join returns all duplicates:
SELECT a1.*, a2.*
FROM tableA a1
INNER JOIN tableA a2 ON a1.Checksum = a2.Checksum
AND a1.CEID <> a2.CEID
I've also identified MAX(CEID) for each duplicate checksum with this query:
SELECT a.Checksum, a.State, MAX(a.CEID) CEID_MAX ,COUNT(*) cnt
FROM tableA a
GROUP BY a.Checksum, a.State
HAVING COUNT(*) > 1
ORDER BY a.Checksum, a.State
With the first query, I can't figure out how to SELECT the row with the highest CEID per Checksum.
The problem I encounter with the last one is that GROUP BY isn't allowed in subqueries when I try to join on it.
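Answer 0 (score: 2)
You can use row_number(), partitioning by Checksum and ordering by State desc and CEID desc. Note the ORDER BY State desc, CEID desc: within each Checksum, State 9 sorts ahead of State 2, and the highest CEID comes first among rows with the same State. Then keep only the rows where row_number is 1.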
;with cte as
(
select *, rn = row_number() over (Partition by Checksum order by State desc, CEID desc)
from TableA
)
select *
from cte
where rn = 1
order by CEID;
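To try this against the sample data from the question, a self-contained sketch could look like the following. The table variable @TableA is only a stand-in for the real table, and the column names are bracketed purely as a precaution. Note one deliberate difference from the query above: ordering by State desc prefers State 9 only because 9 happens to be the highest value in the sample. The CASE expression here is an assumption about the intent (State = 9 always wins, whatever other values exist), not something the question confirms.

```sql
-- Sample rows copied from the question; @TableA is a hypothetical stand-in for the real table.
DECLARE @TableA TABLE (CEID int PRIMARY KEY, [State] int, [Checksum] int);

INSERT INTO @TableA (CEID, [State], [Checksum]) VALUES
    (1, 2, 666), (2, 2, 666), (3, 2, 666),
    (4, 2, 333), (5, 2, 333), (6, 9, 333),
    (7, 9, 111), (8, 9, 111),
    (9, 9, 741),
    (10, 2, 656);

WITH cte AS
(
    SELECT CEID, [State], [Checksum],
           rn = ROW_NUMBER() OVER (
                    PARTITION BY [Checksum]
                    -- Assumption: rows with State = 9 rank first; ties are broken by highest CEID.
                    ORDER BY CASE WHEN [State] = 9 THEN 0 ELSE 1 END, CEID DESC)
    FROM @TableA
)
SELECT CEID, [State], [Checksum]
FROM cte
WHERE rn = 1       -- one row per Checksum
ORDER BY CEID;
```

On the sample rows this keeps CEID 3, 6, 8, 9 and 10, which matches the desired output in the question.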