I have the following table:
id | col1 | col2 | col3  | col4
---+------+------+-------+-----------
 1 | abc  |   23 | data1 | otherdata1
 2 | def  |   41 | data2 | otherdata2
 3 | ghi  |   41 | data3 | otherdata3
 4 | jkl  |   58 | data4 | otherdata4
 5 | mno  |   23 | data1 | otherdata5
 6 | pqr  |   41 | data3 | otherdata6
 7 | stu  |   76 | data2 | otherdata7
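How can I quickly select the rows whose (col2, col3) combination is not duplicated anywhere else in the table? The table has more than 15 million rows, so a join may not be suitable.
The final result should look like this: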
id | col1 | col2 | col3  | col4
---+------+------+-------+-----------
 2 | def  |   41 | data2 | otherdata2
 4 | jkl  |   58 | data4 | otherdata4
 7 | stu  |   76 | data2 | otherdata7
Answer 0 (score: 4)
Not sure how fast this will be, but it should work:
select id, col1, col2, col3, col4
from (
    select id, col1, col2, col3, col4,
           count(*) over (partition by col2, col3) as cnt
    from the_table
) t
where cnt = 1
order by id;
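As a quick sanity check, the query can be run against the seven sample rows from the question. The sketch below is only an illustration. It assumes an in-memory SQLite database (3.25 or newer, for window function support) driven from Python, which is not necessarily the engine the question is about, and the helper name unique_pair_ids is made up for this example.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, "abc", 23, "data1", "otherdata1"),
    (2, "def", 41, "data2", "otherdata2"),
    (3, "ghi", 41, "data3", "otherdata3"),
    (4, "jkl", 58, "data4", "otherdata4"),
    (5, "mno", 23, "data1", "otherdata5"),
    (6, "pqr", 41, "data3", "otherdata6"),
    (7, "stu", 76, "data2", "otherdata7"),
]

# The window-function query from this answer.
QUERY = """
select id, col1, col2, col3, col4
from (
    select id, col1, col2, col3, col4,
           count(*) over (partition by col2, col3) as cnt
    from the_table
) t
where cnt = 1
order by id
"""


def unique_pair_ids():
    """Load the sample rows into SQLite (assumed engine) and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table the_table ("
        "id integer primary key, col1 text, col2 integer, col3 text, col4 text)"
    )
    conn.executemany("insert into the_table values (?, ?, ?, ?, ?)", ROWS)
    ids = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return ids


print(unique_pair_ids())  # ids of rows whose (col2, col3) pair occurs once
```

On the sample data this yields ids 2, 4 and 7, matching the expected result. It says nothing about speed on 15 million rows, which would have to be measured on the real table.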
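Answer 1 (score: 3)
Window functions are definitely one possibility. However, if you care about performance, it is worth trying another approach as well and comparing the speed.
NOT EXISTS comes to mind: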
-- "table" is a reserved word, so the_table is used as the table name here
select t.*
from the_table t
where not exists (select 1
                  from the_table t2
                  where t2.col2 = t.col2 and t2.col3 = t.col3 and
                        t2.id <> t.id
                 );
This can take advantage of an index on (col2, col3).
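To try the NOT EXISTS variant together with such an index, something along these lines could be used. This is a sketch under assumptions: it runs on an in-memory SQLite database from Python rather than on the real server, the table is called the_table, and both the index name the_table_col2_col3_idx and the helper run_not_exists are hypothetical.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, "abc", 23, "data1", "otherdata1"),
    (2, "def", 41, "data2", "otherdata2"),
    (3, "ghi", 41, "data3", "otherdata3"),
    (4, "jkl", 58, "data4", "otherdata4"),
    (5, "mno", 23, "data1", "otherdata5"),
    (6, "pqr", 41, "data3", "otherdata6"),
    (7, "stu", 76, "data2", "otherdata7"),
]

# NOT EXISTS query from this answer, reduced to the id column.
QUERY = """
select t.id
from the_table t
where not exists (select 1
                  from the_table t2
                  where t2.col2 = t.col2 and t2.col3 = t.col3 and
                        t2.id <> t.id)
order by t.id
"""


def run_not_exists(with_index=True):
    """Return (matching ids, query plan detail lines) on the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table the_table ("
        "id integer primary key, col1 text, col2 integer, col3 text, col4 text)"
    )
    conn.executemany("insert into the_table values (?, ?, ?, ?, ?)", ROWS)
    if with_index:
        # Hypothetical index name; the (col2, col3) column list is what matters.
        conn.execute(
            "create index the_table_col2_col3_idx on the_table (col2, col3)"
        )
    ids = [row[0] for row in conn.execute(QUERY)]
    # The last column of each plan row holds the human-readable detail text.
    plan = [row[-1] for row in conn.execute("explain query plan " + QUERY)]
    conn.close()
    return ids, plan


ids, plan = run_not_exists()
print(ids)
for line in plan:
    print(line)
```

The plan text differs between engines and versions, so it is printed rather than interpreted here. On the real database, the engine's own EXPLAIN output is the place to confirm whether the index is actually used.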
Answer 2 (score: 1)
Try this:
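-- Note: rnm = 1 keeps one row from every (col2, col3) group, including
-- groups that contain duplicates, so this returns more rows than the
-- result shown in the question.
select *
from (
    select id, col1, col2, col3, col4,
           row_number() over (partition by col2, col3 order by col2, col3 desc) as rnm
    from the_table
) x
where rnm = 1;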
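It may help to see what the row_number() filter actually returns on the sample data. The sketch below again assumes an in-memory SQLite database (3.25 or newer) driven from Python, uses the_table as the table name, and introduces a made-up helper first_row_per_group. It selects only col2 and col3, because which id is picked inside a duplicated group is arbitrary when the window is ordered by the partition columns themselves.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, "abc", 23, "data1", "otherdata1"),
    (2, "def", 41, "data2", "otherdata2"),
    (3, "ghi", 41, "data3", "otherdata3"),
    (4, "jkl", 58, "data4", "otherdata4"),
    (5, "mno", 23, "data1", "otherdata5"),
    (6, "pqr", 41, "data3", "otherdata6"),
    (7, "stu", 76, "data2", "otherdata7"),
]

# row_number() query from this answer, reduced to the grouping columns.
QUERY = """
select col2, col3
from (
    select id, col1, col2, col3, col4,
           row_number() over (partition by col2, col3
                              order by col2, col3 desc) as rnm
    from the_table
) x
where rnm = 1
order by col2, col3
"""


def first_row_per_group():
    """Return the (col2, col3) pairs that survive the rnm = 1 filter."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table the_table ("
        "id integer primary key, col1 text, col2 integer, col3 text, col4 text)"
    )
    conn.executemany("insert into the_table values (?, ?, ?, ?, ?)", ROWS)
    pairs = [(row[0], row[1]) for row in conn.execute(QUERY)]
    conn.close()
    return pairs


print(first_row_per_group())
```

On the sample data this produces five pairs, one per distinct (col2, col3) group, including (23, data1) and (41, data3), which the question wants excluded. In other words, it deduplicates the table rather than selecting only the non-duplicated rows.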