I have a table with the following data:
+--------------------------------------+--------------------------------------+-----------+--------+
| conversationid | participantid | mediatype | rownum |
+--------------------------------------+--------------------------------------+-----------+--------+
| 01fda91b-6001-4904-b0bc-8c61aec654b4 | 13f954cb-4acb-4ab9-89e7-c48ece23a043 | callback | 1 |
| 01fda91b-6001-4904-b0bc-8c61aec654b4 | 13f954cb-4acb-4ab9-89e7-c48ece23a043 | voice | 2 |
| 0519386a-2b8f-403c-b65b-fd8cc3c09a32 | b033fe2f-a58c-4973-8006-54561b5a5bf7 | voice | 1 |
| 085adea7-1deb-45d8-ae61-639255a689ce | 4151d364-5740-4dcf-b756-9772efaacb26 | voice | 1 |
| 0c50e9c5-cbe0-4da5-9a8c-976efea255b2 | 8f1ee999-8454-4db9-8c24-68e773350f39 | callback | 1 |
| 138da3c8-c118-4ddf-b294-57301261eb97 | cf2b643e-b07f-46c8-a52c-0cb5492e6485 | voice | 1 |
| 613c51c6-3c8b-4b53-91de-de3cc004fa92 | 54a84468-e452-4820-9c8a-89904ff97d8d | callback | 1 |
| 613c51c6-3c8b-4b53-91de-de3cc004fa92 | 54a84468-e452-4820-9c8a-89904ff97d8d | voice | 2 |
+--------------------------------------+--------------------------------------+-----------+--------+
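I am trying to remove duplicate rows based on a specific column value (mediatype).
A row with mediatype 'voice' should be excluded only when its conversationid and participantid combination has both values (voice and callback).
I tried this code: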
;with cte as (
    select conversationid, participantid, mediatype,
           row_number() over (partition by conversationid, participantid order by mediatype) as rownum
    from #temp
)
select * from cte where rownum = 1
order by conversationid, participantid, mediatype
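However, this does not pick rows based on the specific value; it excludes rows based on alphabetical order. I need a condition that excludes the row with mediatype 'voice' when a duplicate exists. For the other, unique rows that have no callback value, the row with mediatype 'voice' should be returned.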
Answer 0 (score: 2)
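You want every row with mediatype = 'callback', plus every row with mediatype = 'voice' that has no matching mediatype = 'callback' row for the same conversationid and participantid.
Apply these conditions in the WHERE clause: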
select * from tablename t
where
mediatype = 'callback'
or
not exists (
select 1 from tablename
where conversationid = t.conversationid and participantid = t.participantid
and mediatype <> t.mediatype
)
See the demo.
Result:
> conversationid | participantid | mediatype | rownum
> :----------------------------------- | :----------------------------------- | :-------- | -----:
> 01fda91b-6001-4904-b0bc-8c61aec654b4 | 13f954cb-4acb-4ab9-89e7-c48ece23a043 | callback | 1
> 0519386a-2b8f-403c-b65b-fd8cc3c09a32 | b033fe2f-a58c-4973-8006-54561b5a5bf7 | voice | 1
> 085adea7-1deb-45d8-ae61-639255a689ce | 4151d364-5740-4dcf-b756-9772efaacb26 | voice | 1
> 0c50e9c5-cbe0-4da5-9a8c-976efea255b2 | 8f1ee999-8454-4db9-8c24-68e773350f39 | callback | 1
> 138da3c8-c118-4ddf-b294-57301261eb97 | cf2b643e-b07f-46c8-a52c-0cb5492e6485 | voice | 1
> 613c51c6-3c8b-4b53-91de-de3cc004fa92 | 54a84468-e452-4820-9c8a-89904ff97d8d | callback | 1
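To check the filter without a SQL Server instance, here is a minimal sketch that runs the same WHERE clause against an in-memory SQLite database through Python's sqlite3 module. SQLite standing in for SQL Server is an assumption, the IDs are abbreviated to their first eight characters, and the rownum column is left out because the filter does not use it.

```python
import sqlite3

# Sample rows from the question; IDs abbreviated to their first 8 characters.
ROWS = [
    ("01fda91b", "13f954cb", "callback"),
    ("01fda91b", "13f954cb", "voice"),
    ("0519386a", "b033fe2f", "voice"),
    ("085adea7", "4151d364", "voice"),
    ("0c50e9c5", "8f1ee999", "callback"),
    ("138da3c8", "cf2b643e", "voice"),
    ("613c51c6", "54a84468", "callback"),
    ("613c51c6", "54a84468", "voice"),
]

# Same predicate as the answer: keep every callback row, and keep a voice row
# only when no row with a different mediatype exists for the same pair.
QUERY = """
    select conversationid, participantid, mediatype
    from tablename t
    where mediatype = 'callback'
       or not exists (
           select 1 from tablename
           where conversationid = t.conversationid
             and participantid = t.participantid
             and mediatype <> t.mediatype
       )
    order by conversationid, participantid
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tablename (conversationid text, participantid text, mediatype text)"
)
conn.executemany("insert into tablename values (?, ?, ?)", ROWS)
RESULT = conn.execute(QUERY).fetchall()
conn.close()

for row in RESULT:
    print(row)
```

Note that the `mediatype <> t.mediatype` test relies on 'callback' and 'voice' being the only two values; with a third media type, a voice row would also be dropped when only that third type is present.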
Answer 1 (score: 0)
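It is actually not clear to me why your original query does not work (in the absence of a 'callback' row, the 'voice' row will be ranked first), but when those three columns are the only ones you need in the result, here is another way to do it: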
select conversationid, participantid, min(mediatype) as mediatype
from #temp
group by conversationid, participantid;
If you do not want to rely on alphabetical order, use a CASE expression:
select conversationid, participantid,
case min(case mediatype when 'callback' then 1 when 'voice' then 2 end)
when 1 then 'callback' when 2 then 'voice' end as mediatype
from #temp
group by conversationid, participantid;
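As a rough check that the two GROUP BY variants agree on the sample data, the sketch below runs both against an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption here (the question uses SQL Server), `temp_rows` is a hypothetical stand-in for `#temp`, and the IDs are abbreviated to their first eight characters.

```python
import sqlite3

# Sample rows from the question; IDs abbreviated to their first 8 characters.
ROWS = [
    ("01fda91b", "13f954cb", "callback"),
    ("01fda91b", "13f954cb", "voice"),
    ("0519386a", "b033fe2f", "voice"),
    ("085adea7", "4151d364", "voice"),
    ("0c50e9c5", "8f1ee999", "callback"),
    ("138da3c8", "cf2b643e", "voice"),
    ("613c51c6", "54a84468", "callback"),
    ("613c51c6", "54a84468", "voice"),
]

conn = sqlite3.connect(":memory:")
# temp_rows is a hypothetical name standing in for the #temp table.
conn.execute(
    "create table temp_rows (conversationid text, participantid text, mediatype text)"
)
conn.executemany("insert into temp_rows values (?, ?, ?)", ROWS)

# Variant 1: relies on 'callback' sorting before 'voice' alphabetically.
BY_MIN = conn.execute("""
    select conversationid, participantid, min(mediatype) as mediatype
    from temp_rows
    group by conversationid, participantid
    order by conversationid, participantid
""").fetchall()

# Variant 2: explicit priority (callback = 1, voice = 2), mapped back to text.
BY_CASE = conn.execute("""
    select conversationid, participantid,
           case min(case mediatype when 'callback' then 1 when 'voice' then 2 end)
                when 1 then 'callback' when 2 then 'voice' end as mediatype
    from temp_rows
    group by conversationid, participantid
    order by conversationid, participantid
""").fetchall()
conn.close()

for row in BY_CASE:
    print(row)
```

The CASE variant makes the priority explicit, so it would keep working if the values were renamed in a way that changed their alphabetical order.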