Remove duplicate rows based on a specific column value

Asked: 2019-08-16 21:00:41

Tags: sql sql-server

I have a table with the following data:

+--------------------------------------+--------------------------------------+-----------+--------+
|            conversationid            |            participantid             | mediatype | rownum |
+--------------------------------------+--------------------------------------+-----------+--------+
| 01fda91b-6001-4904-b0bc-8c61aec654b4 | 13f954cb-4acb-4ab9-89e7-c48ece23a043 | callback  |      1 |
| 01fda91b-6001-4904-b0bc-8c61aec654b4 | 13f954cb-4acb-4ab9-89e7-c48ece23a043 | voice     |      2 |
| 0519386a-2b8f-403c-b65b-fd8cc3c09a32 | b033fe2f-a58c-4973-8006-54561b5a5bf7 | voice     |      1 |
| 085adea7-1deb-45d8-ae61-639255a689ce | 4151d364-5740-4dcf-b756-9772efaacb26 | voice     |      1 |
| 0c50e9c5-cbe0-4da5-9a8c-976efea255b2 | 8f1ee999-8454-4db9-8c24-68e773350f39 | callback  |      1 |
| 138da3c8-c118-4ddf-b294-57301261eb97 | cf2b643e-b07f-46c8-a52c-0cb5492e6485 | voice     |      1 |
| 613c51c6-3c8b-4b53-91de-de3cc004fa92 | 54a84468-e452-4820-9c8a-89904ff97d8d | callback  |      1 |
| 613c51c6-3c8b-4b53-91de-de3cc004fa92 | 54a84468-e452-4820-9c8a-89904ff97d8d | voice     |      2 |
+--------------------------------------+--------------------------------------+-----------+--------+  

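I am trying to remove duplicate rows based on a specific column value (mediatype).

A row with mediatype 'voice' should be excluded only when its conversationid and participantid combination has both values (voice and callback).

I tried this code: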

;with cte as(select conversationid,participantid ,
mediatype,row_number() over (partition by conversationid,participantid order by mediatype ) 
as rownum from #temp
)select * from cte where rownum=1 
order by conversationid,participantid ,mediatype

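However, the result is not based on the specific value; rows are excluded by alphabetical order instead. I need a condition so that when a duplicated combination has a row with mediatype 'voice', that row is excluded. For the other, unique rows that have no callback value, the row with mediatype 'voice' should be returned.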

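2 Answers:

Answer 0: (score: 2)

You want every row with mediatype = 'callback', plus the rows with mediatype = 'voice' whose conversationid and participantid combination does not already have a mediatype = 'callback' row.
Apply the following condition in the WHERE clause: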

select * from tablename t
where 
  mediatype = 'callback'
  or 
  not exists (
    select 1 from tablename
    where conversationid = t.conversationid and participantid = t.participantid
          and mediatype <> t.mediatype
  ) 

See the demo.
Results:

> conversationid                       | participantid                        | mediatype | rownum
> :----------------------------------- | :----------------------------------- | :-------- | -----:
> 01fda91b-6001-4904-b0bc-8c61aec654b4 | 13f954cb-4acb-4ab9-89e7-c48ece23a043 | callback  |      1
> 0519386a-2b8f-403c-b65b-fd8cc3c09a32 | b033fe2f-a58c-4973-8006-54561b5a5bf7 | voice     |      1
> 085adea7-1deb-45d8-ae61-639255a689ce | 4151d364-5740-4dcf-b756-9772efaacb26 | voice     |      1
> 0c50e9c5-cbe0-4da5-9a8c-976efea255b2 | 8f1ee999-8454-4db9-8c24-68e773350f39 | callback  |      1
> 138da3c8-c118-4ddf-b294-57301261eb97 | cf2b643e-b07f-46c8-a52c-0cb5492e6485 | voice     |      1
> 613c51c6-3c8b-4b53-91de-de3cc004fa92 | 54a84468-e452-4820-9c8a-89904ff97d8d | callback  |      1
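
The inner predicate above (mediatype <> t.mediatype) behaves as intended when 'callback' and 'voice' are the only two values. If other media types could appear, one possible variation is to test for a 'callback' row explicitly. The following is only a sketch under that assumption: src is a hypothetical stand-in for the real table, and the short ids ('c1', 'p1', ...) are placeholders rather than data from the question.

```sql
-- Sketch only: src and its shortened ids are illustrative placeholders.
with src as (
    select *
    from (values
        ('c1', 'p1', 'callback'),
        ('c1', 'p1', 'voice'),
        ('c2', 'p2', 'voice')
    ) as v(conversationid, participantid, mediatype)
)
select t.conversationid, t.participantid, t.mediatype
from src as t
where t.mediatype = 'callback'      -- callback rows are always kept
   or not exists (                  -- other rows are kept only if the pair has no callback row
        select 1
        from src as c
        where c.conversationid = t.conversationid
          and c.participantid = t.participantid
          and c.mediatype = 'callback'
   )
order by t.conversationid, t.participantid;
```

With this sample, the ('c1', 'p1') pair returns only its callback row and the ('c2', 'p2') pair returns its voice row. With exactly two media types this should match the query above; the two only differ if a third value exists.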

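Answer 1: (score: 0)

It is actually not clear to me why your original query does not work (when there is no 'callback' row, the 'voice' row sorts first), but if these are the only three columns you need in the result, here is another way to do it: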

select conversationid, participantid, min(mediatype) as mediatype
from #temp
group by conversationid, participantid;

If you do not want to rely on alphabetical order, use a CASE expression:

select conversationid, participantid,
    case min(case mediatype when 'callback' then 1 when 'voice' then 2 end)
        when 1 then 'callback' when 2 then 'voice' end as mediatype
from #temp
group by conversationid, participantid;
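
The same explicit priority could also be applied to the ROW_NUMBER() approach from the question, which keeps the door open for returning extra columns. This is a minimal sketch rather than a tested solution for the real data: src is a hypothetical stand-in for #temp, the short ids are placeholders, and ranking any unlisted media type last (else 3) is an assumption.

```sql
-- Sketch only: src stands in for #temp; ids are shortened placeholders.
with src as (
    select *
    from (values
        ('c1', 'p1', 'callback'),
        ('c1', 'p1', 'voice'),
        ('c2', 'p2', 'voice')
    ) as v(conversationid, participantid, mediatype)
),
ranked as (
    select conversationid, participantid, mediatype,
           row_number() over (
               partition by conversationid, participantid
               -- explicit priority instead of alphabetical order
               order by case mediatype
                            when 'callback' then 1
                            when 'voice'    then 2
                            else 3          -- assumption: any other type ranks last
                        end
           ) as rownum
    from src
)
select conversationid, participantid, mediatype
from ranked
where rownum = 1
order by conversationid, participantid;
```

Because the ordering no longer depends on how the strings sort, renaming a media type or adding a new one only requires changing the CASE expression.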