Select duplicates from a single row using criteria from two tables

Asked: 2013-01-26 21:24:56

Tags: sql sqlite

I have two tables:

+---------+-----------------+----------+----------+----------+---------+
| TrackId |       URI       | ArtistID |  Title   | AlbumID  | BitRate |
+---------+-----------------+----------+----------+----------+---------+
|  1      | /home/music/... |   234    | atune    |  8958223 |   192   |
|  2      | /music/uri1/... |   427    | goodsong |  222     |   192   |
|  3      | /music/uri2/... |   427    | goodsong |  222     |   128   |
|  4      | /music/uri3/... |   427    | goodsong |  222     |   160   |
|  5      | /home/music/... |   427    | goodsong |  333     |   128   |
|  6      | /home/music/... |   522    | another  |  3458859 |   128   |
+---------+-----------------+----------+----------+----------+---------+

+----------+------------+
| AlbumID  | AlbumTitle |
+----------+------------+
|  8958223 |   titleA   |
|  222     |   titleB   |
|  333     |   titleC   |
|  3458859 |   titleD   |
+----------+------------+

Simply put, what I want is:

+---------+-----------------+----------+----------+----------+------------+---------+
| TrackId |       URI       | ArtistID |  Title   | AlbumId  | AlbumTitle | BitRate |
+---------+-----------------+----------+----------+----------+------------+---------+
|  3      | /music/uri2/... |   427    | goodsong |  222     |   titleB   |   128   |
|  4      | /music/uri3/... |   427    | goodsong |  222     |   titleB   |   160   |
+---------+-----------------+----------+----------+----------+------------+---------+

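This is an attempt to remove duplicates that have:

  • the same Title
  • the same ArtistID
  • a different TrackId
  • a BitRate that is not the highest among the duplicates
  • the same album title in the album table

The duplicate with the highest bitrate should not be returned.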

I asked a very similar question here: Select duplicates from a single row?

The solution there was:

SELECT c1.*
  FROM CoreTracks c1
      ,(SELECT Title, ArtistID, MAX(FileSize) AS maxFileSize, MAX(BitRate) maxBitRate
          FROM CoreTracks
          GROUP BY Title, ArtistID) c2
  WHERE c1.Title = c2.Title
    AND c1.ArtistID = c2.ArtistID
    AND (c1.FileSize != c2.maxFileSize AND c1.BitRate != c2.maxBitRate)

...but this time I can't seem to get my head around how to bring in the other table.

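2 answers:

Answer 0: (score: 0)

The second table is not relevant to the duplicate check in your use case, because the first table already holds an FK reference to AlbumId. You only need to join it to pick up AlbumTitle: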

select t.*, a.AlbumTitle
from Track t
-- b holds the highest BitRate for each (ArtistId, Title, AlbumId) group
inner join (
    select max(BitRate) as BitRate, ArtistId, Title, AlbumId
    from Track
    group by ArtistId, Title, AlbumId ) b
  on t.ArtistId = b.ArtistId and t.Title = b.Title and t.AlbumId = b.AlbumId
     -- keep only the rows below their group's maximum
     and t.BitRate < b.BitRate
inner join Album a
  on t.AlbumId = a.AlbumId
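
To check this query against the question's sample data, here is a minimal sketch using Python's built-in sqlite3 module. The table names Track and Album come from this answer (the question does not name its tables), and the column types are assumptions.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
# Table names follow this answer; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Album (AlbumId INTEGER PRIMARY KEY, AlbumTitle TEXT);
CREATE TABLE Track (
    TrackId  INTEGER PRIMARY KEY,
    URI      TEXT,
    ArtistId INTEGER,
    Title    TEXT,
    AlbumId  INTEGER REFERENCES Album(AlbumId),
    BitRate  INTEGER
);
INSERT INTO Album VALUES
    (8958223, 'titleA'), (222, 'titleB'), (333, 'titleC'), (3458859, 'titleD');
INSERT INTO Track VALUES
    (1, '/home/music/...', 234, 'atune',    8958223, 192),
    (2, '/music/uri1/...', 427, 'goodsong', 222,     192),
    (3, '/music/uri2/...', 427, 'goodsong', 222,     128),
    (4, '/music/uri3/...', 427, 'goodsong', 222,     160),
    (5, '/home/music/...', 427, 'goodsong', 333,     128),
    (6, '/home/music/...', 522, 'another',  3458859, 128);
""")

# The answer's query, with a reduced column list and an ORDER BY
# added so the output is deterministic.
QUERY = """
SELECT t.TrackId, t.Title, a.AlbumTitle, t.BitRate
FROM Track t
INNER JOIN (
    SELECT MAX(BitRate) AS BitRate, ArtistId, Title, AlbumId
    FROM Track
    GROUP BY ArtistId, Title, AlbumId
) b ON t.ArtistId = b.ArtistId AND t.Title = b.Title
   AND t.AlbumId = b.AlbumId AND t.BitRate < b.BitRate
INNER JOIN Album a ON t.AlbumId = a.AlbumId
ORDER BY t.TrackId
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
# (3, 'goodsong', 'titleB', 128)
# (4, 'goodsong', 'titleB', 160)
```

Track 5 is not returned: it shares Title and ArtistId with tracks 2 to 4 but sits alone in album 333, so it is the maximum of its own group.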

Answer 1: (score: 0)

Try this:

SELECT c1.*
FROM CoreTracks c1
LEFT JOIN 
     (SELECT Title, ArtistID, AlbumID, MAX(BitRate) maxBitRate
      FROM CoreTracks
      GROUP BY Title, ArtistID, AlbumID) c2
    ON c1.Title = c2.Title
    AND c1.ArtistID = c2.ArtistID
    AND c1.AlbumID = c2.AlbumID
    AND c1.BitRate = c2.maxBitRate
WHERE c2.Title IS NULL

Here is the SQL Fiddle
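
The query above is an anti-join: the LEFT JOIN matches only the rows that carry their group's maximum bitrate, and WHERE c2.Title IS NULL keeps the rows that found no match. The sketch below, which assumes Python's built-in sqlite3 module and guessed column types, first prints the unfiltered join so the NULLs are visible, then applies the filter.

```python
import sqlite3

# Sample rows from the question; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CoreTracks (
    TrackId  INTEGER PRIMARY KEY,
    URI      TEXT,
    ArtistID INTEGER,
    Title    TEXT,
    AlbumID  INTEGER,
    BitRate  INTEGER
);
INSERT INTO CoreTracks VALUES
    (1, '/home/music/...', 234, 'atune',    8958223, 192),
    (2, '/music/uri1/...', 427, 'goodsong', 222,     192),
    (3, '/music/uri2/...', 427, 'goodsong', 222,     128),
    (4, '/music/uri3/...', 427, 'goodsong', 222,     160),
    (5, '/home/music/...', 427, 'goodsong', 333,     128),
    (6, '/home/music/...', 522, 'another',  3458859, 128);
""")

# Shared join: c2 has one row per (Title, ArtistID, AlbumID) group,
# and a track matches it only when its BitRate equals the group maximum.
JOIN = """
FROM CoreTracks c1
LEFT JOIN (
    SELECT Title, ArtistID, AlbumID, MAX(BitRate) AS maxBitRate
    FROM CoreTracks
    GROUP BY Title, ArtistID, AlbumID
) c2 ON c1.Title = c2.Title AND c1.ArtistID = c2.ArtistID
    AND c1.AlbumID = c2.AlbumID AND c1.BitRate = c2.maxBitRate
"""

# Step 1: the join without the filter. Non-maximum rows get NULL (None).
joined = conn.execute(
    "SELECT c1.TrackId, c1.BitRate, c2.maxBitRate" + JOIN + "ORDER BY c1.TrackId"
).fetchall()
for row in joined:
    print(row)
# (1, 192, 192)
# (2, 192, 192)
# (3, 128, None)
# (4, 160, None)
# (5, 128, 128)
# (6, 128, 128)

# Step 2: keep only the unmatched rows, as the answer's WHERE clause does.
duplicates = [
    track_id
    for (track_id,) in conn.execute(
        "SELECT c1.TrackId" + JOIN + "WHERE c2.Title IS NULL ORDER BY c1.TrackId"
    )
]
print(duplicates)  # [3, 4]
```

If two tracks in a group tie for the highest bitrate, both match the subquery row and neither is reported as a duplicate.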

- Edit

To add the album title:

SELECT c1.*, a.AlbumTitle
FROM CoreTracks c1
JOIN Albums a ON c1.AlbumId = a.AlbumId
LEFT JOIN
     (SELECT c.Title, c.ArtistID, c.AlbumID, MAX(c.BitRate) AS maxBitRate
      FROM CoreTracks c JOIN Albums a2 ON c.AlbumId = a2.AlbumId
      -- grouping columns are qualified because AlbumID exists in both joined tables
      GROUP BY c.Title, c.ArtistID, c.AlbumID) c2
    ON c1.Title = c2.Title
    AND c1.ArtistID = c2.ArtistID
    AND c1.AlbumID = c2.AlbumID
    AND c1.BitRate = c2.maxBitRate
WHERE c2.Title IS NULL

Updated fiddle
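
The question's last criterion mentions the same album name, whereas the queries in this thread group on AlbumID. If two album rows can share a title, one possible variation is to group on AlbumTitle instead. The sketch below assumes Python's built-in sqlite3 module and adds two hypothetical rows that are not in the question (album 444 and track 7) purely to make the difference visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Albums (AlbumID INTEGER PRIMARY KEY, AlbumTitle TEXT);
CREATE TABLE CoreTracks (
    TrackId INTEGER PRIMARY KEY, URI TEXT, ArtistID INTEGER,
    Title TEXT, AlbumID INTEGER, BitRate INTEGER
);
INSERT INTO Albums VALUES
    (8958223, 'titleA'), (222, 'titleB'), (333, 'titleC'), (3458859, 'titleD'),
    (444, 'titleB');  -- HYPOTHETICAL: second album row with the same title
INSERT INTO CoreTracks VALUES
    (1, '/home/music/...', 234, 'atune',    8958223, 192),
    (2, '/music/uri1/...', 427, 'goodsong', 222,     192),
    (3, '/music/uri2/...', 427, 'goodsong', 222,     128),
    (4, '/music/uri3/...', 427, 'goodsong', 222,     160),
    (5, '/home/music/...', 427, 'goodsong', 333,     128),
    (6, '/home/music/...', 522, 'another',  3458859, 128),
    (7, '/hypothetical/...', 427, 'goodsong', 444,   96);  -- HYPOTHETICAL row
""")

# Grouping on AlbumID, as in the answer above.
BY_ALBUM_ID = """
SELECT c1.TrackId, a.AlbumTitle, c1.BitRate
FROM CoreTracks c1
JOIN Albums a ON c1.AlbumID = a.AlbumID
LEFT JOIN (
    SELECT c.Title, c.ArtistID, c.AlbumID, MAX(c.BitRate) AS maxBitRate
    FROM CoreTracks c
    GROUP BY c.Title, c.ArtistID, c.AlbumID
) c2 ON c1.Title = c2.Title AND c1.ArtistID = c2.ArtistID
    AND c1.AlbumID = c2.AlbumID AND c1.BitRate = c2.maxBitRate
WHERE c2.Title IS NULL
ORDER BY c1.TrackId
"""

# Variation: grouping on AlbumTitle, so albums that share a title
# are treated as one album.
BY_ALBUM_TITLE = """
SELECT c1.TrackId, a.AlbumTitle, c1.BitRate
FROM CoreTracks c1
JOIN Albums a ON c1.AlbumID = a.AlbumID
LEFT JOIN (
    SELECT c.Title, c.ArtistID, al.AlbumTitle, MAX(c.BitRate) AS maxBitRate
    FROM CoreTracks c
    JOIN Albums al ON c.AlbumID = al.AlbumID
    GROUP BY c.Title, c.ArtistID, al.AlbumTitle
) c2 ON c1.Title = c2.Title AND c1.ArtistID = c2.ArtistID
    AND a.AlbumTitle = c2.AlbumTitle AND c1.BitRate = c2.maxBitRate
WHERE c2.Title IS NULL
ORDER BY c1.TrackId
"""

by_id = conn.execute(BY_ALBUM_ID).fetchall()
by_title = conn.execute(BY_ALBUM_TITLE).fetchall()
print(by_id)     # [(3, 'titleB', 128), (4, 'titleB', 160)]
print(by_title)  # [(3, 'titleB', 128), (4, 'titleB', 160), (7, 'titleB', 96)]
```

With the question's original rows alone, both groupings return tracks 3 and 4, so the choice only matters when album titles are not unique.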

Good luck.