Merging rows with similar IDs in SQL?

Time: 2011-10-09 14:51:41

Tags: mysql sql subquery alias

I have an interesting dilemma at the moment. My database schema looks something like this:

GameList:
+-------+----------+-----------+------------+--------------------------------+
|  id   | steam_id | origin_id | impulse_id |           game_title           |
+-------+----------+-----------+------------+--------------------------------+
|   1   |   17450  |   NULL    |    NULL    |      Dragon Age: Origins       |
|   2   |   NULL   | 138994900 |    NULL    |    Dragon Age(TM): Origins     |
|   3   |   NULL   |   NULL    |  dragonage |      Dragon Age Origins        |
|   4   |   47850  | 201841300 |  fifamgr11 |        FIFA Manager 11         |
|  ...  |   ...    |    ...    |     ...    |              ...               |
+-------+----------+-----------+------------+--------------------------------+

GameAlias:
+----------+-----------+
|  old_id  |  new_id   |
+----------+-----------+
|    2     |     1     |
|    3     |     1     |
|   ...    |    ...    |
+----------+-----------+

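Depending on whether the stores use the same title for a game, there may be no problem, or there may be several rows for the same game. The GameAlias table exists to resolve this by declaring that id 2 and id 3 are just aliases for id 1.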

What I need is a SQL query that takes the GameList table and the GameAlias table and returns the following:

ConglomerateGameList:
+-------+----------+-----------+------------+--------------------------------+
|  id   | steam_id | origin_id | impulse_id |           game_title           |
+-------+----------+-----------+------------+--------------------------------+
|   1   |   17450  | 138994900 |  dragonage |      Dragon Age: Origins       |
|   4   |   47850  | 201841300 |  fifamgr11 |        FIFA Manager 11         |
|  ...  |   ...    |    ...    |     ...    |              ...               |
+-------+----------+-----------+------------+--------------------------------+

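Note that I want the game title of the "new id". The game titles of any "old ids" should be discarded/ignored.

I should also note that I cannot modify the GameList table to solve this. If I simply rewrote the table to look like my desired output, then every night, when I pull the updated game lists from the stores, the import would fail to find those games in the database and would generate new rows like this: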

+-------+----------+-----------+------------+--------------------------------+
|  id   | steam_id | origin_id | impulse_id |           game_title           |
+-------+----------+-----------+------------+--------------------------------+
|   1   |   17450  | 138994900 |  dragonage |      Dragon Age: Origins       |
|   4   |   47850  | 201841300 |  fifamgr11 |        FIFA Manager 11         |
|  ...  |   ...    |    ...    |     ...    |              ...               |
|  8139 |   NULL   | 138994900 |    NULL    |     Dragon Age(TM): Origins    |
|  8140 |   NULL   |    NULL   |  dragonage |      Dragon Age Origins        |
+-------+----------+-----------+------------+--------------------------------+

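I also cannot assume that a game's id will never change, since Steam has been known to change them when a game ships a major update.

Bonus points if it can recognize recursive aliases, like this: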

GameAlias:
+----------+-----------+
|  old_id  |  new_id   |
+----------+-----------+
|    2     |     1     |
|    3     |     2     |
|   ...    |    ...    |
+----------+-----------+

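Here id 3 is an alias for id 2, which is itself an alias for id 1. If recursive aliases are not possible, I can write my application logic to prevent them.

2 Answers:

Answer 0: (score: 2)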

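Does this work? The table names below assume the GameList and GameAlias tables from the question; adjust them if yours differ.

-- Games that take part in an alias: collapse each alias group onto new_id.
select ga.new_id as id, max(gl.steam_id) as steam_id, max(gl.origin_id) as origin_id,
       max(gl.impulse_id) as impulse_id,
       -- keep only the title of the row whose id is the canonical new_id
       max(if(gl.id = ga.new_id, gl.game_title, NULL)) as game_title
from GameList gl
join GameAlias ga on (gl.id = ga.new_id or gl.id = ga.old_id)
group by ga.new_id

union

-- Games that appear in no alias at all pass through unchanged.
select gl.id, gl.steam_id, gl.origin_id, gl.impulse_id, gl.game_title
from GameList gl
where gl.id not in (
    select new_id from GameAlias
    union
    select old_id from GameAlias)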

Answer 1: (score: 0)

1. First solution (no recursion):

CREATE TABLE GameList
(
     id         INT NOT NULL PRIMARY KEY
    ,steam_id   INT NULL
    ,origin_id  INT NULL
    ,impulse_id NVARCHAR(50) NULL            
    ,game_title NVARCHAR(50) NOT NULL
);
INSERT  GameList(id, steam_id, origin_id, impulse_id, game_title)
SELECT  1,  17450,  NULL,       NULL,       'Dragon Age: Origins'
UNION ALL
SELECT  2,  NULL,   138994900,  NULL,       'Dragon Age(TM): Origins'
UNION ALL
SELECT  3,  NULL,   NULL,       'dragonage','Dragon Age Origins'   
UNION ALL
SELECT  4,  47850,  201841300,  'fifamgr11','FIFA Manager 11';

CREATE TABLE GameAlias
(
    old_id INT NOT NULL PRIMARY KEY
    ,new_id INT NOT NULL
);

INSERT  GameAlias (old_id, new_id) VALUES (2,1);
INSERT  GameAlias (old_id, new_id) VALUES (3,1);

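-- Solution 1
-- Map every aliased row onto its new_id (rows without an alias keep their own id),
-- then collapse each group. MAX() ignores NULLs, so each store column returns
-- the value from whichever row in the group has one.
SELECT  COALESCE(ga.new_id, gl.id) new_id
        ,MAX(gl.steam_id) new_steam_id
        ,MAX(gl.origin_id) new_origin_id
        ,MAX(gl.impulse_id) new_impulse_id
        -- only the canonical row (the one that is nobody's old_id) contributes its title
        ,MAX( CASE WHEN ga.old_id IS NULL THEN gl.game_title ELSE NULL END ) new_game_title
FROM    GameList gl
LEFT OUTER JOIN GameAlias ga ON gl.id = ga.old_id
GROUP BY COALESCE(ga.new_id, gl.id);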
-- End of Solution 1    
DROP TABLE GameList;
DROP TABLE GameAlias;

Result:

1   17450   138994900   dragonage   Dragon Age: Origins
4   47850   201841300   fifamgr11   FIFA Manager 11
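
To see why the grouping in Solution 1 works, it can help to look at the join before it is collapsed. The following is only an illustrative sketch against the sample tables above (run it before the DROP TABLE statements). The column aliases group_id and is_canonical are names made up for this example and do not appear in the original answer.

```sql
-- Ungrouped view of Solution 1: one output row per GameList row.
-- group_id is the id the row will be merged into; is_canonical is 1
-- for the row whose title survives (it is nobody's old_id), else 0.
SELECT  gl.id
        ,COALESCE(ga.new_id, gl.id) AS group_id
        ,(ga.old_id IS NULL)        AS is_canonical
        ,gl.game_title
FROM    GameList gl
LEFT OUTER JOIN GameAlias ga ON gl.id = ga.old_id
ORDER BY group_id, gl.id;
```

With the sample data, ids 1, 2 and 3 all receive group_id 1 and only id 1 is flagged as canonical, while id 4 forms a group of its own. Grouping by group_id and taking MAX() per column then yields the two rows shown in the result.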

2. Second solution (recursion limited to three levels):

CREATE TABLE GameList
(
     id         INT NOT NULL PRIMARY KEY
    ,steam_id   INT NULL
    ,origin_id  INT NULL
    ,impulse_id NVARCHAR(50) NULL            
    ,game_title NVARCHAR(50) NOT NULL
);
INSERT  GameList(id, steam_id, origin_id, impulse_id, game_title)
SELECT  1,  17450,  NULL,       NULL,       'Dragon Age: Origins'
UNION ALL
SELECT  2,  NULL,   138994900,  NULL,       'Dragon Age(TM): Origins'
UNION ALL
SELECT  3,  NULL,   NULL,       'dragonage','Dragon Age Origins'   
UNION ALL
SELECT  4,  47850,  201841300,  'fifamgr11','FIFA Manager 11'
UNION ALL
SELECT  5,  11111,  NULL,       NULL,       'Starcraft 1'
UNION ALL
SELECT  6,  NULL,   1111111111, NULL,       'Starcraft 1.1'   
UNION ALL
SELECT  7,  NULL,   NULL,       NULL,      'Starcraft 1.2'
UNION ALL
SELECT  8,  NULL,   NULL,       'sc1',      'Starcraft 1.3';

CREATE TABLE GameAlias
(
    old_id INT NOT NULL PRIMARY KEY
    ,new_id INT NOT NULL
);

INSERT  GameAlias (old_id, new_id) VALUES (2,1);
INSERT  GameAlias (old_id, new_id) VALUES (3,1);
INSERT  GameAlias (old_id, new_id) VALUES (6,5);
INSERT  GameAlias (old_id, new_id) VALUES (7,6);
INSERT  GameAlias (old_id, new_id) VALUES (8,7);

-- Solution 2
CREATE TEMPORARY TABLE Mappings
(
    old_id INT NOT NULL PRIMARY KEY
    ,new_id INT NOT NULL
);
INSERT  Mappings (old_id, new_id)
-- first level mapping
SELECT  ga.old_id, ga.new_id
FROM    GameAlias ga
WHERE   ga.new_id NOT IN (SELECT t.old_id FROM GameAlias t)
-- second level mapping
UNION ALL
SELECT  ga.old_id, ga2.new_id
FROM    GameAlias ga
INNER JOIN GameAlias ga2 ON ga.new_id = ga2.old_id
WHERE   ga2.new_id NOT IN (SELECT t.old_id FROM GameAlias t)
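-- third level mapping (chains deeper than three levels are not resolved)
UNION ALL
SELECT  ga.old_id, ga3.new_id
FROM    GameAlias ga
INNER JOIN GameAlias ga2 ON ga.new_id = ga2.old_id
INNER JOIN GameAlias ga3 ON ga2.new_id = ga3.old_id
WHERE   ga3.new_id NOT IN (SELECT t.old_id FROM GameAlias t);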

SELECT  COALESCE(ga.new_id, gl.id) new_id
        ,MAX(gl.steam_id) new_steam_id
        ,MAX(gl.origin_id) new_origin_id
        ,MAX(gl.impulse_id) new_impulse_id
        ,MAX( CASE WHEN ga.old_id IS NULL THEN gl.game_title ELSE NULL END ) new_game_title
FROM    GameList gl
LEFT OUTER JOIN Mappings ga ON gl.id = ga.old_id
GROUP BY COALESCE(ga.new_id, gl.id);

DROP TEMPORARY TABLE Mappings;
-- End of Solution 2

DROP TABLE GameList;
DROP TABLE GameAlias;

Result:

1   17450   138994900   dragonage   Dragon Age: Origins
4   47850   201841300   fifamgr11   FIFA Manager 11
5   11111   1111111111  sc1         Starcraft 1

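Sorry, at the time of writing MySQL has no recursive queries / CTEs, which is why each alias level is spelled out by hand.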
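
For readers on a newer server: MySQL 8.0 and later do accept WITH RECURSIVE, which did not exist when this answer was written. Under that assumption, the hand-written Mappings table from Solution 2 could be replaced by a recursive CTE along the following lines. This is a sketch, not part of the original answer: the CTE name Resolved and the column root_id are hypothetical, and it assumes the alias chains contain no cycles. Run it against the Solution 2 sample tables before they are dropped.

```sql
-- Assumes MySQL 8.0+ and acyclic alias chains.
WITH RECURSIVE Resolved (old_id, root_id) AS (
    -- anchor: aliases whose target is not itself an alias of something else
    SELECT  ga.old_id, ga.new_id
    FROM    GameAlias ga
    WHERE   ga.new_id NOT IN (SELECT t.old_id FROM GameAlias t)
    UNION ALL
    -- step: an alias pointing at an already-resolved id inherits that id's root
    SELECT  ga.old_id, r.root_id
    FROM    GameAlias ga
    INNER JOIN Resolved r ON ga.new_id = r.old_id
)
SELECT  COALESCE(r.root_id, gl.id) new_id
        ,MAX(gl.steam_id) new_steam_id
        ,MAX(gl.origin_id) new_origin_id
        ,MAX(gl.impulse_id) new_impulse_id
        ,MAX( CASE WHEN r.old_id IS NULL THEN gl.game_title ELSE NULL END ) new_game_title
FROM    GameList gl
LEFT OUTER JOIN Resolved r ON gl.id = r.old_id
GROUP BY COALESCE(r.root_id, gl.id);
```

The final SELECT is the same aggregation as in Solution 2, only joined to the CTE instead of the temporary table, so chains of any depth resolve to their root without adding a new UNION ALL branch per level.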