How to normalize a column with comma-separated values

Time: 2014-10-30 14:08:58

Tags: mysql sql

This is my current table. Only the id and genre fields are shown; the other fields are omitted.

    +---------+----------------------+
    | id      | genre                |
    +---------+----------------------+
    | 1849012 | Animation, Short     |
    | 2016229 | Comedy, Crime, Drama |
    |  224412 | Drama, Family        |
    +---------+----------------------+

I have created the necessary tables. How do I populate them?

I created a table named genre with the fields 'genreid' and 'name':

    +---------+-------------+------+-----+---------+----------------+
    | Field   | Type        | Null | Key | Default | Extra          |
    +---------+-------------+------+-----+---------+----------------+
    | genreid | int(11)     | NO   | PRI | NULL    | auto_increment |
    | name    | varchar(50) | YES  |     | NULL    |                |
    +---------+-------------+------+-----+---------+----------------+

I also created another table named movie2genre:

    +---------+---------+------+-----+---------+-------+
    | Field   | Type    | Null | Key | Default | Extra |
    +---------+---------+------+-----+---------+-------+
    | movieid | int(11) | YES  |     | NULL    |       |
    | genreid | int(11) | YES  |     | NULL    |       |
    +---------+---------+------+-----+---------+-------+

1 answer:

Answer 0 (score: 3)

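You can split the string in SQL. Generate a range of numbers that covers the maximum number of values in any one string, cross join it against your current table, and use SUBSTRING_INDEX(SUBSTRING_INDEX(genre, ',', some_generated_number), ',', -1) to pick out the value at each position. This gives you all the genres for each id (although the last one is repeated for every number past the end of the list; use DISTINCT to remove the duplicates). The result can be used to populate your genre table.

In SQL this would be: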

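    -- Fill genre with every distinct genre name found in current_table.genre.
    -- genreid is left out so that auto_increment assigns it.
    INSERT INTO genre (name)
    SELECT DISTINCT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(current_table.genre, ',', sub0.aCnt), ',', -1))
    FROM current_table
    CROSS JOIN
    (
        -- Numbers 1..100. Starting at 1 matters: a count of 0 makes
        -- SUBSTRING_INDEX return an empty string, which would become a blank genre.
        SELECT 1 + units.a + tens.a * 10 AS aCnt
        FROM
        (
            SELECT 0 AS a UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
        ) units
        CROSS JOIN
        (
            SELECT 0 AS a UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
        ) tens
    ) sub0;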
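
To see what the nested SUBSTRING_INDEX call does at each position, here is a small sketch that runs it against one literal value taken from the question's table. The derived table `nums` and its column `n` are names made up for this illustration; they stand in for the generated number range.

```sql
-- Inner SUBSTRING_INDEX keeps everything up to the n-th comma;
-- the outer one (count -1) keeps what follows the last comma of that piece.
SELECT nums.n,
       TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX('Comedy, Crime, Drama', ',', nums.n), ',', -1)) AS piece
FROM
(
    SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
) nums
ORDER BY nums.n;
-- n = 1 -> Comedy
-- n = 2 -> Crime
-- n = 3 -> Drama
-- n = 4 -> Drama   (past the end of the list, so the last value repeats)
```

The repeated 'Drama' at n = 4 is the duplicate that DISTINCT removes in the INSERT.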

Once the genre table is populated, you can join it to your existing table (using FIND_IN_SET) with a simple query to populate movie2genre, the table that links movies to genres:

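    -- Link each movie to every genre whose name appears in its comma-separated list.
    INSERT INTO movie2genre (movieid, genreid)
    SELECT current_table.id, genre.genreid
    FROM current_table
    INNER JOIN genre
    -- FIND_IN_SET does not allow spaces after the commas, hence the REPLACE.
    ON FIND_IN_SET(genre.name, REPLACE(current_table.genre, ', ', ',')) > 0;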
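
As a rough sanity check after both inserts, you could rebuild the comma-separated list from the normalized tables and compare it with the original column by eye. This is only a sketch; the aliases `m` and `g` and the output column name `genres` are chosen here for illustration.

```sql
-- Rebuild one comma-separated genre list per movie from the link table.
SELECT m.movieid,
       GROUP_CONCAT(g.name ORDER BY g.name SEPARATOR ', ') AS genres
FROM movie2genre m
INNER JOIN genre g
    ON g.genreid = m.genreid
GROUP BY m.movieid;
```

Because the names are sorted alphabetically here, the rebuilt string matches the original text only for rows whose genres were already listed in alphabetical order, as in the three sample rows of the question.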