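I am trying to move a blob out of a legacy MySQL table and into a new table of its own, in order to reach first normal form. However, converting the data already in the database from a blob into multiple rows in the new table is turning out to be nontrivial.
What is the simplest way to do this conversion using SQL commands?
Parent table: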
CREATE TABLE TEST.People (
`id` INT AUTO_INCREMENT,
`age` INT,
`height` INT,
`weight` INT ,
`variations` BLOB DEFAULT NULL,
PRIMARY KEY (`id`)
);
New table:
CREATE TABLE TEST.Variations (
`id` INT AUTO_INCREMENT,
`chr` INT,
`start` INT,
`stop` INT ,
`type` ENUM('SNP','INDEL','CNV') DEFAULT NULL,
PRIMARY KEY (`id`)
);
When I run SELECT id, variations FROM TEST.People; I get:
+----+----------------------------------------------------------------------------------------------------------------------+
| id | variations |
+----+----------------------------------------------------------------------------------------------------------------------+
| 3 | xp t !3:124093754-124467278/CNVt 7:78030601-79638023/CNV |
| 6 | xp |
| 9 | xp |
| 12 | xp t !1:84289718-85466763/CNV |
| 15 | xp |
| 18 | xp |
| 21 | xp |
| 24 | xp |
| 27 | xp |
| 30 | xp t !10:166909544-166909544/SNPt !2:66903445-66903445/SNPt !2:166897864-166897864/CNVt !7:6892788-6892788/SNP |
+----+----------------------------------------------------------------------------------------------------------------------+
So I would like the TEST.Variations table to look like this after the conversion:
+----+-----+-----------+-----------+----------+
| id | chr | start | stop | type |
+----+-----+-----------+-----------+----------+
| 3 | 3 | 124093754 | 124467278 | CNV |
| 3 | 7 | 78030601 | 79638023 | CNV |
| 12 | 1 | 84289718 | 85466763 | CNV |
| 30 | 10 | 166909544 | 166909544 | SNP |
| 30 | 2 | 66903445 | 66903445 | SNP |
| 30 | 2 | 166897864 | 166897864 | CNV |
| 30 | 7 | 6892788 | 6892788 | SNP |
+----+-----+-----------+-----------+----------+
Answer 0 (score: 1)
First, two things:
Your data is inconsistent for id 3: there is no ! before 7:... I hope that is just a typo.
xp t !3:124093754-124467278/CNVt 7:78030601-79638023/CNV
                                ^^
If you want an auto_increment column in the target table, your schema should look something like this:
CREATE TABLE variations
(
`var_id` INT NOT NULL AUTO_INCREMENT,
`id` INT, -- id from People goes here and it's not UNIQUE
`chr` INT,
`start` INT,
`stop` INT ,
`type` ENUM('SNP','INDEL','CNV') DEFAULT NULL,
PRIMARY KEY (`var_id`)
);
Now you can use a query to transfer the data from the People table to the Variations table:
INSERT INTO variations (id, chr, start, stop, type)
SELECT id,
SUBSTRING_INDEX(variation, ':', 1) chr,
SUBSTRING_INDEX(SUBSTRING_INDEX(variation, '-', 1), ':', -1) start,
SUBSTRING_INDEX(SUBSTRING_INDEX(variation, '-', -1), '/', 1) stop,
SUBSTRING_INDEX(variation, '/', -1) type
FROM
(
SELECT p.id, SUBSTRING_INDEX(SUBSTRING_INDEX(p.variations, 't !', n.n), 't !', -1) variation
FROM
(
SELECT id, SUBSTR(variations, 9) variations
FROM people
WHERE variations LIKE 'xp t !%'
) p CROSS JOIN
(
SELECT a.N + b.N * 10 + 1 n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
ORDER BY n
) n
WHERE n.n <= 1 + (LENGTH(p.variations) - LENGTH(REPLACE(p.variations, 't !', ''))) / 3
ORDER BY id
) q
ORDER BY id, chr, start, stop, type;
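To make the outer SELECT easier to follow, here is a minimal sketch that applies the same four SUBSTRING_INDEX expressions to a single entry that has already been split out. The literal is copied from the sample data for id 3; the alias v and the derived-table alias t are hypothetical names used only for this illustration.

```sql
-- One already-split entry in the form chr:start-stop/type.
-- v and t are illustrative aliases, not part of the original query.
SELECT
  SUBSTRING_INDEX(v, ':', 1)                           AS `chr`,   -- text before the first ':'
  SUBSTRING_INDEX(SUBSTRING_INDEX(v, '-', 1), ':', -1) AS `start`, -- between ':' and '-'
  SUBSTRING_INDEX(SUBSTRING_INDEX(v, '-', -1), '/', 1) AS `stop`,  -- between '-' and '/'
  SUBSTRING_INDEX(v, '/', -1)                          AS `type`   -- text after the last '/'
FROM (SELECT '3:124093754-124467278/CNV' AS v) AS t;
-- Returns one row: chr = '3', start = '124093754', stop = '124467278', type = 'CNV'
```

The values come back as strings; MySQL converts them to INT and ENUM when they are inserted into the target columns.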
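Note: the INSERT ... SELECT query splits out at most 100 variations per id. If you need more or fewer, you can adjust the limit by editing the inner subquery aliased n, which generates a numbers (tally) table on the fly.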
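As a rough sketch of that adjustment, the derived table aliased n could be given a third digit set so that it yields 1 through 1000 instead of 1 through 100. The alias c is an addition made for this illustration; the rest follows the pattern already used in the query.

```sql
-- Possible replacement for the derived table aliased n:
-- three digit sets (a, b, c) produce the integers 1 through 1000.
SELECT a.N + b.N * 10 + c.N * 100 + 1 AS n
FROM
  (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
   UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
  CROSS JOIN
  (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
   UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
  CROSS JOIN
  (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
   UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
ORDER BY n;
```

Going the other way, keeping only the a digit set (and using a.N + 1) would cap the split at 10 variations per id.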
Result:
| VAR_ID | ID | CHR |     START |      STOP | TYPE |
|--------|----|-----|-----------|-----------|------|
|      1 |  3 |   3 | 124093754 | 124467278 |  CNV |
|      2 |  3 |   7 |  78030601 |  79638023 |  CNV |
|      3 | 12 |   1 |  84289718 |  85466763 |  CNV |
|      4 | 30 |  10 | 166909544 | 166909544 |  SNP |
|      5 | 30 |   2 | 166897864 | 166897864 |  CNV |
|      6 | 30 |   2 |  66903445 |  66903445 |  SNP |
|      7 | 30 |   7 |   6892788 |   6892788 |  SNP |
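After running the migration, a quick sanity check can help confirm that every entry was split and parsed. The two queries below are only a sketch against the variations table defined in this answer; the alias variation_count is a hypothetical name, and the second query assumes that start should never exceed stop.

```sql
-- How many variations were extracted for each person?
SELECT id, COUNT(*) AS variation_count
FROM variations
GROUP BY id
ORDER BY id;

-- Rows that look incompletely parsed: a missing field,
-- or a range whose start lies after its stop (assumed invalid).
SELECT *
FROM variations
WHERE `chr` IS NULL
   OR `start` IS NULL
   OR `stop` IS NULL
   OR `type` IS NULL
   OR `start` > `stop`;
```

For the result shown here, the first query should report 2 rows for id 3, 1 row for id 12, and 4 rows for id 30, and the second query should return nothing.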
Here is a SQLFiddle demo.