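This is my first stored procedure, so I am not sure I am doing this correctly. I have tried to optimize it as much as I can, but the query still times out after running for 10 minutes. I really need this to scale beyond what I am currently using. Any help would be great.
I have a fairly large dataset (108K rows) in which one field contains a comma-separated list (I wish the engineers had not done that). I need to split that field so that each entry is on its own row, with all the other fields copied onto that row as well. I developed a stored procedure that loops through the table row by row, splits the field, and inserts the results into a second table.
Here is the code I am using: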
DROP TABLE IF EXISTS dwh_inventory.nas_share_temp;
CREATE TABLE dwh_inventory.nas_share_temp (
    share_id int(11) NOT NULL,
    fileShareId int(11) NOT NULL,
    storageId int(11) NOT NULL,
    identifier varchar(1024) NOT NULL,
    name varchar(255) NOT NULL,
    protocol enum('CIFS','NFS') NOT NULL,
    ipInterfaces VARCHAR(100) NOT NULL
) ENGINE=INNODB DEFAULT CHARSET=utf8;
DROP PROCEDURE IF EXISTS dwh_inventory.share_step;
DELIMITER $$
CREATE PROCEDURE dwh_inventory.share_step()
BEGIN
    DECLARE n INT DEFAULT 0;
    DECLARE i INT DEFAULT 0;
    DECLARE strLen INT DEFAULT 0;
    DECLARE SubStrLen INT DEFAULT 0;
    DECLARE ip VARCHAR(20);
    SET autocommit = 0;
    SELECT COUNT(*) FROM dwh_inventory.nas_share INTO n;
    SET i=0;
    WHILE i<n DO
        SELECT id, fileShareId, storageId, identifier, name, protocol, ipInterfaces
        INTO @share_id, @fileShareId, @storageId, @identifier, @name, @protocol, @ipInterfaces
        FROM dwh_inventory.nas_share LIMIT i,1;
        IF @ipInterfaces IS NULL THEN
            SET @ipInterfaces = '';
        END IF;
        do_this:
        LOOP
            SET strLen = CHAR_LENGTH(@ipInterfaces);
            SET ip = SUBSTRING_INDEX(@ipInterfaces, ',', 1);
            INSERT INTO dwh_inventory.nas_share_temp
                (share_id, fileShareId, storageId, identifier,name,protocol,ipInterfaces)
            VALUES (@share_id,
                @fileShareId,
                @storageId,
                @identifier,
                @name,
                @protocol,
                ip
            );
            SET SubStrLen = CHAR_LENGTH(SUBSTRING_INDEX(@ipInterfaces, ',', 1)) + 2;
            SET @ipInterfaces = MID(@ipInterfaces, SubStrLen, strLen);
            IF @ipInterfaces = '' THEN
                LEAVE do_this;
            END IF;
        END LOOP do_this;
        COMMIT;
        SET i = i + 1;
    END WHILE;
    SET autocommit = 1;
END;
$$
DELIMITER ;
CALL dwh_inventory.share_step();
Sample data:
id,fileShareId,storageId,identifier,name,protocol,ipInterfaces
1325548,1128971,33309,/vol/vol0/:NFS,/vol/vol0/,NFS,"10.66.213.118,10.68.208.76"
1325549,1128991,33309,/vol/vol0/:NFS,/vol/vol0/,NFS,"10.66.213.119,10.68.208.77"
1325550,1128992,33325,/vol/aggr2_64_hs2032/EPS_ROOT/:NFS,/vol/aggr2_64_hs2032/EPS_ROOT/,NFS,10.17.124.10
1325551,1128993,33325,/vol/aggr2_64_hs2032/GCO_Report/:NFS,/vol/aggr2_64_hs2032/GCO_Report/,NFS,10.17.124.10
1325552,1128995,33325,/vol/aggr2_64_hs2032/PI/:NFS,/vol/aggr2_64_hs2032/PI/,NFS,10.17.124.10
1325553,1128996,33325,/vol/aggr2_64_hs2032/a/:NFS,/vol/aggr2_64_hs2032/a/,NFS,10.17.124.10
1325554,1128997,33325,/vol/aggr1_64_sapserv/:NFS,/vol/aggr1_64_sapserv/,NFS,147.204.2.13
1325555,1128999,33325,/vol/aggr2_64_hs2032/:NFS,/vol/aggr2_64_hs2032/,NFS,10.17.124.10
1325556,1129001,33325,/vol/aggr2_64_hs2032/central/:NFS,/vol/aggr2_64_hs2032/central/,NFS,10.17.124.10
1325557,1129004,33325,/vol/nsvfm0079b_E5V/db_clients/:NFS,/vol/nsvfm0079b_E5V/db_clients/,NFS,"10.21.188.161,10.70.151.93"
1325558,1129006,33325,/vol/aggr2_64_hs2032/istrans/:NFS,/vol/aggr2_64_hs2032/istrans/,NFS,10.17.124.10
1325559,1129008,33325,/vol/nsvfm0017_DEWDFGLD00603/:NFS,/vol/nsvfm0017_DEWDFGLD00603/,NFS,"10.21.188.115,10.70.151.138"
1325560,1129009,33325,/vol/nsvfm0017_vol0/:NFS,/vol/nsvfm0017_vol0/,NFS,"10.21.188.115,10.70.151.138"
1325561,1129011,33325,/vol/nsvfm0017a_ls2278/:NFS,/vol/nsvfm0017a_ls2278/,NFS,"10.21.188.115,10.70.151.138"
1325562,1129015,33325,/vol/nsvfm0051passive_vol0/:NFS,/vol/nsvfm0051passive_vol0/,NFS,10.17.144.249
1325563,1129017,33325,/vol/nsvfm0053_vol0/:NFS,/vol/nsvfm0053_vol0/,NFS,"10.21.189.251,10.70.151.109"
Answer 0 (score: 0)
InnoDB tables should have an explicit PRIMARY KEY; nas_share_temp is created without one.
LIMIT i,1 gets slower and slower as i grows, because it has to skip over i rows before it reaches the row you need.
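To put a rough number on that, here is a small back-of-the-envelope sketch in Python. It assumes the "108K rows" in the question means exactly 108,000 and that each LIMIT i,1 query skips i rows, as described above; rows_skipped is a hypothetical helper name, not part of MySQL.

```python
def rows_skipped(n):
    """Total rows skipped by n queries LIMIT i,1 with i running from 0 to n-1.

    This is the sum 0 + 1 + ... + (n - 1), which equals n * (n - 1) / 2.
    """
    return n * (n - 1) // 2


n = 108_000  # assumption: "108K rows" taken as exactly 108,000
print(f"{rows_skipped(n):,} rows skipped in total")
```

Under those assumptions the loop skips several billion rows overall, so the work grows quadratically with the table size rather than linearly.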
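Don't try to split comma-separated text in SQL; use a real language (PHP / Perl / etc). Or, as Lew suggested, write that column out to a file and then use LOAD DATA to bring it into another table.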
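As an illustration of doing the split outside SQL, here is a minimal Python sketch (Python stands in for the PHP / Perl mentioned above). It assumes the table has been exported as CSV with the header shown in the question's sample data; explode_rows and SAMPLE are hypothetical names, and the two data rows are copied from the question.

```python
import csv
import io

# Header and two rows copied from the sample data in the question.
SAMPLE = '''id,fileShareId,storageId,identifier,name,protocol,ipInterfaces
1325548,1128971,33309,/vol/vol0/:NFS,/vol/vol0/,NFS,"10.66.213.118,10.68.208.76"
1325550,1128992,33325,/vol/aggr2_64_hs2032/EPS_ROOT/:NFS,/vol/aggr2_64_hs2032/EPS_ROOT/,NFS,10.17.124.10
'''


def explode_rows(src, dst, list_column="ipInterfaces"):
    """Copy CSV rows from src to dst, one output row per list entry.

    Returns the number of data rows written.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    count = 0
    for row in reader:
        # An empty or missing list still produces one row with an empty value.
        entries = (row[list_column] or "").split(",")
        for entry in entries:
            # All other columns are repeated unchanged on every output row.
            writer.writerow({**row, list_column: entry.strip()})
            count += 1
    return count


out = io.StringIO()
written = explode_rows(io.StringIO(SAMPLE), out)
print(written, "rows written")
print(out.getvalue())
```

In practice src and dst would be real files, and the output file could then be loaded into nas_share_temp with LOAD DATA; the exact LOAD DATA options depend on the server setup and are not shown here.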
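LIMIT should be preceded by ORDER BY. Without it the row order is not guaranteed, so successive LIMIT i,1 queries may return the same row twice or skip rows.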