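I am trying to speed up this script, which imports a CSV file into MySQL. Loading 1,000 rows takes 130 seconds. When I tried it with 30,000 rows, it timed out after 20 minutes, having loaded only 8,681 rows.
The CSV header looks like this (there can be any number of columns, in any order):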
email;param1;..;paramX
test@test.com;something;..;value
MySQL CREATE TABLE for mail_queue:
CREATE TABLE IF NOT EXISTS `mail_queue` (
`mail_queue_id` INT NOT NULL AUTO_INCREMENT,
`mailer_batch_id` INT NOT NULL,
`to` VARCHAR(100) NOT NULL,
`priority` INT NOT NULL DEFAULT 0,
`created` DATETIME NOT NULL DEFAULT NOW(),
`mail_status_id` INT NOT NULL,
PRIMARY KEY (`mail_queue_id`),
INDEX `fk_mail_queue_mailer_batch1_idx` (`mailer_batch_id` ASC),
INDEX `fk_mail_queue_mail_status1_idx` (`mail_status_id` ASC),
CONSTRAINT `fk_mail_queue_mailer_batch1`
FOREIGN KEY (`mailer_batch_id`)
REFERENCES `mailer_batch` (`mailer_batch_id`)
ON DELETE CASCADE
ON UPDATE NO ACTION,
CONSTRAINT `fk_mail_queue_mail_status1`
FOREIGN KEY (`mail_status_id`)
REFERENCES `mail_status` (`mail_status_id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
MySQL CREATE TABLE for mail_param:
CREATE TABLE IF NOT EXISTS `mail_param` (
`mail_param_id` INT NOT NULL AUTO_INCREMENT,
`mail_queue_id` INT NOT NULL,
`param_key` VARCHAR(45) NOT NULL,
`param_value` VARCHAR(45) NOT NULL,
PRIMARY KEY (`mail_param_id`),
INDEX `fk_mail_param_mail_queue1_idx` (`mail_queue_id` ASC),
CONSTRAINT `fk_mail_param_mail_queue1`
FOREIGN KEY (`mail_queue_id`)
REFERENCES `mail_queue` (`mail_queue_id`)
ON DELETE CASCADE
ON UPDATE NO ACTION)
ENGINE = InnoDB;
The code (Zend Framework). It works correctly, but it is slow:
if (($handle = fopen($this->filepath, 'r')) !== false)
{
// DB
$mailQueueTable = new Application_Model_DbTable_MailQueue();
$mailParamTable = new Application_Model_DbTable_MailParam();
// Get header
$header = \ForceUTF8\Encoding::toUTF8(fgetcsv($handle, 0, ';'));
while(($data = fgetcsv($handle, 0, ';')) !== false)
{
// Save e-mail to e-mail queue
$mailQueueRow = $mailQueueTable->createRow();
$mailQueueRow->mailer_batch_id = $mailerBatchId;
$mailQueueRow->to = $data[$this->emailColumn];
$mailQueueRow->priority = 0;
$mailQueueRow->created = $created->toString('yyyy-MM-dd HH:mm:ss');
$mailQueueRow->mail_status_id = 1;
$mailQueueId = $mailQueueRow->save();
// Save e-mail params
foreach ($data as $key => $value) {
$mailParamRow = $mailParamTable->createRow();
$mailParamRow->mail_queue_id = $mailQueueId;
$mailParamRow->param_key = $header[$key];
$mailParamRow->param_value = \ForceUTF8\Encoding::toUTF8($value);
$mailParamRow->save();
}
unset($data);
}
fclose($handle);
}
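I tried to use LOAD DATA INFILE, but I cannot load the file directly because of the structure of the mail_param table (one row per key/value pair).
1) Create a temporary table (OK)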
$columns = "";
foreach ($this->header as $item) {
if ($columns == "") {
$columns = "`" . $item . "` VARCHAR(45)";
} else {
$columns .= ", `" . $item . "` VARCHAR(45)";
}
}
$query = 'CREATE TEMPORARY TABLE `tmp_csv_import` (
`id` INT AUTO_INCREMENT PRIMARY KEY,
' . $columns . '
) ENGINE MyISAM;';
2) LOAD DATA INFILE (OK)
$query = "LOAD DATA INFILE '" . $this->filepath . "'
INTO TABLE `tmp_csv_import`
FIELDS TERMINATED BY ';'
ENCLOSED BY '\"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;";
3) Insert into mail_queue (OK)
// $mailerBatchId from last_insert_id()
$query = "INSERT INTO `mail_queue` (`mailer_batch_id`, `to`, `priority`, `created`, `mail_status_id`)
SELECT " . $mailerBatchId . ", `email`, 0, NOW(), 1 FROM `tmp_csv_import`";
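4) Insert into mail_param (???)
I don't know what to write here. I need to insert one new row for every column of the tmp_csv_import table, and I need the mail_queue_id, which is the foreign key of the mail_param table.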
$query = "INSERT INTO mail_param (mail_queue_id, param_key, param_value)
SELECT ??? FROM `tmp_csv_import`";
Is it possible to do this in MySQL? Or should I take a different approach?
Answer 0 (score: 1)
I found a solution to my problem. There is no need for a temporary table.
1) The data is loaded into "mail_queue" with the following code:
$query = "LOAD DATA INFILE '" . $this->filepath . "'
INTO TABLE `mail_queue`
FIELDS TERMINATED BY ';'
ENCLOSED BY '\"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
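(@dummy, @dummy, `to`, @dummy)
SET `mailer_batch_id` = " . (int) $mailerBatchId . ",
`priority` = 0,
`created` = NOW(),
`mail_status_id` = 1;";
Only one line of this statement is generated from the CSV header, the column list:
(@dummy, @dummy, `to`, @dummy)
It reads the e-mail column into `to` and discards every other column. The remaining values are set by SET.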
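Generating that column list from the header could look roughly like the sketch below. The helper name buildLoadColumnList and the default header name 'email' are assumptions made for illustration; they do not come from the original code.

```php
<?php
// Hypothetical helper (not part of the original code): turns the CSV header
// into the column list for LOAD DATA. The e-mail column is mapped to `to`;
// every other column is read into the throwaway user variable @dummy.
function buildLoadColumnList(array $header, string $emailColumn = 'email'): string
{
    $columns = array_map(
        function ($name) use ($emailColumn) {
            return $name === $emailColumn ? '`to`' : '@dummy';
        },
        $header
    );

    return '(' . implode(', ', $columns) . ')';
}

// Example header with the e-mail address in the third column.
echo buildLoadColumnList(['param1', 'param2', 'email', 'param3']), "\n";
// prints: (@dummy, @dummy, `to`, @dummy)
```

If the header contains no e-mail column, the list consists only of @dummy entries, so a real import would probably want to check for that case before running the query.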
2) I select the "mail_queue_id" values of the inserted rows and store them in an array of this form (e-mail address => mail_queue_id):
array('to' => 'mail_queue_id');
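A minimal sketch of building that lookup array, assuming the rows were fetched as associative arrays with a query along the lines of the one in the comment. The helper name buildQueueIdMap and the exact query are assumptions; the answer does not show how the select is done.

```php
<?php
// Assumption: $rows comes from something like
//   SELECT `to`, `mail_queue_id` FROM `mail_queue` WHERE `mailer_batch_id` = ?
// fetched as a list of associative arrays.
// Hypothetical helper: maps each e-mail address to its mail_queue_id.
function buildQueueIdMap(array $rows): array
{
    $map = [];
    foreach ($rows as $row) {
        // If an address occurs more than once in the batch, the last row wins.
        $map[$row['to']] = (int) $row['mail_queue_id'];
    }

    return $map;
}

$map = buildQueueIdMap([
    ['to' => 'test@test.com', 'mail_queue_id' => '17'],
    ['to' => 'other@test.com', 'mail_queue_id' => '18'],
]);
// $map is ['test@test.com' => 17, 'other@test.com' => 18]
```

Note that keying by address only works cleanly when each address appears once per batch; duplicates in the CSV would all be attached to the same mail_queue_id.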
3) I create a temporary CSV file from the source file. Structure:
mail_queue_id;key;value
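One way this conversion step might be written is sketched below. The function name writeParamCsv, its parameters, and the decision to skip rows whose address is missing from the map are all assumptions, not the author's actual code. Like the original loop, it writes one line per column, including the e-mail column itself.

```php
<?php
// Hypothetical helper: rewrites the wide source CSV into the narrow
// "mail_queue_id;key;value" layout expected by the mail_param import.
// $queueIdMap is the e-mail address => mail_queue_id array from step 2.
function writeParamCsv(string $sourcePath, string $targetPath, array $queueIdMap, string $emailColumn = 'email'): int
{
    $in = fopen($sourcePath, 'r');
    $out = fopen($targetPath, 'w');
    if ($in === false || $out === false) {
        throw new RuntimeException('Cannot open source or target CSV file');
    }

    $header = fgetcsv($in, 0, ';', '"', '\\');
    $emailIndex = is_array($header) ? array_search($emailColumn, $header, true) : false;
    if ($emailIndex === false) {
        throw new RuntimeException('E-mail column not found in CSV header');
    }

    $written = 0;
    while (($data = fgetcsv($in, 0, ';', '"', '\\')) !== false) {
        if ($data === [null]) {
            continue; // blank line in the source file
        }
        $email = $data[$emailIndex] ?? null;
        if ($email === null || !isset($queueIdMap[$email])) {
            continue; // assumption: rows without a known mail_queue_id are skipped
        }
        foreach ($data as $i => $value) {
            if (!isset($header[$i])) {
                continue; // more fields than header columns
            }
            // fputcsv ends each line with "\n" and quotes fields with '"' when needed.
            fputcsv($out, [$queueIdMap[$email], $header[$i], $value], ';', '"', '\\');
            $written++;
        }
    }

    fclose($in);
    fclose($out);

    return $written; // number of key/value lines written
}
```

The delimiter, enclosure and line ending used here are chosen to match the FIELDS TERMINATED BY ';', ENCLOSED BY '"' and LINES TERMINATED BY '\n' clauses of the LOAD DATA statement.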
4) The data is loaded into the "mail_param" table:
$query = "LOAD DATA INFILE '" . $tmpFilepath . "'
INTO TABLE `mail_param`
FIELDS TERMINATED BY ';'
ENCLOSED BY '\"'
LINES TERMINATED BY '\n'
(`mail_queue_id`, `param_key`, `param_value`);";
5) Delete the temporary file:
unlink($tmpFilepath);
6) Done. I tried loading a CSV with 30,000 rows, and it was significantly faster (< 1 s).