Speeding up a MySQL INSERT ... SELECT with 10 million records

Time: 2017-07-31 03:01:55

Tags: mysql database indexing insert

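I have a few tables that store daily orders, customers and salespeople. The schema is poorly designed, though: columns hold inappropriate data values and types, indexes and partitions are missing, and so on. I designed a new schema and populated the new tables from the existing ones. I now have to populate the daily orders table (about 10M records).

The table definition and the SQL script that populates the table are attached below.

Table definition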

CREATE TABLE IF NOT EXISTS `testing`.`Orders` (
  `order_ID` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `ord_id` BIGINT UNSIGNED NOT NULL,
  `create_time` DATETIME NOT NULL,
  `create_date` DATE NOT NULL,
  `cust_id` MEDIUMINT UNSIGNED NOT NULL,
  `cust_mob` BIGINT UNSIGNED NULL,
  `sales_id` MEDIUMINT UNSIGNED NULL,
  `sales_mob` BIGINT UNSIGNED NULL,
  `sales_flag` TINYINT UNSIGNED NULL,
  `comm_flag` TINYINT UNSIGNED NULL,
  `extraprice` TINYINT UNSIGNED NULL,
  PRIMARY KEY (`order_ID`),
  INDEX `Date_cust_id` (`create_date` ASC, `cust_id` ASC),
  INDEX `Date_cust_mob` (`create_date` ASC, `cust_mob` ASC),
  INDEX `Date_dri_id` (`create_date` ASC, `sales_id` ASC),
  INDEX `Date_dri_mob` (`create_date` ASC, `sales_mob` ASC),
  INDEX `Date_cust` (`create_date` ASC, `cust_id` ASC, `cust_mob` ASC),
  INDEX `Date_dri` (`create_date` ASC, `sales_id` ASC, `sales_mob` ASC),
  INDEX `cust` (`cust_id` ASC, `cust_mob` ASC),
  INDEX `dri` (`sales_id` ASC, `sales_mob` ASC),
  UNIQUE INDEX `ord_id_UNIQUE` (`ord_id` ASC)
  )
ENGINE = InnoDB
DEFAULT CHARACTER SET = utf8;

This script populates the table. It LEFT JOINs two tables: the Pag table with 6xx K records and the dri table with 3x K records.

SET SQL_SAFE_UPDATES=0;
SET SQL_MODE='';

DROP PROCEDURE IF EXISTS testing.populate_ord1;
DELIMITER $$

CREATE PROCEDURE testing.populate_ord1()
BEGIN
    PREPARE stmt 
       FROM "
            INSERT INTO testing.Orders 
            SELECT 
            1
            ,ord_id
            ,CASE WHEN TRIM(create_time) ='NULL' THEN NULL ELSE STR_TO_DATE(substring(create_time,1,19), '%Y-%m-%d %H:%i:%s') END AS create_time
            ,CASE WHEN TRIM(create_time) ='NULL' THEN NULL ELSE DATE(STR_TO_DATE(substring(create_time,1,19), '%Y-%m-%d %H:%i:%s')) END AS create_date
            ,CASE WHEN TRIM(ord.cust_id) = 'NULL' THEN NULL else pag.cust_id END as cust_id
            ,CASE WHEN TRIM(ord.mob) = 'NULL' THEN NULL else pag.cust_mob END as cust_mob
            ,CASE WHEN TRIM(ord.sales_id) = 'NULL' THEN NULL else dri.sales_id END as sales_id
            ,CASE WHEN TRIM(ord.mob1) = 'NULL' THEN NULL else dri.sales_mob END as sales_mob
            ,CASE WHEN TRIM(sales_flag) ='NULL' THEN NULL ELSE CONVERT(TRIM(sales_flag),UNSIGNED INTEGER) end  AS sales_flag
            ,CASE WHEN TRIM(comm_flag) ='NULL' THEN NULL ELSE  CONVERT(TRIM(comm_flag),UNSIGNED INTEGER)  end AS comm_flag
            ,CASE WHEN TRIM(extraprice) ='NULL' THEN NULL ELSE  CONVERT(TRIM(extraprice),UNSIGNED INTEGER) end AS extraprice

            FROM testing.ord_table ord
                LEFT JOIN
                (SELECT cust_id,customer_id,cust_mob FROM testing.Passenger) pag
                ON TRIM(ord.customer_id) = TRIM(pag.customer_id)
                AND TRIM(ord.mob) = TRIM(pag.cust_mob)
                LEFT JOIN
                (SELECT sales_id,salesperson_id,sales_mob FROM testing.sales) dri
                ON TRIM(ord.salesperson_id) = TRIM(dri.sales_id)
                AND TRIM(ord.mob1) = TRIM(dri.sales_mob)
            WHERE ord_id  != 'NULL' AND create_time IS NOT NULL AND create_time != 'NULL' AND YEAR(create_time) = ? AND MONTH(create_time) = ? AND DAY(create_time) = ?
            GROUP BY ord_id
            ON DUPLICATE KEY UPDATE  ord_id = ord_id
            ;

            ";


    SET @y = 2014, @m = 9, @d = 1;

    WHILE @y<= 2014 DO
        WHILE @m<= 12 DO
            SET @d = 1;
            WHILE @d<= 31 DO
                EXECUTE stmt USING @y, @m, @d;
                SET @d = @d + 1;
            END WHILE;
            SET @m = @m + 1;
        END WHILE;
        SET @y = @y + 1;
        SET @m = 1;
    END WHILE;
    DEALLOCATE PREPARE stmt;


END$$
DELIMITER ;

set autocommit=0;
call testing.populate_ord1();
COMMIT;

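I have not been able to get any records into the table. Sometimes it raises a lock wait timeout error or a data type error, or it simply runs for so long (2 days) that I doubt it is doing any work at all.

I searched online and added the following settings to my.cnf.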

innodb_autoinc_lock_mode = 2
innodb_lock_wait_timeout = 150
innodb_flush_log_at_trx_commit = 2
innodb_buffer_pool_size = 14G

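Could anyone suggest how to accomplish this task efficiently? The code above runs without any syntax errors. If any of the naming is confusing, please let me know; I altered the variable and table names slightly, in case that matters.

1 Answer:

Answer 0: (score: 0)

First, run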

UPDATE ... SET
    comm_flag  = TRIM(comm_flag),
    sales_flag = TRIM(sales_flag),
    ...

This will speed up the subsequent queries somewhat, and simplify them.
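
As a rough sketch of that cleanup pass: the table and column names below come from the question's script, while the assumptions are mine, namely that the staging columns are nullable strings and that missing values are stored as the literal text 'NULL'. The question already sets SQL_SAFE_UPDATES=0, which an UPDATE without a WHERE clause needs when safe-update mode is on.

```sql
-- One-time normalization of the staging table (names from the question's script).
-- ASSUMPTION: the columns are nullable strings that use the text 'NULL' for "missing".
-- TRIM() removes surrounding spaces; NULLIF(x, 'NULL') then turns the
-- placeholder text into a real SQL NULL and leaves every other value unchanged.
UPDATE testing.ord_table
SET sales_flag = NULLIF(TRIM(sales_flag), 'NULL'),
    comm_flag  = NULLIF(TRIM(comm_flag),  'NULL'),
    extraprice = NULLIF(TRIM(extraprice), 'NULL');
```

The same pattern could be applied to the join columns (customer_id, mob, salesperson_id, mob1), so that later queries can compare them without calling TRIM().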

Then avoid LEFT JOIN ( SELECT ... FROM x WHERE ... ). Instead, see whether you can turn it into LEFT JOIN x ON ... WHERE .... That may help.
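
A sketch of what that rewrite might look like for the two joins in the question. The join column names are assumptions based on the question's script, and the comparisons assume the values were trimmed beforehand, so no TRIM() is needed in the ON clauses.

```sql
-- Join the base tables directly instead of wrapping each one in a derived table.
-- ASSUMPTION: customer_id / cust_mob and salesperson_id / sales_mob are the
-- matching columns, and both sides were already trimmed.
SELECT ord.ord_id,
       pag.cust_id,
       pag.cust_mob,
       dri.sales_id,
       dri.sales_mob
FROM testing.ord_table AS ord
LEFT JOIN testing.Passenger AS pag
       ON ord.customer_id = pag.customer_id
      AND ord.mob         = pag.cust_mob
LEFT JOIN testing.sales AS dri
       ON ord.salesperson_id = dri.salesperson_id
      AND ord.mob1           = dri.sales_mob;
```

Depending on the MySQL version, a derived table may be materialized into a temporary table, and wrapping a join column in TRIM() prevents an index on that column from being used. Joining the bare columns of the base tables gives the optimizer a chance to use indexes on Passenger and sales, if such indexes exist.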

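Splitting a DATE and a TIME into two columns is usually a bad idea. Or do you have a good argument for it? Let's see the queries that touch that pair of columns.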

There is no need for STR_TO_DATE() if the string is already correctly formatted for DATE or DATETIME. That is, the string works fine as it is.
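
A small illustration of that point, using a hypothetical temporary table (dt_demo is not part of the question's schema):

```sql
-- dt_demo is a throwaway table used only for this illustration.
CREATE TEMPORARY TABLE dt_demo (create_time DATETIME NOT NULL);

-- A well-formed 'YYYY-MM-DD HH:MM:SS' string is accepted by a DATETIME
-- column directly; no STR_TO_DATE() call is needed.
INSERT INTO dt_demo (create_time) VALUES ('2014-09-01 08:15:30');

-- The date part can be derived on demand instead of stored separately.
SELECT create_time, DATE(create_time) AS create_date
FROM dt_demo;
```

The question's script takes substring(create_time,1,19), which suggests the source strings carry extra trailing characters. If so, LEFT(create_time, 19) would presumably still be needed before the value is stored.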

Once the TRIM is done, CONVERT(TRIM(comm_flag),UNSIGNED INTEGER) can be just comm_flag.

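Don't loop through the data one day at a time: the way you have structured it, every iteration performs a full table scan! About 1000 times!! (This is probably the biggest performance problem.)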
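
One way to replace the per-day loop, sketched under several assumptions that are not in the original post: the staging columns were already trimmed and their 'NULL' placeholders turned into real NULLs (the first step above), ord_table.create_time holds well-formed 'YYYY-MM-DD HH:MM:SS' text in a VARCHAR column, and the join columns are named as shown. The index name idx_create_time is hypothetical.

```sql
-- ASSUMPTION: create_time is an indexable VARCHAR; the index name is made up.
-- A bare range predicate on create_time can use this index, whereas
-- YEAR(create_time) = ? cannot, because the column is wrapped in a function.
ALTER TABLE testing.ord_table ADD INDEX idx_create_time (create_time);

-- One set-based statement for the whole period instead of one EXECUTE per day.
-- order_ID is left out of the column list so that AUTO_INCREMENT assigns it.
INSERT INTO testing.Orders
    (ord_id, create_time, create_date, cust_id, cust_mob,
     sales_id, sales_mob, sales_flag, comm_flag, extraprice)
SELECT ord.ord_id,
       ord.create_time,
       DATE(ord.create_time),
       pag.cust_id,
       pag.cust_mob,
       dri.sales_id,
       dri.sales_mob,
       ord.sales_flag,
       ord.comm_flag,
       ord.extraprice
FROM testing.ord_table AS ord
LEFT JOIN testing.Passenger AS pag
       ON ord.customer_id = pag.customer_id
      AND ord.mob         = pag.cust_mob
LEFT JOIN testing.sales AS dri
       ON ord.salesperson_id = dri.salesperson_id
      AND ord.mob1           = dri.sales_mob
WHERE ord.ord_id IS NOT NULL
  AND ord.create_time >= '2014-09-01'
  AND ord.create_time <  '2015-01-01'
-- Mirrors the question's no-op clause: rows whose ord_id already exists
-- (including duplicates produced by the joins) leave the stored row as it is.
ON DUPLICATE KEY UPDATE ord_id = VALUES(ord_id);
```

If a single statement over the whole period turns out to be too large a transaction, the same half-open range predicate could be run once per month with a COMMIT after each chunk; each chunk would then read only its own slice of the index rather than scanning the entire table.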