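I have a table that I would like to partition using MySQL 5.7 partitioning, to alleviate an issue I'm having with quickly deleting old data. (It would also be nice to improve insert I/O performance by partitioning on something other than date, especially if I plan to shard across multiple volumes using subpartitions.)
Here is a simplified version of the table: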
CREATE TABLE `tbl` (
`date` date NOT NULL,
`sub_id` int(11) unsigned NOT NULL,
`cmd_id` int(11) NOT NULL,
`code` TINYINT DEFAULT NULL,
`rqst` VARCHAR(32) NOT NULL DEFAULT '',
UNIQUE KEY `uk1` (sub_id,cmd_id,date)
) ENGINE=InnoDB
(note that use of column 'date' in uk1 is only to allow partitioning on date)
(The true unique key is (sub_id,cmd_id))
Here are the SQL statements I issue against the table:
1. INSERT INTO tbl VALUES (NOW(), ...)
2. UPDATE tbl SET code=$code WHERE sub_id=$sub_id AND cmd_id=$cmd_id
3. SELECT code,rqst FROM tbl WHERE sub_id=$sub_id AND cmd_id=$cmd_id
Here is the partitioning scheme I have devised so far:
PARTITION BY RANGE (TO_DAYS(date))
SUBPARTITION BY HASH(sub_id)
SUBPARTITIONS 4
(PARTITION d001 VALUES LESS THAN (736250) ENGINE = InnoDB,
PARTITION d002 VALUES LESS THAN (736260) ENGINE = InnoDB,
PARTITION d003 VALUES LESS THAN (736270) ENGINE = InnoDB,
PARTITION d004 VALUES LESS THAN (736280) ENGINE = InnoDB,
PARTITION d005 VALUES LESS THAN (736290) ENGINE = InnoDB,
PARTITION d006 VALUES LESS THAN (736300) ENGINE = InnoDB,
PARTITION d007 VALUES LESS THAN (736310) ENGINE = InnoDB,
PARTITION d008 VALUES LESS THAN (736320) ENGINE = InnoDB,
PARTITION d009 VALUES LESS THAN (736330) ENGINE = InnoDB,
PARTITION d010 VALUES LESS THAN (736340) ENGINE = InnoDB,
PARTITION d011 VALUES LESS THAN MAXVALUE ENGINE = InnoDB)
But I believe this will hurt performance, because every lookup by (sub_id,cmd_id) requires a read from each partition:
EXPLAIN PARTITIONS SELECT * FROM tbl WHERE sub_id='107' AND cmd_id='2246806';
+----+-------------+-------+------------------------------------------------------------------------------------------------------------------------------------------------+------+---------------+------+---------+-------------+------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------------------------------------------------------------------------------------------------------------------------------------------------+------+---------------+------+---------+-------------+------+-------------+
| 1 | SIMPLE | optz | d001_d001sp1,d002_d002sp1,d003_d003sp1,d004_d004sp1,d005_d005sp1,d006_d006sp1,d007_d007sp1,d008_d008sp1,d009_d009sp1,d010_d010sp1,d011_d011sp1 | ref | uk1 | uk1 | 38 | const,const | 11 | Using where |
+----+-------------+-------+------------------------------------------------------------------------------------------------------------------------------------------------+------+---------------+------+---------+-------------+------+-------------+
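So the crux of the question is:
Here are some notes:
I would prefer not to have the date column in the unique key, but then I can't partition on it, so the code ensures (sub_id,cmd_id) is unique across dates.
Thanks!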
Answer 0: (score: 1)
BY HASH is essentially useless, and so are SUBPARTITIONs.
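"alleviate an issue I'm having with quickly deleting old data"
That is, you need to DROP PARTITION for old values of date? Use PARTITION BY RANGE (TO_DAYS(date)) and don't bother with subpartitions.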
For clarity, change UNIQUE KEY uk1 (sub_id,cmd_id,date) to PRIMARY KEY (sub_id,cmd_id,date).
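As a rough sketch of what the two suggestions above might look like together, here is the question's simplified table with the unique key promoted to a primary key and RANGE partitioning only. The boundary values are copied from the question's scheme, only the first two ranges are shown, and the catch-all partition name "future" is an assumption for illustration.

```sql
-- Sketch only: same columns as the question's simplified table.
CREATE TABLE tbl (
  `date`  DATE NOT NULL,
  sub_id  INT UNSIGNED NOT NULL,
  cmd_id  INT NOT NULL,
  code    TINYINT DEFAULT NULL,
  rqst    VARCHAR(32) NOT NULL DEFAULT '',
  -- `date` must be part of every unique key so the table can be partitioned on it
  PRIMARY KEY (sub_id, cmd_id, `date`)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(`date`)) (
  PARTITION d001   VALUES LESS THAN (736250),
  PARTITION d002   VALUES LESS THAN (736260),
  -- "future" is a hypothetical name for the catch-all partition
  PARTITION future VALUES LESS THAN MAXVALUE
);
```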
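[belated edit] Your three queries will work reasonably well. Because date is not in the WHERE clause, the SELECT and UPDATE must touch all partitions. The INSERT will touch only the latest partition (because of NOW()).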
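To illustrate the point about date and the WHERE clause: if the application happens to know a lower bound on the row's date (an assumption, the question does not say that it does), adding it to the query gives the optimizer a chance to prune partitions. The literal date below is made up for illustration.

```sql
-- Sketch only: the date bound is a hypothetical value supplied by the application.
-- In MySQL 5.7, plain EXPLAIN already includes the partitions column.
EXPLAIN
SELECT code, rqst
FROM tbl
WHERE sub_id = 107
  AND cmd_id = 2246806
  AND `date` >= '2015-11-01';
```

Comparing the partitions column of this plan with the one in the question should show whether pruning actually took effect on a given server.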
More discussion, including tips on periodic purging: http://mysql.rjweb.org/doc.php/partitionmaint
"only need to keep the past ~month of data"
I recommend about 32 partitions, including one pending DROP and one named future; see the link.
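A periodic purge along these lines could be sketched as follows. This assumes the table is partitioned by RANGE only (no subpartitions) and uses the partition names from the question, where d011 is the MAXVALUE partition; the new boundary 736350 and the name d012 are illustrative assumptions.

```sql
-- Sketch only: drop the oldest range; this discards its rows without a row-by-row DELETE.
ALTER TABLE tbl DROP PARTITION d001;

-- Split the catch-all partition so there is always an empty one for upcoming dates.
-- 736350 and d012 are hypothetical values continuing the question's numbering.
ALTER TABLE tbl REORGANIZE PARTITION d011 INTO (
  PARTITION d011 VALUES LESS THAN (736350),
  PARTITION d012 VALUES LESS THAN MAXVALUE
);
```

Splitting the catch-all partition while it is still empty keeps the REORGANIZE cheap, since no rows need to be copied.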
"replication system in place"
Running ALTER TABLE to add partitioning will stall the system, but I assume you are aware of that issue.
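Yes, a necessary evil. "I prefer not to have the date column in the unique key, but then I can't partition on it, so the code ensures (sub_id,cmd_id) is unique across dates."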
"5-20 million rows per day"
Up to a few hundred per second? If you have ingestion-rate problems, see http://mysql.rjweb.org/doc.php/staging_table