Data structure:
Main table (5 million rows):
CREATE TABLE `USER_DETAILS` (
  `visitor_id` varchar(50) DEFAULT NULL,
  partition_id INT,
  related_text longtext,
  `creation_date` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`visitor_id`, partition_id)
) ENGINE = TokuDB
PARTITION BY LIST (partition_id) (
  PARTITION p0 VALUES IN (0) ENGINE = TokuDB,
  PARTITION p1 VALUES IN (1) ENGINE = TokuDB,
  PARTITION p2 VALUES IN (2) ENGINE = TokuDB,
  PARTITION p3 VALUES IN (3) ENGINE = TokuDB,
  PARTITION p4 VALUES IN (4) ENGINE = TokuDB,
  PARTITION p5 VALUES IN (5) ENGINE = TokuDB,
  PARTITION p6 VALUES IN (6) ENGINE = TokuDB,
  PARTITION p7 VALUES IN (7) ENGINE = TokuDB,
  PARTITION p8 VALUES IN (8) ENGINE = TokuDB,
  PARTITION p9 VALUES IN (9) ENGINE = TokuDB
);
Intermediate table (10-20 million rows):
CREATE TABLE `USER_DETAILS_INTERMEDIATE` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `visitor_id` varchar(50) DEFAULT NULL,
  `partition_id` int(11) DEFAULT NULL,
  `related_text` longtext,
  PRIMARY KEY (`id`)
);
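Problem:
Transferring data from the intermediate table to the main table takes too much time.
I have tried the following solutions:
Solution 1: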
REPLACE INTO USER_DETAILS (visitor_id, partition_id, related_text)
SELECT visitor_id, partition_id, related_text
FROM USER_DETAILS_INTERMEDIATE a;
Solution 2 (running the following statement in a loop, 10,000 rows per iteration):
REPLACE INTO USER_DETAILS (visitor_id, partition_id, related_text)
SELECT visitor_id, partition_id, related_text
FROM USER_DETAILS_INTERMEDIATE a
WHERE id BETWEEN var_min_id AND var_max_id;
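A loop of this kind could be sketched as a MySQL stored procedure along the following lines. This is only an illustration under stated assumptions, not the exact code that was run: the procedure name copy_in_batches and the variables var_batch_size and var_last_id are hypothetical, the column list follows the table definitions above, and autocommit is assumed to be enabled so that each batch commits on its own. DELIMITER is a mysql command-line client directive and may need adjusting for other clients.

```sql
DELIMITER //

-- Hypothetical helper: copies rows in id ranges of var_batch_size.
CREATE PROCEDURE copy_in_batches()
BEGIN
  DECLARE var_batch_size BIGINT DEFAULT 10000;  -- rows per iteration (by id range)
  DECLARE var_min_id BIGINT;                    -- lower bound of the current batch
  DECLARE var_max_id BIGINT;                    -- upper bound of the current batch
  DECLARE var_last_id BIGINT;                   -- highest id in the intermediate table

  SELECT MIN(id), MAX(id) INTO var_min_id, var_last_id
  FROM USER_DETAILS_INTERMEDIATE;

  -- If the table is empty both values are NULL and the loop body never runs.
  WHILE var_min_id <= var_last_id DO
    SET var_max_id = var_min_id + var_batch_size - 1;

    REPLACE INTO USER_DETAILS (visitor_id, partition_id, related_text)
    SELECT visitor_id, partition_id, related_text
    FROM USER_DETAILS_INTERMEDIATE
    WHERE id BETWEEN var_min_id AND var_max_id;

    -- Advance to the next id range.
    SET var_min_id = var_max_id + 1;
  END WHILE;
END //

DELIMITER ;

CALL copy_in_batches();
```

Because the batches are defined by id ranges, gaps in the AUTO_INCREMENT sequence mean that some iterations may copy fewer than 10,000 rows.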
Both of the above approaches take a long time.
Is there any other way to improve this?