The following query has been running for 5 hours so far:
INSERT $LINEITEM_PUBLIC SELECT *
FROM LINEITEM
WHERE L_PARTKEY IN ( SELECT P_PARTKEY FROM $PART_PUBLIC )
AND L_SUPPKEY IN ( SELECT S_SUPPKEY FROM $SUPPLIER_PUBLIC )
AND L_ORDERKEY IN ( SELECT O_ORDERKEY FROM $ORDERS_PUBLIC );
I have added all the necessary indexes, but nothing seems to help. The EXPLAIN plan for the query prints the following:
+----+-------------+------------------+------------+--------+--------------------------------+-------------+---------+--------------------------------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------------+------------+--------+--------------------------------+-------------+---------+--------------------------------+----------+----------+-------------+
| 1 | INSERT | $LINEITEM_PUBLIC | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| 1 | SIMPLE | $ORDERS_PUBLIC | NULL | index | PRIMARY | O_ORDERDATE | 3 | NULL | 12826617 | 100.00 | Using index |
| 1 | SIMPLE | LINEITEM | NULL | ref | PRIMARY,LINEITEM_FK2,L_SUPPKEY | PRIMARY | 4 | TPCH.$ORDERS_PUBLIC.O_ORDERKEY | 3 | 100.00 | NULL |
| 1 | SIMPLE | $SUPPLIER_PUBLIC | NULL | eq_ref | PRIMARY | PRIMARY | 4 | TPCH.LINEITEM.L_SUPPKEY | 1 | 100.00 | Using index |
| 1 | SIMPLE | $PART_PUBLIC | NULL | eq_ref | PRIMARY | PRIMARY | 4 | TPCH.LINEITEM.L_PARTKEY | 1 | 100.00 | Using index |
+----+-------------+------------------+------------+--------+--------------------------------+-------------+---------+--------------------------------+----------+----------+-------------+
Any suggestions on how to optimize this query?
Update: the sizes of the tables in the query above are as follows:
Answer 0 (score: 0)
Make sure there is an index that starts with O_ORDERKEY.
IN (SELECT ...) may perform poorly (depending on the version); try this:
INSERT $LINEITEM_PUBLIC
SELECT l.*
FROM LINEITEM AS l
WHERE EXISTS( SELECT * FROM $PART_PUBLIC WHERE P_PARTKEY = L_PARTKEY )
AND EXISTS( SELECT * FROM $SUPPLIER_PUBLIC WHERE S_SUPPKEY = L_SUPPKEY )
AND EXISTS( SELECT * FROM $ORDERS_PUBLIC WHERE O_ORDERKEY = L_ORDERKEY );
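As a rough sketch of how the two suggestions above could be checked (assuming MySQL, which the EXPLAIN output suggests; the index name idx_o_orderkey is hypothetical and does not come from the question), one might list the existing indexes and then compare the plan of the rewritten statement with the one posted in the question. The table names are quoted with backticks here because of the leading $.

```sql
-- List the indexes on $ORDERS_PUBLIC; look for one whose first column
-- (Seq_in_index = 1) is O_ORDERKEY.
SHOW INDEX FROM `$ORDERS_PUBLIC`;

-- Only if no such index exists: add one.
-- The name idx_o_orderkey is a made-up example, not taken from the question.
-- ALTER TABLE `$ORDERS_PUBLIC` ADD INDEX idx_o_orderkey (O_ORDERKEY);

-- Show the plan of the EXISTS version without actually running the insert,
-- so it can be compared with the plan in the question.
EXPLAIN
INSERT INTO `$LINEITEM_PUBLIC`
SELECT l.*
FROM LINEITEM AS l
WHERE EXISTS ( SELECT 1 FROM `$PART_PUBLIC`     AS p WHERE p.P_PARTKEY  = l.L_PARTKEY )
  AND EXISTS ( SELECT 1 FROM `$SUPPLIER_PUBLIC` AS s WHERE s.S_SUPPKEY  = l.L_SUPPKEY )
  AND EXISTS ( SELECT 1 FROM `$ORDERS_PUBLIC`   AS o WHERE o.O_ORDERKEY = l.L_ORDERKEY );
```

If the two plans come out the same, the optimizer is probably already handling IN (SELECT ...) as a semi-join, and the rewrite alone is unlikely to change the run time.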