I have a query that I run against MySQL:
SELECT DISTINCT tp.parts_group as PartsGroup, tpf.code as FeatureCode, CONVERT(tpf.market_id, char) as MarketID
FROM jpt_product_feature tpf
INNER JOIN jpt_product tp
ON tpf.product_id = tp.id
INNER JOIN jpt_product_model tpm
ON tp.model_id = tpm.id
JOIN ModelImport mi
ON tpm.Code = mi.ModelCode
WHERE NOT EXISTS (
SELECT 1
FROM FeatureSequence fs
WHERE tp.parts_group = fs.PartsGroup
AND tpf.code = fs.FeatureCode
AND (tpf.market_id = fs.MarketID or tpf.market_id is null)
)
ORDER BY PartsGroup, FeatureCode, MarketID
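It runs in 38 seconds on my PC, which is fine considering the large number of rows across multiple tables. However, when the same query runs on a less powerful VM, it runs for about 2 hours and then ends with a FATAL ERROR.
Here are my indexes: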
CREATE INDEX idxFeatureSequencePartsGroup ON FeatureSequence (PartsGroup);
CREATE INDEX idxToyProductPartsGroup ON jpt_product (parts_group);
CREATE INDEX idxToyProductFeature ON jpt_product_feature (code);
CREATE INDEX idxFeatureSequenceFeatureCode ON FeatureSequence (FeatureCode);
CREATE INDEX idxToyProductFeatureMarketID ON jpt_product_feature (market_id);
CREATE INDEX idxFeatureSequenceMarketID ON FeatureSequence (MarketID);
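We are working on beefing up the VM, but in the meantime, what can I do to speed up this query and make it more efficient? I am even open to exotic or obscure approaches if they would speed it up dramatically. Or, if I am missing an index you think I should have, that could be an easy fix.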
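Answer 0 (score: 1)
Correlated subqueries tend to be less efficient than uncorrelated ones, when an uncorrelated form is possible. In this case, I would try the following alternative: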
SELECT DISTINCT tp.parts_group as PartsGroup, tpf.code as FeatureCode, CONVERT(tpf.market_id, char) as MarketID
FROM jpt_product_feature tpf
INNER JOIN jpt_product tp ON tpf.product_id = tp.id
INNER JOIN jpt_product_model tpm ON tp.model_id = tpm.id
INNER JOIN ModelImport mi ON tpm.Code = mi.ModelCode
LEFT JOIN (
SELECT DISTINCT 1 AS matchCheck
, fs.PartsGroup AS fsPartsGroup
, fs.FeatureCode AS fsFeatureCode
, fs.MarketID AS fsMarketID
FROM FeatureSequence fs
) AS fs ON tp.parts_group = fs.fsPartsGroup
AND tpf.code = fs.fsFeatureCode
AND (tpf.market_id = fs.fsMarketID OR tpf.market_id is null)
WHERE fs.matchCheck IS NULL
ORDER BY PartsGroup, FeatureCode, MarketID
;
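Without knowing the details of the data distribution, it is hard to say whether this will be faster (in some cases a correlated subquery is the best choice), but it is the first thing I would try. If FeatureSequence is relatively large compared with the other tables involved, the correlated query may still be better: it comes down to many small hits against a large table versus one big hit.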
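On the index question, here is a rough sketch of what I would also check. It rests on assumptions, since I cannot see the schema. The indexes listed in the question are all single-column, so the lookup into FeatureSequence may not be able to resolve all three predicates from one index, and none of them cover the join columns. The index names below are hypothetical, and the join-column indexes are only useful if those columns are not already indexed (for example as primary keys or InnoDB foreign keys).

```sql
-- Assumption: a composite index lets the NOT EXISTS probe match
-- PartsGroup, FeatureCode and MarketID from a single index.
-- The index name is hypothetical.
CREATE INDEX idxFeatureSequenceLookup
    ON FeatureSequence (PartsGroup, FeatureCode, MarketID);

-- Assumption: these join columns are not already indexed.
-- Skip any that are covered by a primary key or foreign key index.
CREATE INDEX idxToyProductFeatureProductID ON jpt_product_feature (product_id);
CREATE INDEX idxToyProductModelID ON jpt_product (model_id);
CREATE INDEX idxToyProductModelCode ON jpt_product_model (Code);
CREATE INDEX idxModelImportModelCode ON ModelImport (ModelCode);

-- Compare the plan on the PC and on the VM before and after adding indexes.
EXPLAIN
SELECT DISTINCT tp.parts_group AS PartsGroup,
       tpf.code AS FeatureCode,
       CONVERT(tpf.market_id, char) AS MarketID
FROM jpt_product_feature tpf
INNER JOIN jpt_product tp ON tpf.product_id = tp.id
INNER JOIN jpt_product_model tpm ON tp.model_id = tpm.id
INNER JOIN ModelImport mi ON tpm.Code = mi.ModelCode
WHERE NOT EXISTS (
    SELECT 1
    FROM FeatureSequence fs
    WHERE tp.parts_group = fs.PartsGroup
      AND tpf.code = fs.FeatureCode
      AND (tpf.market_id = fs.MarketID OR tpf.market_id IS NULL)
)
ORDER BY PartsGroup, FeatureCode, MarketID;
```

If the EXPLAIN output differs between the two machines, or shows full table scans on the larger tables, that would point to where the time is going. Whether any of these indexes actually helps depends on the data, so treat them as things to test rather than a guaranteed fix.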