Optimizing an SQL query with many inner joins on the same table

Time: 2017-07-13 08:45:27

Tags: mysql join query-performance entity-attribute-value

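I have a performance problem:

The shop has an article filter with the categories "color", "size", "gender" and "feature". All of these details are stored in the article_criterias table, as shown below.

The layout of article_criterias is as follows; the table has about 36,000 rows: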

article_id | group    | option | option_val
       100 | "size"   | "35"   |     35.00
       100 | "size"   | "36"   |     36.00
       100 | "size"   | "36½"  |     36.50
       100 | "color"  | "40"   |     40.00
       100 | "color"  | "50"   |     50.00
       100 | "gender" | "1"    |      1.00
       101 | "size"   | "40"   |     40.00
       ...

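We build the SQL query dynamically from the currently selected criteria. The query is fine with 2-3 criteria, but it becomes very slow once more than 5 options are selected (each additional INNER JOIN roughly doubles the execution time).

How can we make this SQL faster, or perhaps even replace the inner joins with a more efficient concept?

This is the query (the logic is correct, only the performance is bad):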

-- This SQL is generated when the user selected the following criteria
-- gender: 1
-- color: 80 + 30
-- size 36 + 37 + 38 + 39 + 42 + 46
SELECT
    criteria.group AS `key`,
    criteria.option AS `value`
FROM articles
    INNER JOIN article_criterias AS criteria ON articles.id = criteria.article_id
    INNER JOIN article_criterias AS criteria_gender 
        ON criteria_gender.article_id = articles.id AND criteria_gender.group = "gender"
    INNER JOIN article_criterias AS criteria_color1 
        ON criteria_color1.article_id = articles.id AND criteria_color1.group = "color"
    INNER JOIN article_criterias AS criteria_size2 
        ON criteria_size2.article_id = articles.id AND criteria_size2.group = "size"
    INNER JOIN article_criterias AS criteria_size3 
        ON criteria_size3.article_id = articles.id AND criteria_size3.group = "size"
    INNER JOIN article_criterias AS criteria_size4 
        ON criteria_size4.article_id = articles.id AND criteria_size4.group = "size"
    INNER JOIN article_criterias AS criteria_size5 
        ON criteria_size5.article_id = articles.id AND criteria_size5.group = "size"
    INNER JOIN article_criterias AS criteria_size6 
        ON criteria_size6.article_id = articles.id AND criteria_size6.group = "size"
    INNER JOIN article_criterias AS criteria_size7 
        ON criteria_size7.article_id = articles.id AND criteria_size7.group = "size"
WHERE
    (criteria_gender.option IN ("1"))
    AND (criteria_color1.option IN ("80", "30"))
    AND (criteria_size2.option_val BETWEEN 35.500000 AND 36.500000)
    AND (criteria_size3.option_val BETWEEN 36.500000 AND 37.500000)
    AND (criteria_size4.option_val BETWEEN 37.500000 AND 38.500000)
    AND (criteria_size5.option_val BETWEEN 38.500000 AND 39.500000)
    AND (criteria_size6.option_val BETWEEN 41.500000 AND 42.500000)
    AND (criteria_size7.option_val BETWEEN 45.500000 AND 46.500000)

3 Answers:

Answer 0: (score: 2)

Key/value tables are indeed a nuisance. However, in order to find matches for certain criteria, aggregate your data:

select a.*, ac.`group` as `key`, ac.`option` as `value`
from articles a
join article_criterias ac on ac.article_id = a.id
where a.id in
(
  select article_id
  from article_criterias
  group by article_id
  having sum(`group` = 'gender' and `option` = '1') > 0
     and sum(`group` = 'color' and `option` in ('30', '80')) > 0
     and sum(`group` = 'size' and option_val between 35.5 and 36.5) > 0
     and sum(`group` = 'size' and option_val between 36.5 and 37.5) > 0
     and sum(`group` = 'size' and option_val between 37.5 and 38.5) > 0
     and sum(`group` = 'size' and option_val between 38.5 and 39.5) > 0
     and sum(`group` = 'size' and option_val between 41.5 and 42.5) > 0
     and sum(`group` = 'size' and option_val between 45.5 and 46.5) > 0
)
order by a.id, ac.`group`, ac.`option`;

This gives you all articles that are available for gender 1, color 30 and/or 80, and all of the listed size ranges, together with all of their options. (The size ranges are a bit odd, though; a size of 36.5, for example, falls into two ranges.) You get the idea: group by article_id and use HAVING so that you only get the article_ids that match the criteria.

As for the index you would want: [...]
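The group-by-article_id and HAVING idea can be tried out on a few rows. The following is only a minimal sketch for MySQL: the scratch table demo_criterias and its sample rows are assumptions made up for illustration, mirroring the column layout of article_criterias from the question. It relies on MySQL evaluating a comparison to 1 or 0, so SUM(...) > 0 means "at least one row of this article satisfies the condition".

```sql
-- Hypothetical scratch table mirroring the layout of article_criterias.
CREATE TABLE demo_criterias (
    article_id INT           NOT NULL,
    `group`    VARCHAR(20)   NOT NULL,
    `option`   VARCHAR(20)   NOT NULL,
    option_val DECIMAL(12,2) NOT NULL
);

-- Made-up sample rows, not data from the question.
INSERT INTO demo_criterias (article_id, `group`, `option`, option_val) VALUES
    (100, 'gender', '1',   1.00),
    (100, 'color',  '30', 30.00),
    (100, 'size',   '36', 36.00),
    (100, 'size',   '37', 37.00),
    (101, 'gender', '1',   1.00),
    (101, 'color',  '30', 30.00),
    (101, 'size',   '36', 36.00),   -- has size 36 but no size 37
    (102, 'gender', '2',   2.00),   -- wrong gender
    (102, 'color',  '30', 30.00),
    (102, 'size',   '36', 36.00),
    (102, 'size',   '37', 37.00);

-- One pass over the table: every HAVING term must be satisfied
-- by at least one row of the article.
SELECT article_id
FROM demo_criterias
GROUP BY article_id
HAVING SUM(`group` = 'gender' AND `option` = '1') > 0
   AND SUM(`group` = 'color'  AND `option` IN ('30', '80')) > 0
   AND SUM(`group` = 'size'   AND option_val BETWEEN 35.5 AND 36.5) > 0
   AND SUM(`group` = 'size'   AND option_val BETWEEN 36.5 AND 37.5) > 0;
-- Returns only article_id 100.
```

Each additional criterion adds one more term to the HAVING clause rather than one more self-join, so the table is still read only once.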

Answer 1: (score: 0)

Adding indexes, as suggested by @affan-pathan, did indeed solve the problem:

CREATE INDEX text_option 
ON `article_criterias` (`article_id`, `group`, `option`);

CREATE INDEX numeric_option 
ON `article_criterias` (`article_id`, `group`, `option_val`);

These two indexes cut the execution time of the query above from almost 1 minute to less than 50 milliseconds!!
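One way to check whether MySQL actually picks up such indexes is to look at the execution plan. The statements below are only a sketch: they assume the article_criterias table from the question already exists, and the two-join query is a shortened stand-in for the generated query, not the original one.

```sql
-- Shows the plan MySQL would use, without running the query.
-- In the output, the `key` column names the index chosen for each table
-- and `rows` is the optimizer's estimate of rows examined.
EXPLAIN
SELECT criteria_gender.article_id
FROM article_criterias AS criteria_gender
    INNER JOIN article_criterias AS criteria_size
        ON criteria_size.article_id = criteria_gender.article_id
       AND criteria_size.`group` = 'size'
WHERE criteria_gender.`group` = 'gender'
  AND criteria_gender.`option` IN ('1')
  AND criteria_size.option_val BETWEEN 35.5 AND 36.5;

-- Lists the indexes currently defined on the table.
SHOW INDEX FROM article_criterias;
```

Running the EXPLAIN before and after creating the indexes makes it easy to compare the chosen keys and the row estimates.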

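Answer 2: (score: 0)

I understand that the indexes you created solved your problem, but just to play with a pseudo-alternative (avoiding the multiple INNER JOINs), could you try something like this? I tested it with only three conditions; your conditions should be inserted into the inner query. To select only the records that satisfy all conditions, you have to change the final WHERE condition (WHERE max = 3) to the number of conditions you wrote above; so if you use 5 conditions, you should write WHERE max = 5. (For convenience, I changed the names of the group and option columns.) This is just an idea, so please run some tests, check the performance, and let me know...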

CREATE TABLE CRITERIA (ARTICLE_ID INT, GROU VARCHAR(10), OPT VARCHAR(20), OPTION_VAL NUMERIC(12,2));
CREATE TABLE ARTICLES (ID INT);
INSERT INTO CRITERIA VALUES (100,'size','35',35);
INSERT INTO CRITERIA VALUES (100,'size','36',36);
INSERT INTO CRITERIA VALUES (100,'color','40',40);
INSERT INTO CRITERIA VALUES (100,'gender','1',1);
INSERT INTO CRITERIA VALUES (200,'size','36.2',36.2);
INSERT INTO CRITERIA VALUES (300,'size','36.2',36.2);
INSERT INTO ARTICLES VALUES (100);
INSERT INTO ARTICLES VALUES (200);
INSERT INTO ARTICLES VALUES (300);

-------------------------------------------------------

SELECT D.article_id, D.GROU, D.OPT
FROM (SELECT C.*
     , @o:=CASE WHEN @h=ARTICLE_ID THEN @o ELSE cumul END max
     , @h:=ARTICLE_ID AS a_id
     FROM (SELECT article_id,
             B.GROU, B.OPT,             
             @r:= CASE WHEN @g = B.ARTICLE_ID THEN @r+1 ELSE 1 END cumul,                        
             @g:= B.ARTICLE_ID g                
             FROM CRITERIA B
             CROSS JOIN (SELECT @g:=0, @r:=0) T1
             WHERE (B.GROU='gender' AND B.OPT IN ('1'))
                    OR  (B.GROU='color'  AND B.OPT IN ('40', '30'))
                    OR  (B.GROU='size'   AND B.OPT BETWEEN 35.500000 AND 36.500000)
             ORDER BY article_id
    ) C
CROSS JOIN (SELECT @o:=0, @h:=0) T2
ORDER BY ARTICLE_ID, CUMUL DESC) D
WHERE max=3
;

Output:

article_id  GROU    OPT
100 gender  1
100 color   40
100 size    36
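The same "count the satisfied conditions" idea can also be written without user variables. The query below is a sketch of a swapped-in technique (GROUP BY with COUNT(DISTINCT ...)), not the method used in this answer; it assumes the CRITERIA test table and rows created above, and it filters on the numeric OPTION_VAL column for the size range.

```sql
-- Keep only rows that satisfy one of the three conditions, then require
-- that all three groups (gender, color, size) are represented per article.
SELECT ARTICLE_ID
FROM CRITERIA
WHERE (GROU = 'gender' AND OPT IN ('1'))
   OR (GROU = 'color'  AND OPT IN ('40', '30'))
   OR (GROU = 'size'   AND OPTION_VAL BETWEEN 35.5 AND 36.5)
GROUP BY ARTICLE_ID
HAVING COUNT(DISTINCT GROU) = 3;
-- With the test rows above this returns only ARTICLE_ID 100.
```

Counting distinct groups only works when there is one condition per group. With several separate size ranges, as in the question, each range would need its own term, for example a SUM(...) > 0 test per range in the HAVING clause.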