Optimizing an INNER JOIN with optional fields

Asked: 2018-08-22 14:28:24

Tags: sql sql-server query-optimization

I am trying to optimize a query like this:

SELECT master.FIELD_X, rules.FIELD_Y 
    FROM T_MASTER master
    INNER JOIN T_RULES rules ON 
        master.KEY = rules.KEY AND
        (master.FIELD1 = rules.FIELD1 OR rules.FIELD1 IS NULL) AND
        (master.FIELD2 = rules.FIELD2 OR rules.FIELD2 IS NULL) AND
        ...
        (master.FIELDN = rules.FIELDN OR rules.FIELDN IS NULL)
    WHERE master.KEY = <value>

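Basically:

  • T_MASTER contains about 500,000 rows per KEY value.
  • T_RULES (about 2,000 rows) contains the rules used to select records from T_MASTER.
  • Each rule can be specified by several optional fields (each one either holds a value or is NULL, and NULL means "match anything").

Currently, with the volumes above (T_MASTER ~500k / T_RULES ~2k rows), this query takes about 10 minutes to run. That is acceptable for now (it runs as a batch job), but I would like to know whether it can be improved further, or whether there is some poor design in the query or database structure.

Any ideas?

Some example rules: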

Rule_1:  If Field2 = 'foo' and Field7 = 'bar' => Field_Y = 'rule_1_value'
Rule_2:  If Field3 = 'value' => Field_Y = 'rule_2_value'
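
To make the NULL-as-wildcard semantics of these rules concrete, here is a minimal sketch that runs the question's join on a handful of rows. It uses Python's standard-library sqlite3 module purely as a stand-in for SQL Server. The sample rows, the reduced column set (FIELD2, FIELD3, FIELD7) and the bracket quoting of KEY are assumptions made for illustration, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down versions of the two tables from the question.
    CREATE TABLE T_MASTER ([KEY] INTEGER, FIELD_X TEXT,
                           FIELD2 TEXT, FIELD3 TEXT, FIELD7 TEXT);
    CREATE TABLE T_RULES  ([KEY] INTEGER, FIELD_Y TEXT,
                           FIELD2 TEXT, FIELD3 TEXT, FIELD7 TEXT);

    INSERT INTO T_MASTER VALUES
        (1, 'm1', 'foo', 'x',     'bar'),   -- satisfies Rule_1 only
        (1, 'm2', 'foo', 'value', 'baz'),   -- satisfies Rule_2 only
        (1, 'm3', 'foo', 'value', 'bar'),   -- satisfies both rules
        (1, 'm4', 'qux', 'x',     'bar');   -- satisfies neither

    -- A NULL in a rule column means "this field is not part of the rule".
    INSERT INTO T_RULES VALUES
        (1, 'rule_1_value', 'foo', NULL,    'bar'),
        (1, 'rule_2_value', NULL,  'value', NULL);
""")

rows = conn.execute("""
    SELECT m.FIELD_X, r.FIELD_Y
    FROM T_MASTER m
    INNER JOIN T_RULES r ON
        m.[KEY] = r.[KEY] AND
        (m.FIELD2 = r.FIELD2 OR r.FIELD2 IS NULL) AND
        (m.FIELD3 = r.FIELD3 OR r.FIELD3 IS NULL) AND
        (m.FIELD7 = r.FIELD7 OR r.FIELD7 IS NULL)
    WHERE m.[KEY] = ?
    ORDER BY m.FIELD_X, r.FIELD_Y
""", (1,)).fetchall()

for row in rows:
    print(row)
```

Note that a master row can satisfy several rules at once (m3 here), so the join may return more than one row per master record.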

1 answer:

Answer 0: (score: 1)

One brute-force idea is a UNION ALL that covers every combination:

SELECT master.FIELD_X, rules.FIELD_Y 
FROM T_MASTER master INNER JOIN
     T_RULES rules
     ON master.KEY = rules.KEY AND
        master.FIELD1 = rules.FIELD1 AND
        master.FIELD2 = rules.FIELD2
WHERE master.KEY = <value>
UNION ALL
SELECT master.FIELD_X, rules.FIELD_Y 
FROM T_MASTER master INNER JOIN
     T_RULES rules
     ON master.KEY = rules.KEY AND
        rules.FIELD1 IS NULL AND
        master.FIELD2 = rules.FIELD2
WHERE master.KEY = <value>
UNION ALL
SELECT master.FIELD_X, rules.FIELD_Y 
FROM T_MASTER master INNER JOIN
     T_RULES rules
     ON master.KEY = rules.KEY AND
        master.FIELD1 = rules.FIELD1 AND
        rules.FIELD2 IS NULL
WHERE master.KEY = <value>
UNION ALL
SELECT master.FIELD_X, rules.FIELD_Y 
FROM T_MASTER master INNER JOIN
     T_RULES rules
     ON master.KEY = rules.KEY AND
        rules.FIELD1 IS NULL AND
        rules.FIELD2 IS NULL
WHERE master.KEY = <value>;

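This is not a really satisfying solution. However, each subquery should be able to use an appropriate index. There are 2^n subqueries, where n is the number of fields being compared, so this becomes rather cumbersome from 4 or 5 fields onward.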
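
As a rough illustration of how quickly the 2^n growth bites, the sketch below generates the UNION ALL branches mechanically instead of writing them by hand. The helper name union_all_branches is hypothetical, and the generated text keeps the <value> placeholder from the queries above, so it is a template rather than something to execute as-is.

```python
from itertools import product

def union_all_branches(fields):
    """Return one SELECT per equals / IS NULL combination of the given fields."""
    branches = []
    # Each field is either compared for equality (False) or treated as a
    # wildcard because the rule column is NULL (True): 2^n combinations.
    for choice in product((False, True), repeat=len(fields)):
        conds = ["master.KEY = rules.KEY"]
        for field, is_wildcard in zip(fields, choice):
            if is_wildcard:
                conds.append(f"rules.{field} IS NULL")
            else:
                conds.append(f"master.{field} = rules.{field}")
        branches.append(
            "SELECT master.FIELD_X, rules.FIELD_Y\n"
            "FROM T_MASTER master INNER JOIN\n"
            "     T_RULES rules\n"
            "     ON " + " AND\n        ".join(conds) + "\n"
            "WHERE master.KEY = <value>"
        )
    return branches

# Two fields give the four branches written out by hand above.
branches = union_all_branches(["FIELD1", "FIELD2"])
sql = "\nUNION ALL\n".join(branches)
print(sql)

# Number of branches for 2, 5 and 10 optional fields.
counts = {
    n: len(union_all_branches([f"FIELD{i}" for i in range(1, n + 1)]))
    for n in (2, 5, 10)
}
print(counts)
```

Generating the statement takes the typing out of it, but the optimizer still has to plan every branch, which is why this approach stops being practical beyond a few fields.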

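EDIT:

This is hopeless. Well, not really. It is hopeless because of your data structure. You want a RulesClauses table with one row per exact match. Your query would then look like this: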

-- Note: grouping by FIELD_X assumes it identifies a master row;
-- if it does not, group by the table's key column as well.
SELECT m.FIELD_X, r.rule_name
FROM T_MASTER m CROSS APPLY
     (VALUES ('FIELD1', m.FIELD1),
             ('FIELD2', m.FIELD2),
             . . .
             ('FIELDN', m.FIELDN)
     ) v(field, val) INNER JOIN
     T_RULES r
     ON m.KEY = r.KEY INNER JOIN
     T_RULESCLAUSES rc
     ON rc.rules_id = r.rules_id AND
        rc.field = v.field AND
        rc.val = v.val
WHERE m.KEY = <value>
GROUP BY m.FIELD_X, r.rule_name, r.clause_count
HAVING COUNT(*) = r.clause_count;

Now each JOIN between a clause and a field is an equi-join, which can make use of an index on T_RULESCLAUSES.
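
The clause-table idea can be tried out on a small data set. The following is a minimal sketch under stated assumptions: it uses Python's sqlite3 module instead of SQL Server, and since SQLite has no CROSS APPLY, a UNION ALL unpivot stands in for the VALUES construct. The ID column, the index name IX_RULESCLAUSES and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T_MASTER (ID INTEGER PRIMARY KEY, [KEY] INTEGER,
                           FIELD_X TEXT, FIELD1 TEXT, FIELD2 TEXT);
    CREATE TABLE T_RULES (rules_id INTEGER PRIMARY KEY, [KEY] INTEGER,
                          rule_name TEXT, clause_count INTEGER);
    -- One row per exact-match condition of a rule.
    CREATE TABLE T_RULESCLAUSES (rules_id INTEGER, field TEXT, val TEXT);
    -- Hypothetical index supporting the equi-join on (field, val).
    CREATE INDEX IX_RULESCLAUSES ON T_RULESCLAUSES (field, val, rules_id);

    INSERT INTO T_MASTER VALUES
        (1, 1, 'x1', 'a', 'b'),
        (2, 1, 'x2', 'a', 'c'),
        (3, 1, 'x3', 'z', 'b');

    INSERT INTO T_RULES VALUES
        (10, 1, 'r_a_b',   2),   -- FIELD1 = 'a' AND FIELD2 = 'b'
        (11, 1, 'r_any_b', 1);   -- FIELD2 = 'b', FIELD1 is a wildcard

    INSERT INTO T_RULESCLAUSES VALUES
        (10, 'FIELD1', 'a'),
        (10, 'FIELD2', 'b'),
        (11, 'FIELD2', 'b');
""")

rows = conn.execute("""
    WITH v AS (   -- unpivot: one (field, val) row per master field
        SELECT ID, [KEY], FIELD_X, 'FIELD1' AS field, FIELD1 AS val
        FROM T_MASTER
        UNION ALL
        SELECT ID, [KEY], FIELD_X, 'FIELD2', FIELD2
        FROM T_MASTER
    )
    SELECT v.FIELD_X, r.rule_name
    FROM v
    INNER JOIN T_RULES r
        ON r.[KEY] = v.[KEY]
    INNER JOIN T_RULESCLAUSES rc
        ON rc.rules_id = r.rules_id AND
           rc.field = v.field AND
           rc.val = v.val
    WHERE v.[KEY] = ?
    -- Group per master row (ID) so rows sharing a FIELD_X are not merged.
    GROUP BY v.ID, v.FIELD_X, r.rules_id, r.rule_name, r.clause_count
    -- A rule matches only if every one of its clauses matched.
    HAVING COUNT(*) = r.clause_count
    ORDER BY v.FIELD_X, r.rule_name
""", (1,)).fetchall()

for row in rows:
    print(row)
```

One caveat with this layout: a rule whose fields are all NULL has no clause rows, so the inner join can never return it. Such catch-all rules, if they exist, would need separate handling.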