I'm trying to optimize a query like this:
SELECT master.FIELD_X, rules.FIELD_Y
FROM T_MASTER master
INNER JOIN T_RULES rules ON
master.KEY = rules.KEY AND
(master.FIELD1 = rules.FIELD1 OR rules.FIELD1 IS NULL) AND
(master.FIELD2 = rules.FIELD2 OR rules.FIELD2 IS NULL) AND
...
(master.FIELDN = rules.FIELDN OR rules.FIELDN IS NULL)
WHERE master.KEY = <value>
Basically, a NULL in a T_RULES field acts as a wildcard: the rule does not constrain that field.
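Currently, with these volumes (T_MASTER ~500k rows / T_RULES ~2k rows), this query takes about 10 minutes to run. That is acceptable (it runs as a batch job), but I suspect it can be improved further, or that someone can point out some bad design in the query or the database structure.
Any ideas?
Some example rules: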
Rule_1: If Field2 = 'foo' and Field7 = 'bar' => Field_Y = 'rule_1_value'
Rule_2: If Field3 = 'value' => Field_Y = 'rule_2_value'
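For illustration, the two rules above could be stored as T_RULES rows roughly as follows. This is only a sketch: the column types, the "KEY" value 1, and the choice of exactly seven FIELD columns are assumptions, and "KEY" is double-quoted because it is a reserved word in some SQL dialects.

```sql
-- Hypothetical layout of T_RULES; types and the number of fields are assumptions.
CREATE TABLE T_RULES (
    "KEY"   INT NOT NULL,
    FIELD1  VARCHAR(50),   -- NULL means "this rule does not constrain the field"
    FIELD2  VARCHAR(50),
    FIELD3  VARCHAR(50),
    FIELD4  VARCHAR(50),
    FIELD5  VARCHAR(50),
    FIELD6  VARCHAR(50),
    FIELD7  VARCHAR(50),
    FIELD_Y VARCHAR(50) NOT NULL
);

-- Rule_1: Field2 = 'foo' and Field7 = 'bar'; every other field is a wildcard.
INSERT INTO T_RULES ("KEY", FIELD1, FIELD2, FIELD3, FIELD4, FIELD5, FIELD6, FIELD7, FIELD_Y)
VALUES (1, NULL, 'foo', NULL, NULL, NULL, NULL, 'bar', 'rule_1_value');

-- Rule_2: Field3 = 'value'; every other field is a wildcard.
INSERT INTO T_RULES ("KEY", FIELD1, FIELD2, FIELD3, FIELD4, FIELD5, FIELD6, FIELD7, FIELD_Y)
VALUES (1, NULL, NULL, 'value', NULL, NULL, NULL, NULL, 'rule_2_value');
```

A master row with the same "KEY" and FIELD2 = 'foo', FIELD7 = 'bar' satisfies the join condition for the first row regardless of its other fields.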
Answer 0 (score: 1)
One brute-force idea is UNION ALL, with one subquery for every combination (shown here for two fields):
SELECT master.FIELD_X, rules.FIELD_Y
FROM T_MASTER master INNER JOIN
T_RULES rules
ON master.KEY = rules.KEY AND
master.FIELD1 = rules.FIELD1 AND
master.FIELD2 = rules.FIELD2
WHERE master.KEY = <value>
UNION ALL
SELECT master.FIELD_X, rules.FIELD_Y
FROM T_MASTER master INNER JOIN
T_RULES rules
ON master.KEY = rules.KEY AND
rules.FIELD1 IS NULL AND
master.FIELD2 = rules.FIELD2
WHERE master.KEY = <value>
UNION ALL
SELECT master.FIELD_X, rules.FIELD_Y
FROM T_MASTER master INNER JOIN
T_RULES rules
ON master.KEY = rules.KEY AND
master.FIELD1 = rules.FIELD1 AND
rules.FIELD2 IS NULL
WHERE master.KEY = <value>
UNION ALL
SELECT master.FIELD_X, rules.FIELD_Y
FROM T_MASTER master INNER JOIN
T_RULES rules
ON master.KEY = rules.KEY AND
rules.FIELD1 IS NULL AND
rules.FIELD2 IS NULL
WHERE master.KEY = <value>;
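This is not a really satisfying solution. However, each subquery should be able to make use of an appropriate index. There are 2^n subqueries, where n is the number of fields being compared, so this becomes rather cumbersome once you reach 4 or 5 fields (16 or 32 subqueries).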
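As a rough illustration of what "an appropriate index" could mean for the two-field example, composite indexes like the ones below might let each branch seek on its equality and IS NULL predicates. This is a sketch under assumptions: the column types and index names are hypothetical, "KEY" is double-quoted because it is a reserved word in some dialects, and whether the optimizer actually uses these indexes depends on the database, so check the execution plan.

```sql
-- Hypothetical two-field versions of the tables; column types are assumptions.
CREATE TABLE T_MASTER (
    "KEY"   INT NOT NULL,
    FIELD1  VARCHAR(50),
    FIELD2  VARCHAR(50),
    FIELD_X VARCHAR(50)
);

CREATE TABLE T_RULES (
    "KEY"   INT NOT NULL,
    FIELD1  VARCHAR(50),   -- NULL means the rule does not constrain FIELD1
    FIELD2  VARCHAR(50),   -- NULL means the rule does not constrain FIELD2
    FIELD_Y VARCHAR(50)
);

-- Index names are made up for this sketch. The leading "KEY" column serves the
-- WHERE filter; the FIELD columns serve the per-branch join predicates.
CREATE INDEX IX_RULES_KEY_F1_F2  ON T_RULES  ("KEY", FIELD1, FIELD2);
CREATE INDEX IX_MASTER_KEY_F1_F2 ON T_MASTER ("KEY", FIELD1, FIELD2);
```

With n fields the same idea needs the index to cover all n columns, and the number of UNION ALL branches still doubles with every added field.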
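EDIT:
This is hopeless. Well, not really. It is hopeless because of your data structure. You need a RulesClauses table (T_RULESCLAUSES below), with one row for each exact match that a rule requires. Your query would then look like this: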
-- Caveat: COUNT(*) is computed per (FIELD_X, rule) group. If FIELD_X is not unique
-- per T_MASTER row, also group by a column that identifies the row.
SELECT m.FIELD_X, r.rule_name
FROM T_MASTER m CROSS APPLY
(VALUES ('FIELD1', m.FIELD1),
('FIELD2', m.FIELD2),
. . .
('FIELDN', m.FIELDN)
) v(field, val) INNER JOIN
T_RULES r
ON m.KEY = r.KEY INNER JOIN
T_RULESCLAUSES rc
ON rc.rules_id = r.rules_id AND
rc.field = v.field AND
rc.val = v.val
WHERE m.KEY = <value>
GROUP BY m.FIELD_X, r.rule_name, r.clause_count
HAVING COUNT(*) = r.clause_count;
Now the JOIN between each clause and each field is an equijoin, which can make use of an index on T_RULESCLAUSES.
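A minimal sketch of what the restructured tables might look like, using the two example rules from the question. The column types, the index name, the "KEY" value 1, and storing 'Rule_1' / 'Rule_2' in rule_name are all assumptions made for illustration; only the column names rules_id, rule_name, clause_count, field, and val come from the query above.

```sql
-- Hypothetical schema; "KEY" is double-quoted because it is reserved in some dialects.
CREATE TABLE T_RULES (
    rules_id     INT PRIMARY KEY,
    "KEY"        INT NOT NULL,
    rule_name    VARCHAR(50) NOT NULL,
    clause_count INT NOT NULL            -- number of clauses the rule requires
);

CREATE TABLE T_RULESCLAUSES (
    rules_id INT NOT NULL REFERENCES T_RULES (rules_id),
    field    VARCHAR(30) NOT NULL,       -- e.g. 'FIELD2'
    val      VARCHAR(50) NOT NULL,       -- the exact value that field must have
    PRIMARY KEY (rules_id, field)
);

-- Supports the equijoin on (field, val); the index name is made up.
CREATE INDEX IX_RULESCLAUSES_FIELD_VAL ON T_RULESCLAUSES (field, val, rules_id);

-- Rule_1: Field2 = 'foo' and Field7 = 'bar'  -> two clauses.
INSERT INTO T_RULES (rules_id, "KEY", rule_name, clause_count) VALUES (1, 1, 'Rule_1', 2);
INSERT INTO T_RULESCLAUSES (rules_id, field, val) VALUES (1, 'FIELD2', 'foo');
INSERT INTO T_RULESCLAUSES (rules_id, field, val) VALUES (1, 'FIELD7', 'bar');

-- Rule_2: Field3 = 'value'  -> one clause.
INSERT INTO T_RULES (rules_id, "KEY", rule_name, clause_count) VALUES (2, 1, 'Rule_2', 1);
INSERT INTO T_RULESCLAUSES (rules_id, field, val) VALUES (2, 'FIELD3', 'value');
```

Wildcards are no longer stored as NULLs: a field a rule does not care about simply has no row in T_RULESCLAUSES. The HAVING COUNT(*) = r.clause_count test then checks that every clause of the rule matched.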