I have a table with 3 columns and millions of rows. All columns are integers (hashes):
id, attribute, attrib_val
An id can have many rows, each holding a combination of an attribute name and a value.
The table has two keys:
id, attribute, attrib_val
attribute, attrib_val, id
I need to dynamically build queries that fetch ids according to rules, for example:
IDs for which ALL of the following parts match:
attribute <x> contains value <y> or <t>
attribute <l> does not contain value <f> or <c>
...
IDs for which ANY of the following parts match:
attribute <x> contains value <y> or <t>
attribute <l> does not contain value <f> or <c>
...
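To pin down what these rules mean, here is a minimal in-memory sketch that can serve as a reference when checking generated queries against a small sample. It rests on assumptions that are not stated in the post: a rule is written as a hypothetical `(attribute, values, negate)` tuple, "contains value y or t" is read as "the id has at least one row with that attribute and one of those values", and "does not contain" is read as "the id has no such row". The name `matching_ids` and the sample data are made up for illustration.

```python
from collections import defaultdict

def matching_ids(rows, rules, mode="all"):
    """Return the set of ids satisfying the rules.

    rows:  iterable of (id, attribute, attrib_val) tuples
    rules: list of (attribute, values, negate) tuples (hypothetical format)
    mode:  "all" -> every rule must hold, "any" -> at least one rule must hold
    """
    # Collect the (attribute, attrib_val) pairs seen for each id.
    pairs_by_id = defaultdict(set)
    for id_, attribute, attrib_val in rows:
        pairs_by_id[id_].add((attribute, attrib_val))

    combine = all if mode == "all" else any
    result = set()
    for id_, pairs in pairs_by_id.items():
        outcomes = []
        for attribute, values, negate in rules:
            has_value = any((attribute, v) in pairs for v in values)
            # A negated rule holds exactly when no listed value is present.
            outcomes.append(has_value != negate)
        if combine(outcomes):
            result.add(id_)
    return result

# Made-up sample data: (id, attribute, attrib_val)
rows = [(1, 10, 100), (1, 20, 200), (2, 10, 101),
        (3, 10, 100), (3, 20, 201), (4, 30, 300)]
# attribute 10 contains 100 or 101; attribute 20 does not contain 200
rules = [(10, (100, 101), False), (20, (200,), True)]

print(sorted(matching_ids(rows, rules, "all")))  # [2, 3]
print(sorted(matching_ids(rows, rules, "any")))  # [1, 2, 3, 4]
```

Note that in "any" mode id 4 matches only through the "does not contain" rule, although it has no row for either attribute. A query that first filters rows by the listed pairs would miss such ids.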
Problem: this is the query I came up with (I can change IN to NOT IN for the "does not contain" parts, and change AND to OR to switch from ALL to ANY):
SELECT DISTINCT id FROM attributes
WHERE id IN (
    SELECT id FROM attributes
    WHERE (attribute = 12944489 AND attrib_val = 907348202)
      AND id IN (
        SELECT id FROM attributes
        WHERE (
            (attribute = 577513892 AND attrib_val = 519655334)
            OR (attribute = 577513892 AND attrib_val = 1266247963)
        )
    )
)
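The problem is that this query is not efficient. For some reason MySQL scans all the table rows, even though each subquery, when run on its own, returns only a few hundred rows.
How can I optimize this query, or is there an alternative query that handles these flexible requirements efficiently? Notes: 1. MySQL 5.5.31. 2. I simplified the query to make it easier to explain. In practice there is an additional global sid column, and every query includes sid = XXX in each section.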
Answer 0 (score: 1)
I would suggest using GROUP BY and HAVING:
SELECT id
FROM attributes
WHERE (attribute, attrib_val) IN ( (12944489, 907348202), (577513892, 519655334), (577513892, 1266247963) )
GROUP BY id
HAVING SUM( (attribute, attrib_val) IN ( (12944489, 907348202) ) ) > 0 AND  -- must contain this pair
       SUM( (attribute, attrib_val) IN ( (577513892, 519655334), (577513892, 1266247963) ) ) = 0;  -- must contain neither of these pairs
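Since the question asks for queries built dynamically, the GROUP BY / HAVING pattern could be generated roughly as follows. This is a sketch under stated assumptions, not a drop-in solution. `build_query` and the `(attribute, values, negate)` rule format are hypothetical. Python's `sqlite3` stands in for MySQL so the example can run anywhere, and it uses `?` placeholders where MySQL drivers typically expect `%s`. The row-constructor `IN` is written as `attribute = ? AND attrib_val IN (...)`, which is equivalent when all values in a rule share one attribute. The sid filter from the question is left out.

```python
import sqlite3

def build_query(rules, mode="all"):
    """Build (sql, params) from rules of the form (attribute, values, negate)."""
    if not rules:
        raise ValueError("at least one rule is required")
    pair_sql, pair_params, having = [], [], []
    for attribute, values, negate in rules:
        marks = ", ".join("?" for _ in values)
        pair_sql.append(f"(attribute = ? AND attrib_val IN ({marks}))")
        pair_params.extend([attribute, *values])
        # SUM over a boolean counts the matching rows within each id group.
        having.append(f"SUM{pair_sql[-1]} {'= 0' if negate else '> 0'}")

    has_negative = any(negate for _, _, negate in rules)
    has_positive = any(not negate for _, _, negate in rules)
    # Pre-filtering rows is only safe when an id with none of the listed
    # pairs could never match anyway.
    prefilter = not has_negative or (mode == "all" and has_positive)

    sql, params = "SELECT id FROM attributes", []
    if prefilter:
        sql += " WHERE " + " OR ".join(pair_sql)
        params += pair_params
    joiner = " AND " if mode == "all" else " OR "
    sql += " GROUP BY id HAVING " + joiner.join(having)
    params += pair_params
    return sql, params

# Made-up sample data to exercise the builder.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attributes (id INTEGER, attribute INTEGER, attrib_val INTEGER)"
)
conn.executemany(
    "INSERT INTO attributes VALUES (?, ?, ?)",
    [(1, 10, 100), (1, 20, 200), (2, 10, 101),
     (3, 10, 100), (3, 20, 201), (4, 30, 300)],
)
# attribute 10 contains 100 or 101; attribute 20 does not contain 200
rules = [(10, (100, 101), False), (20, (200,), True)]

def run(rules, mode):
    sql, params = build_query(rules, mode)
    return sorted(row[0] for row in conn.execute(sql, params))

print(run(rules, "all"))  # [2, 3]
print(run(rules, "any"))  # [1, 2, 3, 4]
```

The WHERE pre-filter is what lets the (attribute, attrib_val, id) key narrow the scan. The sketch drops it whenever a "does not contain" rule could be satisfied by an id that has none of the listed rows, at the cost of grouping the whole table for that case.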
Answer 1 (score: 0)
SELECT DISTINCT id
FROM a AS a1
WHERE attr = 11 AND val IN (22, 33)
  AND NOT EXISTS (
      SELECT 1 FROM a
      WHERE id = a1.id
        AND attr = 44
        AND val IN (55, 66) )
Indexes:
INDEX(id, attr, val) -- the first key listed in the question; if it is already there, good for the inner query
INDEX(attr, val, id) -- needed for the outer query
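A runnable sketch of the NOT EXISTS pattern with both indexes in place, again using Python's `sqlite3` as a stand-in for MySQL. The helper name `ids_with_and_without`, the index names, and the sample rows are made up for illustration. DISTINCT is used so that an id matching both listed values is returned only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, attr INTEGER, val INTEGER);
    CREATE INDEX a_id_attr_val ON a (id, attr, val);  -- serves the inner query
    CREATE INDEX a_attr_val_id ON a (attr, val, id);  -- serves the outer query
    INSERT INTO a VALUES
        (1, 11, 22), (1, 44, 55),
        (2, 11, 33),
        (3, 11, 22), (3, 11, 33), (3, 44, 77),
        (4, 44, 66);
""")

def ids_with_and_without(conn, attr, vals, excluded_attr, excluded_vals):
    """Ids having attr in vals but no row with excluded_attr in excluded_vals."""
    in_marks = ", ".join("?" for _ in vals)
    ex_marks = ", ".join("?" for _ in excluded_vals)
    sql = f"""
        SELECT DISTINCT id
        FROM a AS a1
        WHERE attr = ? AND val IN ({in_marks})
          AND NOT EXISTS (
              SELECT 1 FROM a
              WHERE id = a1.id AND attr = ? AND val IN ({ex_marks}))
        ORDER BY id
    """
    params = [attr, *vals, excluded_attr, *excluded_vals]
    return [row[0] for row in conn.execute(sql, params)]

print(ids_with_and_without(conn, 11, (22, 33), 44, (55, 66)))  # [2, 3]
```

In this sample, id 1 is rejected because it has the excluded pair (44, 55), and id 4 never qualifies because it has no row for attribute 11. On MySQL the actual plan should be confirmed with EXPLAIN.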