Question

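I have a table with 3 columns and millions of rows. All columns are integers (hashes): id, attribute, attrib_val

An id can have many rows, each holding one combination of attribute name and value.

The table has two keys: (id, attribute, attrib_val) and (attribute, attrib_val, id)

I need to dynamically build queries that fetch ids according to rules, for example:

IDs for which ALL of the following parts match:
  attribute <x> contains value <y> or <t>
  attribute <l> does not contain value <f> or <c>
  ...

IDs for which ANY of the following parts match:
  attribute <x> contains value <y> or <t>
  attribute <l> does not contain value <f> or <c>
  ...

Problem: this is the query I came up with (I can switch to id NOT IN for the "not" parts, and change AND to OR to go from ALL to ANY):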

SELECT DISTINCT id FROM attributes
WHERE id IN (
  SELECT id FROM attributes
  WHERE (attribute = 12944489 AND attrib_val = 907348202)
)
AND id IN (
  SELECT id FROM attributes
  WHERE (
    (attribute = 577513892 AND attrib_val = 519655334)
    OR (attribute = 577513892 AND attrib_val = 1266247963)
  )
)

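The problem is that this query is inefficient. For some reason MySQL scans all rows of the table, even though each subquery returns only a few hundred rows when I run it on its own.

How can I optimize this query, or what alternative query would handle these flexible requirements efficiently?

Notes:
1. MySQL 5.5.31
2. I simplified the query to make it easier to explain. In reality there is an additional global sid column, and every query includes sid = XXX in each segment.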

Answer 1

I would suggest using GROUP BY and HAVING:

SELECT id
FROM attributes
WHERE (attribute, attrib_val) IN ( (12944489, 907348202), (577513892, 519655334), (577513892, 1266247963) )
GROUP BY id
HAVING SUM( (attribute, attrib_val) IN ( (12944489, 907348202) ) ) > 0 AND
       SUM( (attribute, attrib_val) IN ( (577513892, 519655334), (577513892, 1266247963) ) ) = 0;
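Since the rules are built dynamically, the same GROUP BY / HAVING pattern can be generated from a list of rules. What follows is only a sketch under stated assumptions: it is written in Python, it runs against an in-memory SQLite database purely so that it is self-contained, and the rule shape (attribute, values, must_contain), the names build_having_query and run_having_demo, and the sample rows are all made up for illustration. It uses plain attribute = ? AND attrib_val IN (...) conditions rather than row constructors, and a MySQL driver would typically use %s placeholders instead of ?.

```python
import sqlite3


def build_having_query(rules, match_all=True):
    """Build a GROUP BY / HAVING query from rules.

    rules: list of (attribute, values, must_contain) tuples; this shape is
    an assumption made for the sketch, not something from the question.
    """
    if not rules:
        raise ValueError("at least one rule is required")
    conds, cond_params = [], []
    for attribute, values, _ in rules:
        marks = ", ".join(["?"] * len(values))
        conds.append(f"(attribute = ? AND attrib_val IN ({marks}))")
        cond_params += [attribute, *values]

    # A comparison evaluates to 0 or 1, so SUM counts the matching rows per id.
    having = [
        f"SUM{cond} {'> 0' if must_contain else '= 0'}"
        for cond, (_, _, must_contain) in zip(conds, rules)
    ]

    sql, params = "SELECT id FROM attributes", []
    # Prefiltering rows is only safe when every result id must own at least
    # one matching row: ALL mode with at least one "contains" rule.
    if match_all and any(must for _, _, must in rules):
        sql += " WHERE " + " OR ".join(conds)
        params += cond_params
    joiner = " AND " if match_all else " OR "
    sql += " GROUP BY id HAVING " + joiner.join(having)
    params += cond_params
    return sql, params


def run_having_demo():
    """Run the ALL and ANY variants on a tiny hypothetical data set."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE attributes (id INTEGER, attribute INTEGER, attrib_val INTEGER)"
    )
    conn.executemany(
        "INSERT INTO attributes VALUES (?, ?, ?)",
        [
            (1, 10, 100),                # wanted pair only
            (2, 10, 100), (2, 20, 200),  # wanted pair plus the excluded pair
            (3, 20, 201),                # neither pair
            (4, 20, 200),                # excluded pair only
        ],
    )
    rules = [(10, [100, 101], True), (20, [200], False)]
    results = []
    for match_all in (True, False):
        sql, params = build_having_query(rules, match_all)
        results.append(sorted(row[0] for row in conn.execute(sql, params)))
    conn.close()
    return tuple(results)


if __name__ == "__main__":
    print(run_having_demo())
```

Note the design choice around the WHERE clause: in the ANY variant, or when every rule is a "does not contain" rule, the prefilter is skipped, because an id with no matching rows at all can still satisfy a "does not contain" rule. That variant therefore has to consider every id in the table.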

Answer 2

SELECT id
    FROM a AS a1
    WHERE attr = 11 AND val IN (22, 33)
      AND NOT EXISTS (
              SELECT 1 FROM a
                  WHERE id = a1.id
                    AND attr = 44
                    AND val IN (55, 66) )

PRIMARY KEY(id)   -- Is this already there?  If so, good for inner query
INDEX(attr, val, id)  -- needed for outer query
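The same pattern extends to more parts by appending one EXISTS or NOT EXISTS subquery per rule. Below is a minimal sketch, assuming Python with an in-memory SQLite database for self-containment; the function names, the (attr, values, must_contain) rule shape, the index names, and the sample rows are hypothetical. It only covers the "all parts must match" case with one mandatory "contains" rule driving the outer query, and the (id, attr, val) index is taken from the question's key list rather than from the PRIMARY KEY line above.

```python
import sqlite3


def build_exists_query(driver, others):
    """driver: (attr, values) that must be present.

    others: list of (attr, values, must_contain) tuples.
    Both shapes are assumptions made for this sketch.
    """
    attr, values = driver
    marks = ", ".join(["?"] * len(values))
    # DISTINCT is an addition of this sketch: an id with several driver rows
    # would otherwise be returned once per row.
    sql = f"SELECT DISTINCT id FROM a AS a1 WHERE attr = ? AND val IN ({marks})"
    params = [attr, *values]
    for o_attr, o_values, must_contain in others:
        o_marks = ", ".join(["?"] * len(o_values))
        keyword = "EXISTS" if must_contain else "NOT EXISTS"
        sql += (
            f" AND {keyword} (SELECT 1 FROM a WHERE id = a1.id"
            f" AND attr = ? AND val IN ({o_marks}))"
        )
        params += [o_attr, *o_values]
    return sql, params


def run_exists_demo():
    """Run the query from this answer on a tiny hypothetical data set."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (id INTEGER, attr INTEGER, val INTEGER)")
    conn.execute("CREATE INDEX a_attr_val_id ON a (attr, val, id)")  # outer query
    conn.execute("CREATE INDEX a_id_attr_val ON a (id, attr, val)")  # inner query
    conn.executemany(
        "INSERT INTO a VALUES (?, ?, ?)",
        [
            (1, 11, 22),                             # matches, nothing excluded
            (2, 11, 33), (2, 44, 55),                # matches but has excluded pair
            (3, 11, 99),                             # wrong value for attr 11
            (4, 11, 22), (4, 11, 33), (4, 44, 77),   # two matches, harmless attr 44
        ],
    )
    sql, params = build_exists_query((11, [22, 33]), [(44, [55, 66], False)])
    ids = sorted(row[0] for row in conn.execute(sql, params))
    conn.close()
    return ids


if __name__ == "__main__":
    print(run_exists_demo())
```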

MySQL query to get any combination of exists / not exists
