Question

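I have a database table with a few hundred thousand rows and, say, 8 columns. The first two columns are indexed (two single-column indexes, one on each column, plus a composite index on both columns). I have two SQL queries, one using GROUP BY and one using UNION: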

SELECT MIN(columnOne), columnTwo FROM MyTable
WHERE columnTwo IN (1,2,3)
GROUP BY columnTwo

and

SELECT MIN(columnOne), columnTwo FROM MyTable WHERE columnTwo = 1
UNION
SELECT MIN(columnOne), columnTwo FROM MyTable WHERE columnTwo = 2
UNION
SELECT MIN(columnOne), columnTwo FROM MyTable WHERE columnTwo = 3

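It seems that the second approach, with UNION, is about twice as fast as the first (sometimes more).

I am executing these queries from Python, so the first one is a one-liner and the second one has to be generated.

I would like to know whether the second approach is normal, and whether there is perhaps a third approach I am not aware of?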
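
Generating the second query from Python could look roughly like the sketch below. It is not the original code: the helper name `build_union_query` is hypothetical, and the default `%s` placeholder is an assumption that fits DB-API drivers using the `format` paramstyle (for example PyMySQL or mysqlclient).

```python
def build_union_query(values, placeholder="%s"):
    """Build one SELECT per columnTwo value and join them with UNION.

    Returns (sql, params). The values are passed as bound parameters
    instead of being formatted into the SQL text.
    """
    values = list(values)
    if not values:
        raise ValueError("at least one columnTwo value is required")
    # One branch per value; every branch has the same shape.
    branch = (
        "SELECT MIN(columnOne), columnTwo FROM MyTable "
        f"WHERE columnTwo = {placeholder}"
    )
    sql = "\nUNION\n".join([branch] * len(values))
    return sql, values


sql, params = build_union_query([1, 2, 3])
print(sql)
# With an open DB-API cursor (assumed, not shown here):
#     cursor.execute(sql, params)
```

The `placeholder` argument exists only so the same helper can be used with drivers that expect a different marker, such as `?`.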

Update:

In all queries, the columnTwo and columnOne fields are not unique.

Example

# columnOne columnTwo
1 a         a        
2 b         b        
3 c         b        
4 d         a        
...

EXPLAIN for the query with GROUP BY:

id  select_type   table         type   possible_keys                key           key_len  ref    rows     Extra
1   SIMPLE        MyTable       index  secondColIndex,bothColIndex  bothColIndex  12              1623713  Using where

EXPLAIN for the query with UNION shows:

id  select_type   table         type   possible_keys                key           key_len  ref    rows     Extra
1   PRIMARY       MyTable       ref    secondColIndex,bothColIndex  bothColIndex  4        const  217472   Using where
2   UNION         MyTable       ref    secondColIndex,bothColIndex  bothColIndex  4        const  185832   Using where
3   UNION         MyTable       ref    secondColIndex,bothColIndex  bothColIndex  4        const  175572   Using where
    UNION RESULT  <union1,2,3>  ALL                                                                        Using temporary

Indexes on MyTable:

Table, Non_unique, Key_name, Seq_in_index, Column_name, Collation, Cardinality, Sub_part, Packed, Null, Index_type, Comment, Index_comment
MyTable, 0, PRIMARY, 1, Id, A, 1623713, , , , BTREE, , 
MyTable, 1, columnOneIndex, 1, columnOne, A, 1623713, , , , BTREE, , 
MyTable, 1, columnTwoIndex, 1, columnTwo, A, 5737, , , , BTREE, , 
MyTable, 1, bothColumnsIndex, 1, columnTwo, A, 5171, , , , BTREE, , 
MyTable, 1, bothColumnsIndex, 2, columnOne, A, 1623713, , , , BTREE, ,

Answer 1

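What you are seeing is due to limitations of the MySQL optimizer (which may have improved considerably in the most recent versions). GROUP BY almost always results in a filesort, which limits the use of indexes.

One alternative is essentially just a simplification of the UNION version, but using a correlated subquery: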

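-- One output row per value listed in the derived table x.
SELECT x.columnTwo,
       -- Correlated subquery: minimum columnOne for this columnTwo value.
       (SELECT MIN(t.columnOne)
        FROM MyTable t
        WHERE t.columnTwo = x.columnTwo
       ) AS min_columnOne
FROM (SELECT 1 AS columnTwo UNION ALL
      SELECT 2 AS columnTwo UNION ALL
      SELECT 3 AS columnTwo
     ) x;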

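This should have essentially the same performance as the UNION version. The correlated subquery should be evaluated using the index.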
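
Since the question builds its queries from Python, the derived table of constants can be generated in the same way. The following is a minimal sketch under assumptions: `build_correlated_query` is a hypothetical helper, the default `%s` placeholder assumes a DB-API driver with the `format` paramstyle, and the in-memory sqlite3 database is only a stand-in to check the result shape. It says nothing about MySQL performance.

```python
import sqlite3


def build_correlated_query(values, placeholder="%s"):
    """Build the correlated-subquery form for the given columnTwo values.

    Returns (sql, params); one bound parameter per requested value.
    """
    values = list(values)
    if not values:
        raise ValueError("at least one columnTwo value is required")
    # Derived table x: one constant row per requested value.
    constants = " UNION ALL ".join(
        [f"SELECT {placeholder} AS columnTwo"] * len(values)
    )
    sql = (
        "SELECT x.columnTwo, "
        "(SELECT MIN(t.columnOne) FROM MyTable t "
        "WHERE t.columnTwo = x.columnTwo) AS min_columnOne "
        f"FROM ({constants}) x"
    )
    return sql, values


# Stand-in data in SQLite, which uses "?" as its placeholder.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (columnOne INTEGER, columnTwo INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [(5, 1), (3, 1), (7, 2), (9, 3), (8, 3)],
)
sql, params = build_correlated_query([1, 2, 3], placeholder="?")
rows = conn.execute(sql, params).fetchall()
print(rows)
```

One difference from the GROUP BY form: a requested value with no matching rows still produces an output row, with NULL as the minimum.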

UNION vs GROUP BY, or a better solution