Sorting LEFT JOIN results on a large open-schema table

Time: 2015-03-09 15:32:21

Tags: mysql

I am designing an open-schema database with the following table definitions:

mysql> desc orders;
+-------+---------+------+-----+---------+----------------+
| Field | Type    | Null | Key | Default | Extra          |
+-------+---------+------+-----+---------+----------------+
| ID    | int(11) | NO   | PRI | NULL    | auto_increment |
| json  | text    | NO   |     | NULL    |                |
+-------+---------+------+-----+---------+----------------+

mysql> desc ordersnames;
+-------+--------------+------+-----+---------+----------------+
| Field | Type         | Null | Key | Default | Extra          |
+-------+--------------+------+-----+---------+----------------+
| ID    | int(11)      | NO   | PRI | NULL    | auto_increment |
| name  | varchar(330) | NO   | UNI | NULL    |                |
+-------+--------------+------+-----+---------+----------------+

with an index on name

mysql> desc orderskeys;
+-----------+--------------+------+-----+---------+----------------+
| Field     | Type         | Null | Key | Default | Extra          |
+-----------+--------------+------+-----+---------+----------------+
| ID        | int(11)      | NO   | PRI | NULL    | auto_increment |
| reference | int(11)      | NO   | MUL | NULL    |                |
| nameref   | int(11)      | NO   | MUL | NULL    |                |
| value     | varchar(330) | NO   |     | NULL    |                |
+-----------+--------------+------+-----+---------+----------------+

with the following indexes:

reference, nameref, value

nameref, value

reference

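Every json document (one dimension only) has an entry in the orderskeys table for each field it contains, where nameref is a reference to the field name defined in ordersnames.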
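For readers who want to reproduce the setup, the desc output and index list above can be sketched as DDL. This is a reconstruction, not the asker's actual script: the index names are made up, and the storage engine and character set are left to the server defaults because the question does not state them.

```sql
-- Sketch reconstructed from the DESC output above.
-- Index names (uq_name, idx_*) are hypothetical; engine and charset are
-- not given in the question and are deliberately left at server defaults.
CREATE TABLE orders (
    `ID`   INT(11) NOT NULL AUTO_INCREMENT,
    `json` TEXT    NOT NULL,
    PRIMARY KEY (`ID`)
);

CREATE TABLE ordersnames (
    `ID`   INT(11)      NOT NULL AUTO_INCREMENT,
    `name` VARCHAR(330) NOT NULL,
    PRIMARY KEY (`ID`),
    UNIQUE KEY uq_name (`name`)
);

CREATE TABLE orderskeys (
    `ID`        INT(11)      NOT NULL AUTO_INCREMENT,
    `reference` INT(11)      NOT NULL,  -- points at orders.ID
    `nameref`   INT(11)      NOT NULL,  -- points at ordersnames.ID
    `value`     VARCHAR(330) NOT NULL,
    PRIMARY KEY (`ID`),
    KEY idx_reference_nameref_value (`reference`, `nameref`, `value`),
    KEY idx_nameref_value (`nameref`, `value`),
    KEY idx_reference (`reference`)
);
```

Depending on the engine, row format and character set, a 330-character key may exceed the maximum index key length, so the statements might need a prefix length or a different charset to run. Note also that the single-column index on reference is a left prefix of the first composite index.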

I usually query it like this:

SELECT
    orderskeysdeliveryPostcode.value deliveryPostcode,
    orders.ID,
    orderskeysCN.value CN
FROM
    orders
JOIN ordersnames as ordersnamesCN   
    on ordersnamesCN.name = 'CN'
JOIN  orderskeys as orderskeysCN
    on orderskeysCN.nameref = ordersnamesCN.ID
    and orderskeysCN.reference = orders.ID
    and orderskeysCN.value = '10094'
JOIN ordersnames as ordersnamesdeliveryPostcode
    on ordersnamesdeliveryPostcode.name = 'deliveryPostcode'
JOIN orderskeys as orderskeysdeliveryPostcode
    on orderskeysdeliveryPostcode.nameref = ordersnamesdeliveryPostcode.ID
    and orderskeysdeliveryPostcode.reference = orders.ID
order by deliveryPostcode
limit 0,1000

which produces a result set like this:

 +------------------+--------+-------+
 | deliveryPostcode | ID     | CN    |
 +------------------+--------+-------+
 | NULL             | 251018 | 10094 |
 | NULL             | 157153 | 10094 |
 | NULL             |  95419 | 10094 |
 | B-5030           | 172944 | 10094 |
 +------------------+--------+-------+

-> Lightning fast, even with 400k+ order records.

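However, not all records contain all fields, so the query above does not return records that have no 'deliveryPostcode' field. I therefore have to query like this: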

SELECT
    orderskeysdeliveryPostcode.value deliveryPostcode,
    orders.ID,
    orderskeysCN.value CN
FROM
    orders
JOIN ordersnames as ordersnamesCN   
    on ordersnamesCN.name = 'CN'
JOIN  orderskeys as orderskeysCN
    on orderskeysCN.nameref = ordersnamesCN.ID
    and orderskeysCN.reference = orders.ID
    and orderskeysCN.value = '10094'
JOIN ordersnames as ordersnamesdeliveryPostcode
    on ordersnamesdeliveryPostcode.name = 'deliveryPostcode'
LEFT JOIN orderskeys as orderskeysdeliveryPostcode
    on orderskeysdeliveryPostcode.nameref =   ordersnamesdeliveryPostcode.ID
    and orderskeysdeliveryPostcode.reference = orders.ID
limit 0,1000

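-> Just as fast, but as soon as I add an ORDER BY clause on the value column of the left-joined table, MySQL wants to sort externally (Using temporary; Using filesort) instead of using the existing index.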

SELECT
    orderskeysdeliveryPostcode.value deliveryPostcode,
    orders.ID,
    orderskeysCN.value CN
FROM
    orders
JOIN ordersnames as ordersnamesCN   
    on ordersnamesCN.name = 'CN'
JOIN  orderskeys as orderskeysCN
    on orderskeysCN.nameref = ordersnamesCN.ID
    and orderskeysCN.reference = orders.ID
    and orderskeysCN.value = '10094'
JOIN ordersnames as ordersnamesdeliveryPostcode
    on ordersnamesdeliveryPostcode.name = 'deliveryPostcode'
LEFT JOIN orderskeys as orderskeysdeliveryPostcode
    on orderskeysdeliveryPostcode.nameref =   ordersnamesdeliveryPostcode.ID
    and orderskeysdeliveryPostcode.reference = orders.ID
ORDER BY deliveryPostCode
limit 0,1000

-> Very slow...

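In principle the sort should not make much of a difference: all NULL values of the deliveryPostcode column end up at the beginning (ASC) or at the end (DESC), and the rest of the result set is in the same order with the LEFT JOIN as with the JOIN.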
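To make that observation concrete: the NULL part of the LEFT JOIN result is exactly the set of matching orders that have no deliveryPostcode row (value is declared NOT NULL, so a NULL can only come from a missing row). A minimal sketch of a query for just that part, using an anti-join that is not part of the original question, could look like this:

```sql
-- Sketch only: orders with CN = '10094' that have NO deliveryPostcode entry.
-- The IS NULL test on the left-joined table's primary key keeps only the
-- rows for which the LEFT JOIN found no match.
SELECT
    NULL AS deliveryPostcode,
    orders.ID,
    orderskeysCN.value AS CN
FROM
    orders
JOIN ordersnames AS ordersnamesCN
    ON ordersnamesCN.name = 'CN'
JOIN orderskeys AS orderskeysCN
    ON orderskeysCN.nameref = ordersnamesCN.ID
    AND orderskeysCN.reference = orders.ID
    AND orderskeysCN.value = '10094'
JOIN ordersnames AS ordersnamesdeliveryPostcode
    ON ordersnamesdeliveryPostcode.name = 'deliveryPostcode'
LEFT JOIN orderskeys AS orderskeysdeliveryPostcode
    ON orderskeysdeliveryPostcode.nameref = ordersnamesdeliveryPostcode.ID
    AND orderskeysdeliveryPostcode.reference = orders.ID
WHERE orderskeysdeliveryPostcode.ID IS NULL
LIMIT 0, 1000;
```

The non-NULL part is what the first (inner join) query with its ORDER BY returns. Whether running the two parts separately is actually faster than the single sorted LEFT JOIN has not been measured here, and paging across the two result sets would have to be handled by the application.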

How can I query (and sort) this kind of table efficiently? Do I need different relations or indexes?

Much obliged...

2 answers:

Answer 0: (score: 1)

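With an INNER JOIN, MySQL tries to reduce the number of lookups by starting with the table that has the fewest rows (check the EXPLAIN output to see which table MySQL starts with).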

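If you order by anything other than a column of the first table, or if there is no index that satisfies the ORDER BY clause on the first table, MySQL has to do a filesort.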

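When text columns are involved, a temporary table is also more likely to be used, and not just an in-memory temporary table but a dreaded on-disk temporary table.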
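As a sketch of how to check this, the slow query from the question can be prefixed with EXPLAIN. The first row of the output is the table MySQL reads first, and the Extra column shows whether "Using temporary" or "Using filesort" is involved. No output is shown here because it depends on the data and the MySQL version.

```sql
-- Inspect the plan of the slow LEFT JOIN ... ORDER BY query.
-- Look at which table is listed first and at the Extra column.
EXPLAIN
SELECT
    orderskeysdeliveryPostcode.value AS deliveryPostcode,
    orders.ID,
    orderskeysCN.value AS CN
FROM
    orders
JOIN ordersnames AS ordersnamesCN
    ON ordersnamesCN.name = 'CN'
JOIN orderskeys AS orderskeysCN
    ON orderskeysCN.nameref = ordersnamesCN.ID
    AND orderskeysCN.reference = orders.ID
    AND orderskeysCN.value = '10094'
JOIN ordersnames AS ordersnamesdeliveryPostcode
    ON ordersnamesdeliveryPostcode.name = 'deliveryPostcode'
LEFT JOIN orderskeys AS orderskeysdeliveryPostcode
    ON orderskeysdeliveryPostcode.nameref = ordersnamesdeliveryPostcode.ID
    AND orderskeysdeliveryPostcode.reference = orders.ID
ORDER BY deliveryPostcode
LIMIT 0, 1000;
```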

Use STRAIGHT_JOIN to force the order in which MySQL performs the inner joins.
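A minimal sketch of that idea, applied to the inner join version of the query: SELECT STRAIGHT_JOIN makes MySQL join the tables in the order they are listed in the FROM clause. The table order chosen below (deliveryPostcode entries first) is an assumption made for illustration, with the aim of letting the (nameref, value) index deliver rows already in value order. Whether the optimizer then skips the filesort needs to be confirmed with EXPLAIN.

```sql
-- Sketch only: force the join order so the table that supplies the
-- ORDER BY column is read first. Covers the INNER JOIN case; rows
-- without a deliveryPostcode entry are not returned by this query.
SELECT STRAIGHT_JOIN
    orderskeysdeliveryPostcode.value AS deliveryPostcode,
    orders.ID,
    orderskeysCN.value AS CN
FROM
    ordersnames AS ordersnamesdeliveryPostcode
JOIN orderskeys AS orderskeysdeliveryPostcode
    ON orderskeysdeliveryPostcode.nameref = ordersnamesdeliveryPostcode.ID
JOIN orders
    ON orders.ID = orderskeysdeliveryPostcode.reference
JOIN ordersnames AS ordersnamesCN
    ON ordersnamesCN.name = 'CN'
JOIN orderskeys AS orderskeysCN
    ON orderskeysCN.nameref = ordersnamesCN.ID
    AND orderskeysCN.reference = orders.ID
    AND orderskeysCN.value = '10094'
WHERE ordersnamesdeliveryPostcode.name = 'deliveryPostcode'
ORDER BY orderskeysdeliveryPostcode.value
LIMIT 0, 1000;
```

The trade-off is that this order scans all deliveryPostcode entries in sorted order and filters by CN afterwards, so it may read many rows before finding 1000 matches if CN = '10094' is rare.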

Answer 1: (score: -1)

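I'm not sure what the logic is behind some parts of your query.

I think it can still be optimized further.

But just to solve the problem you are facing, switch it to a RIGHT JOIN for now: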

 SELECT 
  orderskeysdeliveryPostcode.value deliveryPostcode,
  o.id,
  o.CN
FROM orderskeys as orderskeysdeliveryPostcode
INNER JOIN ordersnames as ord_n
    on ord_n.id = orderskeysdeliveryPostcode.nameref
      AND ord_n.name = 'deliveryPostcode'
RIGHT JOIN (
SELECT
    orders.ID,
    orderskeysCN.CN
FROM
    orders
LEFT JOIN 
  (SELECT 
     orderskeys.value as CN,
     orderskeys.reference
   FROM 
    orderskeys 
   INNER JOIN ordersnames as ordersnamesCN   
   ON ordersnamesCN.id = orderskeys.nameref
      AND ordersnamesCN.name = 'CN'
   WHERE orderskeys.value = '12209'
  ) as orderskeysCN
 ON
  orderskeysCN.reference = orders.ID
limit 0,1000
) as o
on 
  orderskeysdeliveryPostcode.reference = o.ID
ORDER BY deliveryPostCode;

Here is a sqlfiddle we can play with. It only needs the data inserts added.