Query is very slow when adding a third LEFT JOIN

Time: 2014-04-30 03:58:01

Tags: mysql sql join

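Hi everyone, I have been fiddling with this query for a while, and I can't get it to return results in a reasonable execution time.

The situation is as follows:

I have three tables:

Table 1 is called: rowsall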

 1  id          int(11) 
 2  masterCaseId    varchar(50)
 3  RowNum  int(11)
 4  fullCaseNumber  varchar(50)
 5  rowKtavNameFull varchar(250)
 6  DateOpen    varchar(50)
 7  DateProccess    varchar(50)
 8  rowStatus   varchar(50)
 9  rowCourt    varchar(100)
 10 rowProcedure    varchar(50)
 11 rowCaseType varchar(50)
 12 rowIntrest  varchar(50)
 13 rowDetailsGen   varchar(250)
 14 rowTypeTeanot   varchar(50)
 15 rowHisayon  varchar(50)
 16 rowAmount   varchar(50)
 17 rowZacautPtor   varchar(50)
 18 rowZacautApproove   varchar(50)
 19 rowStatIravon   varchar(50)
 20 rowDateClose    varchar(50)
 21 rowCloseReason  varchar(50)
 22 rowResultTaken  varchar(50)
 23 rowOldFile          varchar(50)
 24 rowOpenedInCourse   varchar(50)
 25 rowGniza            varchar(50)
 26 rowReasonDeposit    varchar(50)
 27 rowTypeJudgeType    varchar(50)
 28 rowJudgeTypeDate
 29 rowJudgeTypeName    varchar(50)
 30 rowGishurType   varchar(50)
 31 rowGishurDetails    varchar(250)

   Total rows: 13001, size 11.7mb

   Indexes:
   Keyname           Type   Unique  Packed  Column            Cardinality  Collation  Null
   PRIMARY           BTREE  Yes     No      id                13001        A          No
   RowNum            BTREE  No      No      RowNum            12           A          No
                                            rowStatus         12           A          No
                                            rowResultTaken    12           A          No
   rowJudgeTypeName  BTREE  No      No      rowJudgeTypeName  1083         A          No
   masterCaseId      BTREE  No      No      masterCaseId      13001        A          No
   RowNum_2          BTREE  No      No      rowJudgeTypeName  1857         A          No
                                            RowNum            1857         A          No
   fullCaseNumber    BTREE  No      No      fullCaseNumber    203          A          No

Table 2 is called: casses_rows

 1  id  int(11)
 2  caseFullNum varchar(50)
 3  statusCrawl varchar(50)
 4  courtPlace  text
 5  rowsNum int(11)
 6  caseJudge   varchar(50)
 7  caseFullName    text
 8  whenCrawled datetime
 9  yearVal varchar(5)
 10 monthVal    varchar(5)
 11 caseVal int(11)

   Total rows: ~23,846, size 4.8mb

   Indexes:
   Keyname  Type   Unique  Packed  Column  Cardinality  Collation  Null
   PRIMARY  BTREE  Yes     No      id      26302        A          No

Table 3 is called: casedocs

 1  id  int(11)
 2  caseNum varchar(20)
 3  DocTitle    varchar(250)
 4  DocDateStr  varchar(20)
 5  KeyWords    text
 6  content text
 7  DocDateParsed   timestamp

   Total rows: ~1,163,669, size 4.1g

   Indexes:
   Keyname  Type   Unique  Packed  Column   Cardinality  Collation  Null
   PRIMARY  BTREE  Yes     No      id       895132       A          No
   caseNum  BTREE  No      No      caseNum  895132       A          No

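My goal:

I need to join these tables to get most of the columns from table 1, plus one column from table 2 and one column from table 3, which should be NULL when there is no match:

My query is: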

SELECT 
       A.`id` AS idRowCase, 
       C.`caseNum` AS isPaperAva, 
       A.`rowCaseType`, 
       A.`fullCaseNumber`, 
       A.`rowProcedure`, 
       B.`caseFullName`, 
       A.`rowCourt`, 
       A.`rowAmount`, 
       A.`rowResultTaken`, A.`rowStatus`, A.`rowIntrest` ,A.`DateOpen` ,A.`DateProccess`, A.`rowDateClose`, A.`rowJudgeTypeDate` 

FROM (SELECT * FROM `rowsall` WHERE `rowJudgeTypeName` LIKE '%@value1%' AND `RowNum` ='1' ) A 
INNER JOIN ( SELECT `id`,`caseFullName` FROM `casses_rows` ) B 
      ON A.`masterCaseId` = B.`id` 
LEFT JOIN (SELECT `caseNum` FROM `casedocs` GROUP BY `caseNum` ORDER BY NULL ) C 
      ON A.`fullCaseNumber` = C.`caseNum`

The result is what I want, but the problem is that it takes 1 minute to return...

Here is the EXPLAIN output:

  id   select_type  table       type   possible_keys  key     key_len  ref  rows   Extra
  1    PRIMARY      <derived2>  ALL    NULL           NULL    NULL     NULL 121
  1    PRIMARY      <derived3>  ALL    NULL           NULL    NULL     NULL 24185  Using where; Using join buffer
  1    PRIMARY      <derived4>  ALL    NULL           NULL    NULL     NULL 343438
  4    DERIVED      casedocs    index  NULL           caseNum 62       NULL 768024 Using index
  3    DERIVED      casses_rows ALL    NULL           NULL    NULL     NULL 29872  
  2    DERIVED      rowsall     ref    RowNum         RowNum  4             6500   Using where

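As you can see, I group table 3 to prevent the join from creating duplicate rows in the result. The third join is really only a test for whether a document corresponding to the case exists (the column is NULL if it does not).

More info:

  • If I remove the third join, the query takes 1 second
  • If I run only the third join's SELECT statement on its own, it takes 0.003 seconds
  • When profiling the query, "Sending data" accounts for 99.9% of the time.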
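
For anyone who wants to reproduce the profiling figure above, a minimal sketch using MySQL's session profiler follows. It assumes a server version that still supports SHOW PROFILE (it is deprecated in newer releases in favor of the Performance Schema), and the profiled statement is a shortened stand-in for the full query, not the exact one from the question.

```sql
-- Enable per-statement profiling for the current session.
SET profiling = 1;

-- Shortened stand-in for the slow query (assumption: same tables as above).
SELECT COUNT(*)
FROM `rowsall` AS A
LEFT JOIN (SELECT `caseNum` FROM `casedocs` GROUP BY `caseNum` ORDER BY NULL) AS C
      ON A.`fullCaseNumber` = C.`caseNum`
WHERE A.`RowNum` = 1;

-- List the profiled statements with their Query_ID and total duration.
SHOW PROFILES;

-- Per-stage breakdown (including "Sending data") for one statement.
-- Query 1 is assumed to be the first statement profiled in this session.
SHOW PROFILE FOR QUERY 1;
```

Note that the "Sending data" stage covers reading and joining rows as well as returning them, so a large share there usually points at the join itself rather than at network transfer.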

Any ideas why the third join takes so long?

Solved! Thanks to @Turophile and @Joel Coehoorn, the new test result is 0.004 seconds!

Here is the final query:

SELECT DISTINCT A.`id` AS idRowCase, C.`caseNum` AS isPaperAva, A.`rowCaseType` ,  A.`fullCaseNumber` , A.`rowProcedure` , B.`caseFullName` , A.`rowCourt` , A.`rowAmount` , A.`rowResultTaken` , A.`rowStatus` , A.`rowIntrest` , A.`DateOpen` , A.`DateProccess` , A.`rowDateClose` , A.`rowJudgeTypeDate` 

FROM  `rowsall` A
INNER JOIN  `casses_rows` B ON A.`masterCaseId` = B.`id` 
LEFT JOIN  `casedocs` C ON A.`fullCaseNumber` = C.`caseNum` 
WHERE A.`rowJudgeTypeName` LIKE  '%@value1%'
AND A.`RowNum` =  '1'

2 Answers:

Answer 0: (score: 2)

My suggestion is not to sort and group unnecessarily. So, something like this:

SELECT 
   A.`id` AS idRowCase, 
   C.`caseNum` AS isPaperAva, 
   A.`rowCaseType`, 
   A.`fullCaseNumber`, 
   A.`rowProcedure`, 
   B.`caseFullName`, 
   A.`rowCourt`, 
   A.`rowAmount`, 
   A.`rowResultTaken`, 
   A.`rowStatus`, 
   A.`rowIntrest`,
   A.`DateOpen` ,
   A.`DateProccess`, 
   A.`rowDateClose`, 
   A.`rowJudgeTypeDate` 

FROM `rowsall` AS A 
INNER JOIN `casses_rows` AS B 
      ON A.`masterCaseId` = B.`id` 
LEFT JOIN `casedocs` AS C 
      ON A.`fullCaseNumber` = C.`caseNum`
WHERE `rowJudgeTypeName` LIKE '%@value1%' 
AND   `RowNum` ='1' 

(This may return different results, namely multiple rows per case, if caseNum is not unique.)
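
If the duplicate rows are a concern, one possible alternative is a correlated EXISTS, sketched below. This is an illustration rather than part of the original answer: it assumes that a 1/0 flag is acceptable for isPaperAva in place of the caseNum/NULL value, and it lists only a few of the columns for brevity.

```sql
SELECT
   A.`id` AS idRowCase,
   -- 1 if at least one casedocs row matches this case, 0 otherwise;
   -- the lookup can use the existing caseNum index and never multiplies rows.
   EXISTS (SELECT 1
           FROM `casedocs` AS C
           WHERE C.`caseNum` = A.`fullCaseNumber`) AS isPaperAva,
   A.`fullCaseNumber`,
   B.`caseFullName`
FROM `rowsall` AS A
INNER JOIN `casses_rows` AS B
      ON A.`masterCaseId` = B.`id`
WHERE A.`rowJudgeTypeName` LIKE '%@value1%'
AND   A.`RowNum` = 1;
```

Because EXISTS stops at the first matching document, neither GROUP BY nor DISTINCT is needed to keep one row per case.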

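You can also convert the LEFT JOIN into a subselect. Note that, unlike the LEFT JOIN, this returns only the cases that have a matching document: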

SELECT 
   A.`id` AS idRowCase, 
   A.`fullCaseNumber` AS isPaperAva, 
   A.`rowCaseType`, 
   A.`fullCaseNumber`, 
   A.`rowProcedure`, 
   B.`caseFullName`, 
   A.`rowCourt`, 
   A.`rowAmount`, 
   A.`rowResultTaken`, 
   A.`rowStatus`, 
   A.`rowIntrest`,
   A.`DateOpen` ,
   A.`DateProccess`, 
   A.`rowDateClose`, 
   A.`rowJudgeTypeDate` 

FROM `rowsall` AS A 
INNER JOIN `casses_rows` AS B 
      ON A.`masterCaseId` = B.`id` 
WHERE `rowJudgeTypeName` LIKE '%@value1%' 
AND   `RowNum` ='1' 
AND   A.`fullCaseNumber` in (SELECT `caseNum` FROM `casedocs` ) 

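But this suggests that the casedocs table is somewhat redundant here. Is it really needed?

Answer 1: (score: 1)

First, the first two tables do not need subqueries at all. This is better expressed directly with the join conditions and the WHERE clause.

Additionally, the last join uses a subquery with a GROUP BY: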

  

LEFT JOIN(SELECT caseNum FROM casedocs GROUP BY caseNum ORDER BY NULL)

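This destroys MySQL's ability to use any index when evaluating the last join. If you can rewrite the query to join the tables first and do the GROUP BY in the outer query, so that you get the same results, it will likely perform much better, because the join can then make the best use of the indexes.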
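
For a rough sense of scale, the EXPLAIN output in the question lists all three derived tables with type ALL and no key. Multiplying the rows estimates gives an approximate count of the row combinations the server may have to consider. This is only a back-of-the-envelope figure built from the optimizer's estimates, not a measurement:

```sql
-- Product of the `rows` column for <derived2>, <derived3> and <derived4>
-- in the question's EXPLAIN output: on the order of 10^12 combinations.
SELECT 121 * 24185 * 343438 AS estimated_row_combinations;
```

With an indexed lookup on casedocs.caseNum, the last factor would shrink from hundreds of thousands of rows to a handful per case.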

SELECT 
       A.`id` AS idRowCase, 
       C.`caseNum` AS isPaperAva, 
       A.`rowCaseType`, 
       A.`fullCaseNumber`, 
       A.`rowProcedure`, 
       B.`caseFullName`, 
       A.`rowCourt`, 
       A.`rowAmount`, 
       A.`rowResultTaken`, A.`rowStatus`, A.`rowIntrest` ,A.`DateOpen` ,A.`DateProccess`, A.`rowDateClose`, A.`rowJudgeTypeDate` 

FROM `rowsall` A 
INNER JOIN `casses_rows` B   ON A.`masterCaseId` = B.`id` 
LEFT JOIN (SELECT `caseNum` FROM `casedocs` GROUP BY `caseNum` ) C ON C.`caseNum` = A.`fullCaseNumber`
WHERE A.`rowJudgeTypeName` LIKE '%@value1%' AND A.`RowNum` ='1'
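
A sketch of the "join first, group in the outer query" form described above might look like the following. It is an assumption about what that rewrite would look like, not a tested replacement: only a few columns are selected to keep the GROUP BY short, and MAX() is used simply to pick one caseNum per case (it stays NULL when no document matches).

```sql
SELECT
       A.`id` AS idRowCase,
       -- All matching casedocs rows share the same caseNum, so MAX() just
       -- collapses them to one value; NULL means no document exists.
       MAX(C.`caseNum`) AS isPaperAva,
       A.`fullCaseNumber`,
       B.`caseFullName`
FROM `rowsall` A
INNER JOIN `casses_rows` B ON A.`masterCaseId` = B.`id`
LEFT JOIN `casedocs` C     ON C.`caseNum` = A.`fullCaseNumber`
WHERE A.`rowJudgeTypeName` LIKE '%@value1%' AND A.`RowNum` = 1
GROUP BY A.`id`, A.`fullCaseNumber`, B.`caseFullName`;
```

Prefixing the statement with EXPLAIN should show whether the caseNum key is now chosen for casedocs instead of a full scan of a derived table.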