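Best way to optimize a query with two joins

Date: 2017-08-19 14:18:59

Tags: sql oracle join optimization

I was just given a challenge at school: optimize this query. It is a theoretical question.

Challenge: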

SELECT TO_CHAR(CONVERT_TIMEZONE ('UTC','America/Los_Angeles',tableA."date"),'YYYY-MM') AS "date_month",
COUNT(DISTINCT CASE WHEN (tableB."date" IS NOT NULL) THEN tableB._id ELSE NULL END) AS "tableB.countB",
COUNT(DISTINCT CASE WHEN (tableC."date" IS NOT NULL) THEN tableC._id ELSE NULL END) AS "tableC.countC"
FROM tableA AS tableA
LEFT JOIN tableB AS tableB ON (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',tableB."date"))) = (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',tableA."date")))
LEFT JOIN tableC AS tableC ON (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',tableC."date"))) = (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',tableA."date")))
WHERE tableA."date" >= CONVERT_TIMEZONE ('America/Los_Angeles','UTC',DATEADD (month,-17,DATE_TRUNC('month',DATE_TRUNC('day',CONVERT_TIMEZONE ('UTC','America/Los_Angeles',GETDATE ())))))
GROUP BY 1
ORDER BY 1 DESC LIMIT 500;

To optimize it, I simply removed the CASE expressions from the query above. I think this should also make the query more efficient.

SELECT    To_char(Convert_timezone ('UTC','America/Los_Angeles',tablea."date"),'YYYY-MM') AS "date_month", 
          Count(DISTINCT 
           decode(tableb."date", not null,tableb._id,null)
           AS "tableB.countB",
          Count(DISTINCT 
           decode(tablec."date", not null,tablec._id ,null)
            AS "tableC.countC"  
FROM      tablea AS tablea 
LEFT JOIN tableb AS tableb 
ON        ( 
                    Date (Convert_timezone ('UTC','America/Los_Angeles',tableb."date"))) = (Date (Convert_timezone ('UTC','America/Los_Angeles',tablea."date")))
LEFT JOIN tablec AS tablec 
ON        ( 
                    Date (Convert_timezone ('UTC','America/Los_Angeles',tablec."date"))) = (Date (Convert_timezone ('UTC','America/Los_Angeles',tablea."date")))
WHERE     tablea."date" >= convert_timezone ('America/Los_Angeles','UTC',Dateadd (month,-17,Date_trunc('month',Date_trunc('day',Convert_timezone ('UTC','America/Los_Angeles',Getdate ())))) group BY 1 ORDER BY 1 DESC limit 500;

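Would it be better to remove one of the left joins and merge the statements? What would you suggest?

2 answers:

Answer 0: (score: 1)

...Or, use shorter aliases, which actually make the SQL shorter and cleaner. That also helps readability. In addition, format the query into separate clauses (SELECT, FROM, JOIN, WHERE, ORDER BY, GROUP BY, HAVING, and so on) so that they are easy to tell apart at a glance. And use indentation that is consistent with the logical structure and supports it rather than obscuring it, so that you can separate the parts from one another. As an example, here is your first SQL query reformatted, with the same logical structure as the one you posted: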

SELECT TO_CHAR(CONVERT_TIMEZONE ('UTC','America/Los_Angeles', a.date),'YYYY-MM') date_month,
   COUNT(DISTINCT CASE WHEN (b."date" IS NOT NULL) THEN b._id ELSE NULL END) countB,
   COUNT(DISTINCT CASE WHEN (c."date" IS NOT NULL) THEN c._id ELSE NULL END) countC
FROM tableA a   
  LEFT JOIN tableB b 
     ON (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',b.date))) = 
        (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',a.date)))
  LEFT JOIN tableC c 
     ON (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',c.date))) = 
        (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',a.date)))
WHERE a.date >= CONVERT_TIMEZONE ('America/Los_Angeles', 'UTC', 
       DATEADD (month,-17,DATE_TRUNC('month', 
       DATE_TRUNC('day',CONVERT_TIMEZONE ('UTC','America/Los_Angeles', 
                        GETDATE ())))))
GROUP BY 1
ORDER BY 1 DESC LIMIT 500;

Here is the optimized version:

SELECT DatePart(month, a.Date-8/24) date_month, 
  sum(case when b.date is Not null then 1 else 0 end) countb,
  sum(case when c.date is Not null then 1 else 0 end) countc
FROM tableA a    
  LEFT JOIN tableB b 
     ON b.Date = a.Date -- Timezone offsets are not necessary, 
  LEFT JOIN tableC c  
     ON c.date = a.date -- both in same timezone 
WHERE a.date >= DateAdd(hour, 8,
            DATEADD (month,-17,DATE_TRUNC('month', 
             GETDATE () )))
GROUP BY 1
ORDER BY 1 DESC LIMIT 500;

Answer 1: (score: 1)

Presumably, the _id columns are unique. So:

SELECT TO_CHAR(CONVERT_TIMEZONE('UTC','America/Los_Angeles', a."date"), 'YYYY-MM') AS date_month,
       SUM(CASE WHEN b."date" IS NOT NULL THEN 1 ELSE 0 END) AS tableB_countB,
       SUM(CASE WHEN c."date" IS NOT NULL THEN 1 ELSE 0 END) AS tableC_countC
FROM tableA a LEFT JOIN
     tableB b
     ON DATE(CONVERT_TIMEZONE ('UTC', 'America/Los_Angeles', b."date")) = DATE(CONVERT_TIMEZONE ('UTC', 'America/Los_Angeles', a."date")) LEFT JOIN
     tableC c
      ON DATE(CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', c."date")) = DATE(CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', a."date"))
WHERE a."date" >= CONVERT_TIMEZONE('America/Los_Angeles', 'UTC',
                                   DATEADD(month, -17, DATE_TRUNC('month', DATE_TRUNC('day', CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', GETDATE ())))))
GROUP BY 1
ORDER BY 1 DESC
LIMIT 500;
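A quick way to check whether replacing COUNT(DISTINCT ...) with SUM(CASE ...) preserves the counts is to run both on a tiny data set. The sketch below uses Python's built-in sqlite3 as a stand-in for the real engine, drops the time zone handling, and invents three toy rows per table, so all of the data is an assumption. It suggests the two forms agree only when the joins do not multiply rows: when one tableA row matches several rows in both tableB and tableC, the SUM counts every combination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (_id INTEGER PRIMARY KEY, "date" TEXT);
    CREATE TABLE tableB (_id INTEGER PRIMARY KEY, "date" TEXT);
    CREATE TABLE tableC (_id INTEGER PRIMARY KEY, "date" TEXT);
    -- Hypothetical toy data: one A row, two B rows, three C rows, same day.
    INSERT INTO tableA VALUES (1, '2017-08-01');
    INSERT INTO tableB VALUES (1, '2017-08-01'), (2, '2017-08-01');
    INSERT INTO tableC VALUES (10, '2017-08-01'), (11, '2017-08-01'), (12, '2017-08-01');
""")

# Same join shape as the query in the question, without time zone conversion.
JOINS = """
    FROM tableA a
    LEFT JOIN tableB b ON b."date" = a."date"
    LEFT JOIN tableC c ON c."date" = a."date"
"""

# Original aggregation: distinct ids, so duplicated join rows are collapsed.
distinct_counts = conn.execute(
    'SELECT COUNT(DISTINCT CASE WHEN b."date" IS NOT NULL THEN b._id END), '
    'COUNT(DISTINCT CASE WHEN c."date" IS NOT NULL THEN c._id END)' + JOINS
).fetchone()

# Rewritten aggregation: one per joined row, so the 2 x 3 fan-out is counted.
sum_counts = conn.execute(
    'SELECT SUM(CASE WHEN b."date" IS NOT NULL THEN 1 ELSE 0 END), '
    'SUM(CASE WHEN c."date" IS NOT NULL THEN 1 ELSE 0 END)' + JOINS
).fetchone()

print(distinct_counts)  # (2, 3)
print(sum_counts)       # (6, 6)
```

If each tableA row can match at most one row in each of the other tables, the two forms return the same numbers, and the SUM version avoids the cost of de-duplication.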

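Next, the date conversions in the ON clauses seem unnecessary, because both sides are converted from the same time zone. If the values have no time component (as a name like date suggests), then DATE() is not needed either: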

SELECT TO_CHAR(CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', a."date"), 'YYYY-MM') AS date_month,
       SUM(CASE WHEN b."date" IS NOT NULL THEN 1 ELSE 0 END) AS tableB_countB,
       SUM(CASE WHEN c."date" IS NOT NULL THEN 1 ELSE 0 END) AS tableC_countC
FROM tableA a LEFT JOIN
     tableB b
     ON b."date" = a."date" LEFT JOIN
     tableC c
      ON c."date" = a."date"
WHERE a."date" >= CONVERT_TIMEZONE('America/Los_Angeles', 'UTC',
                                   DATEADD(month, -17, DATE_TRUNC('month', DATE_TRUNC('day', CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', GETDATE ())))))
GROUP BY 1
ORDER BY 1 DESC
LIMIT 500;
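Whether the simplified ON clause is equivalent depends on the "no time component" assumption. The sketch below illustrates this with Python's built-in sqlite3 and two made-up rows; the fixed `-7 hours` modifier is only an assumed stand-in for CONVERT_TIMEZONE during daylight saving time, not how the original engine converts zones. When the columns do carry a time of day, joining on the local calendar day and joining on the raw values can match different rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (_id INTEGER PRIMARY KEY, "date" TEXT);
    CREATE TABLE tableB (_id INTEGER PRIMARY KEY, "date" TEXT);
    -- Hypothetical UTC timestamps, two hours apart, on the same Los Angeles day.
    INSERT INTO tableA VALUES (1, '2017-08-01 18:00:00');
    INSERT INTO tableB VALUES (1, '2017-08-01 20:00:00');
""")

# Join on the local calendar day (assumed fixed offset instead of a real zone).
by_local_day = conn.execute("""
    SELECT COUNT(b._id)
    FROM tableA a
    LEFT JOIN tableB b
      ON date(b."date", '-7 hours') = date(a."date", '-7 hours')
""").fetchone()[0]

# Join on the raw stored values, as in the simplified query.
by_raw_value = conn.execute("""
    SELECT COUNT(b._id)
    FROM tableA a
    LEFT JOIN tableB b ON b."date" = a."date"
""").fetchone()[0]

print(by_local_day)  # 1
print(by_raw_value)  # 0
```

If the columns really hold plain dates, both joins match the same rows and the simpler comparison is the better choice.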

The WHERE clause is fine. It can make use of an index on a(date).
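The reason the WHERE clause can use an index is that the column is compared bare and all the conversion is applied to the constant side. A rough way to see the difference is to compare query plans, sketched here with Python's built-in sqlite3. The index name `idx_tablea_date` and the literal cutoff values are made up for the illustration, and the plan text of another engine will look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (_id INTEGER PRIMARY KEY, "date" TEXT);
    -- Hypothetical index on the filtered column.
    CREATE INDEX idx_tablea_date ON tableA ("date");
""")

def plan(where_clause):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT _id FROM tableA a WHERE " + where_clause
    ).fetchall()
    return [row[-1] for row in rows]

# Bare column compared with a precomputed constant: an index range search.
bare_column = plan("a.\"date\" >= '2016-03-01 07:00:00'")

# Column wrapped in a function: every row has to be evaluated.
wrapped_column = plan("date(a.\"date\", '-7 hours') >= '2016-03-01'")

print(bare_column)
print(wrapped_column)
```

In SQLite the first plan is reported as a SEARCH using the index and the second as a SCAN, which is the same distinction the answer relies on.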