Query optimization - 10001

Time: 2014-11-11 23:20:55

Tags: sql postgresql query-optimization

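I need help optimizing this particular query. As you can see, it runs several subqueries, but they all run against the same table. The problem is that two of the subqueries use GROUP BY and the other two do not. Can these four subqueries be combined so that the table is scanned only once?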

WITH f AS
(
  SELECT a.custom_referal_page,
     a.campaign_id,
     a.domain_user_id,
     a.event_name,
     COUNT(a.event_name),
     (SELECT COUNT(a1.event_name)
      FROM action_fact_new_wodim a1
      WHERE a1.domain_url = 'alternativeapparel.com'
      AND   a1.event_name = 'AltOrig PerfCapSl GoToPro'
     -- AND   a1.date_stamp BETWEEN '20140501' AND '20140530'          
     AND   a1.time_stamp BETWEEN '2014-05-01 14:43:15' AND '2014-05-30 14:43:15'
      AND   event_name != 'Page load'
      AND   event_name != 'Page unload') AS totalCount,
     (SELECT COUNT(domain_user_id)
      FROM (SELECT DISTINCT a1.domain_user_id,
                   a1.custom_referal_page
            FROM action_fact_new_wodim a1
            WHERE a1.domain_url = 'alternativeapparel.com'
            AND   a1.event_name = 'AltOrig PerfCapSl GoToPro'
      --            AND   a1.date_stamp BETWEEN '20140501' AND '20140530'          
           AND   a1.time_stamp BETWEEN '2014-05-01 14:43:15' AND '2014-05-30 14:43:15'
            AND   event_name != 'Page load'
            AND   event_name != 'Page unload') AS a2) AS uniqueCount,
     (SELECT COUNT(domain_user_id)
      FROM (SELECT DISTINCT domain_user_id
            FROM action_fact_new_wodim a1
            WHERE a1.domain_url = 'alternativeapparel.com'
            AND   a1.event_name = 'AltOrig PerfCapSl GoToPro'
       --     AND   a1.date_stamp BETWEEN '20140501' AND '20140530'          
            AND   a1.time_stamp BETWEEN '2014-05-01 14:43:15' AND '2014-05-30 14:43:15'
            AND   event_name != 'Page load'
            AND   event_name != 'Page unload') AS a2) AS totalUniqueCount
FROM action_fact_new_wodim a
WHERE a.domain_url = 'alternativeapparel.com'
AND   a.event_name = 'AltOrig PerfCapSl GoToPro'
-- AND   a.date_stamp BETWEEN '20140501' AND '20140530'          
AND   a.time_stamp BETWEEN '2014-05-01 14:43:15' AND '2014-05-30 14:43:15'
AND   event_name != 'Page load'
AND   event_name != 'Page unload'
GROUP BY a.custom_referal_page,
       a.campaign_id,
       a.domain_user_id,
       a.event_name
)

SELECT custom_referal_page,
    campaign_id,
    SUM(COUNT) AS COUNT,
    MAX(totalCount) AS totalCount,
    COUNT(uniqueCount) AS uniqueCount,
    MAX(totalUniqueCount) AS totalUniqueCount
FROM f
GROUP BY custom_referal_page,
    campaign_id
ORDER BY 3 DESC

Output:

custom_referal_page      campaign_id  count  totalcount  uniquecount  totaluniquecount
https://www.google.ca/   null         10838  20153       5346         9906
https://www.google.com/  null         3040   20153       1727         9906

1 answer:

Answer 0 (score: 0)

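Your query is pretty convoluted; stepping back, this alternative might work for you. All of your conditions are the same, and you are only computing different counts. You should be able to get the distinct counts and the total count in a single pass over the table, together with the GROUP BY, for example...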

select
      a1.custom_referal_page,
      a1.campaign_id,
      COUNT(*) as TotalEvents,
      COUNT( distinct a1.event_name ) as CntDistEvents,
      -- PostgreSQL has no "+" operator for text, so count distinct (user, page) pairs as a row value
      COUNT( distinct (a1.domain_user_id, a1.custom_referal_page) ) as CntDistDomRef,
      COUNT( distinct a1.domain_user_id ) as CntDistUsers
   FROM 
      action_fact_new_wodim a1
   WHERE 
          a1.domain_url = 'alternativeapparel.com'
      AND a1.event_name = 'AltOrig PerfCapSl GoToPro'
      AND a1.time_stamp BETWEEN '2014-05-01 14:43:15' AND '2014-05-30 14:43:15'
   group by
      a1.custom_referal_page,
      a1.campaign_id

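That said, I would put an index on your table on (domain_url, event_name, time_stamp). If you actually intend to filter on date_stamp instead, adjust the index and the WHERE clause accordingly.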
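
The suggested index could be created roughly as follows. This is only a sketch: the index name idx_action_fact_domain_event_ts is made up for illustration, and whether the planner picks the index depends on your data distribution, so it is worth checking the plan with EXPLAIN afterwards.

```sql
-- Hypothetical index name; the column order follows the suggestion above:
-- the two equality columns first, the range column (time_stamp) last.
CREATE INDEX idx_action_fact_domain_event_ts
    ON action_fact_new_wodim (domain_url, event_name, time_stamp);

-- Check whether the planner actually uses the index for the question's filter.
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*)
FROM action_fact_new_wodim
WHERE domain_url = 'alternativeapparel.com'
  AND event_name = 'AltOrig PerfCapSl GoToPro'
  AND time_stamp BETWEEN '2014-05-01 14:43:15' AND '2014-05-30 14:43:15';
```

Putting the equality columns ahead of the range column lets a B-tree index narrow down to one contiguous range of entries for this kind of filter.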

See what numbers this gives you for the different referrers, and add an ORDER BY wherever it makes sense...
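
The grouped query returns per-referrer counts, but not the overall totals (totalCount and totalUniqueCount) that the question's output also contains. One possible way to get both while reading the base table only once is to filter into a CTE and compute the totals from that filtered set. This is a sketch under assumptions, not a tested replacement: the CTE names (filtered, totals) and the column aliases are invented here, and COUNT(DISTINCT domain_user_id) ignores NULL users, whereas the original GROUP BY would count a NULL user as its own group.

```sql
WITH filtered AS (
    -- Single pass over the base table with the question's filter.
    SELECT custom_referal_page,
           campaign_id,
           domain_user_id
    FROM action_fact_new_wodim
    WHERE domain_url = 'alternativeapparel.com'
      AND event_name = 'AltOrig PerfCapSl GoToPro'
      AND time_stamp BETWEEN '2014-05-01 14:43:15' AND '2014-05-30 14:43:15'
),
totals AS (
    -- Overall figures, computed from the already filtered rows.
    SELECT COUNT(*)                       AS total_count,
           COUNT(DISTINCT domain_user_id) AS total_unique_count
    FROM filtered
)
SELECT f.custom_referal_page,
       f.campaign_id,
       COUNT(*)                         AS event_count,
       t.total_count,
       COUNT(DISTINCT f.domain_user_id) AS unique_count,
       t.total_unique_count
FROM filtered f
CROSS JOIN totals t          -- totals has exactly one row
GROUP BY f.custom_referal_page,
         f.campaign_id,
         t.total_count,
         t.total_unique_count
ORDER BY event_count DESC;
```

Because filtered is referenced twice, PostgreSQL should materialize it rather than rescan action_fact_new_wodim, though comparing the EXPLAIN ANALYZE output of both versions is the only reliable way to confirm that on your installation.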