SQL - How to combine two dates from different tables with a join without errors

Time: 2018-04-06 09:40:54

Tags: sql join union amazon-redshift

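My company has two tables that hold campaign information and webshop information.

Basic information

The campaign table contains information like this: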

CAMPAIGN_NAME  CREATION_DATE  NUM_DELIVERED  NUM_ERRORS
Promotion 101   2013-01-05      100,000        100
Promotion 105   2013-01-05      135,000        200
Promotion 104   2013-01-05      125,000         0
Promotion 103   2013-01-06      50,000          0

The webshop table contains information like this:

VISIT_KEY    VISIT_AT  .....
 100200     2013-01-05
 105235     2013-01-05
 103050     2013-01-05

Desired result

We want to build a table that shows the performance per day, such as:

CREATION_DATE    VISIT_AT   NUM_DELIVERED  NUM_VISITS
 2013-01-05     2013-01-05    260,000        30,000
 2013-01-06     2013-01-06     50,000          0 

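Approach before and after

Previously, to collect this information, we used a UNION approach: each table is aggregated separately first, and the two results are then combined with UNION ALL: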

SELECT
   campaign_date,
   visit_date,
   SUM(delivered) AS delivered,
   SUM(visits) AS visits
FROM
    ((SELECT
       CREATION_DATE::DATE AS campaign_date,
       '1970-01-01'::DATE AS visit_date,     -- placeholder: no visit date in this branch
       SUM(NUM_DELIVERED) AS delivered,
       0 AS visits
    FROM
       campaign
    GROUP BY 1,2)
    UNION ALL
    (SELECT
       '1970-01-01'::DATE AS campaign_date,  -- placeholder: no campaign date in this branch
       VISIT_AT::DATE AS visit_date,
       0 AS delivered,
       COUNT(VISIT_KEY) AS visits
    FROM
       webshop
    GROUP BY 1,2)) AS combined
 GROUP BY 1,2

The result looks like this:

campaign_date visit_date   delivered      visits
2013-01-05    1970-01-01    260,000          0
1970-01-01    2013-01-05       0          30,000
2013-01-06    1970-01-01     50,000          0     

Now I am trying a LEFT JOIN on campaign.CREATION_DATE = webshop.VISIT_AT, as follows:

Select 
  campaign.CREATION_DATE as campaign_date, 
  webshop.VISIT_AT as visits,
  SUM(campaign.NUM_DELIVERED) as delivered,
  COUNT(webshop.VISIT_KEY) AS visits
FROM 
  webshop LEFT JOIN campaign ON webshop.VISIT_AT = campaign.CREATION_DATE

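But the numbers are completely different...

Questions

1. What could be wrong in this query? Since I am after the same information, I would expect the same results...

2. How can I achieve the desired result?

For your information, I am using Amazon Redshift.

Thank you very much for your help, and have a nice weekend!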

2 Answers:

Answer 0: (score: 2)

To solve your problem:

Use DISTINCT

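-- Aggregate Campaign to one row per day first, so the join cannot multiply Num_Delivered.
-- As written, Num_Visits is SUM(Num_Errors) from Campaign; Webshop contributes no column to the result.
SELECT DISTINCT c.Creation_Date, 
       c.Creation_Date AS Visit_At, 
       c.Num_Delivered, 
       c.Num_Visits
FROM (
      SELECT c.Creation_Date, 
             SUM(c.Num_Delivered) AS Num_Delivered, 
             SUM(c.Num_Errors) AS Num_Visits
      FROM Campaign AS c
      GROUP BY c.Creation_Date
     ) AS c
LEFT JOIN Webshop AS w
ON c.Creation_Date = w.Visit_At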

You can use GROUP BY instead of DISTINCT:

SELECT c.Creation_Date, c.Creation_Date AS Visit_At, c.Num_Delivered, c.Num_Visits
FROM (
      SELECT c.Creation_Date, SUM(c.Num_Delivered) AS Num_Delivered, SUM(c.Num_Errors) AS Num_Visits
      FROM Campaign AS c
      GROUP BY c.Creation_Date
     ) AS c
LEFT JOIN Webshop AS w
ON c.Creation_Date = w.Visit_At
GROUP BY c.Creation_Date, c.Num_Delivered, c.Num_Visits

Output:

Creation_Date   Visit_At     Num_Delivered  Num_Visits
2013-01-05      2013-01-05   360000          30000
2013-01-06      2013-01-06   50000           0

Demo link:

http://sqlfiddle.com/#!9/22fe0/1
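
As a rough illustration of why the plain join in the question inflates the totals, and why aggregating before joining avoids that, here is a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for Redshift, and the rows (two campaigns and three visits on one day) are made up for the demonstration rather than taken from the real tables.

```python
import sqlite3

# Hypothetical miniature data set; SQLite stands in for Redshift here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaign (campaign_name TEXT, creation_date TEXT, num_delivered INTEGER);
    CREATE TABLE webshop (visit_key INTEGER, visit_at TEXT);
    INSERT INTO campaign VALUES ('Promotion 101', '2013-01-05', 100),
                                ('Promotion 105', '2013-01-05', 135);
    INSERT INTO webshop VALUES (1, '2013-01-05'), (2, '2013-01-05'), (3, '2013-01-05');
""")

# Joining the raw rows pairs every campaign with every visit of the same day
# (2 campaigns x 3 visits = 6 rows), so both totals are inflated.
naive = conn.execute("""
    SELECT c.creation_date, SUM(c.num_delivered), COUNT(w.visit_key)
    FROM campaign AS c
    LEFT JOIN webshop AS w ON w.visit_at = c.creation_date
    GROUP BY c.creation_date
""").fetchall()

# Reducing each table to one row per day before joining keeps the totals intact.
pre_aggregated = conn.execute("""
    SELECT c.creation_date, c.num_delivered, COALESCE(w.num_visits, 0)
    FROM (SELECT creation_date, SUM(num_delivered) AS num_delivered
          FROM campaign GROUP BY creation_date) AS c
    LEFT JOIN (SELECT visit_at, COUNT(visit_key) AS num_visits
               FROM webshop GROUP BY visit_at) AS w
      ON w.visit_at = c.creation_date
""").fetchall()

print(naive)           # [('2013-01-05', 705, 6)]
print(pre_aggregated)  # [('2013-01-05', 235, 3)]
```

In the naive result each campaign row is counted once per visit (100 x 3 + 135 x 3 = 705) and each visit once per campaign (6), while the pre-aggregated join returns the true daily totals.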

Answer 1: (score: 2)

One approach uses union all and group by:

select dte, sum(num_delivered) as num_delivered, sum(num_visits) as num_visits
from ((select creation_date as dte, sum(num_delivered) as num_delivered, 0 as num_visits
       from campaign
       group by creation_date
      ) union all
      (select visit_at, 0 as num_delivered, count(visit_key) as num_visits
       from webshop
       group by visit_at
      )
     ) cw
group by dte
order by dte;

I see no reason to have two date columns.

Another option, after aggregating, is a full outer join:

-- Each side is reduced to one row per day before the join, so no rows are multiplied.
select coalesce(c.creation_date, w.visit_at) as dte,
       coalesce(c.num_delivered, 0) as num_delivered, 
       coalesce(w.num_visits, 0) as num_visits
from (select creation_date, sum(num_delivered) as num_delivered
      from campaign
      group by creation_date
     ) c full outer join
     (select visit_at, count(visit_key) as num_visits
      from webshop
      group by visit_at
     ) w
     on w.visit_at = c.creation_date
order by dte;
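
To see how the aggregate-then-combine idea behaves when a day appears in only one of the tables, here is a minimal sketch of the union all variant. It assumes Python's built-in sqlite3 module in place of Redshift, and the sample rows are invented for illustration: 2013-01-06 has a campaign but no visits, and 2013-01-07 has a visit but no campaign.

```python
import sqlite3

# Hypothetical sample data; SQLite is used only so the query can be run locally.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaign (creation_date TEXT, num_delivered INTEGER);
    CREATE TABLE webshop (visit_key INTEGER, visit_at TEXT);
    INSERT INTO campaign VALUES ('2013-01-05', 100), ('2013-01-05', 135), ('2013-01-06', 50);
    INSERT INTO webshop VALUES (1, '2013-01-05'), (2, '2013-01-05'), (3, '2013-01-07');
""")

# Each branch yields one row per day with a zero in the column it cannot fill;
# the outer GROUP BY then merges the two branches into one row per day.
daily = conn.execute("""
    SELECT dte, SUM(num_delivered) AS num_delivered, SUM(num_visits) AS num_visits
    FROM (SELECT creation_date AS dte, SUM(num_delivered) AS num_delivered, 0 AS num_visits
          FROM campaign GROUP BY creation_date
          UNION ALL
          SELECT visit_at, 0, COUNT(visit_key)
          FROM webshop GROUP BY visit_at) AS cw
    GROUP BY dte
    ORDER BY dte
""").fetchall()

for row in daily:
    print(row)
# ('2013-01-05', 235, 2)
# ('2013-01-06', 50, 0)
# ('2013-01-07', 0, 1)
```

Days that exist in only one table still show up, with a zero in the other column, which is the same coverage the full outer join gives.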