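I'm sure I must be missing something obvious. I'm trying to line up two tables that hold different measurement data so I can analyze them, and when I join the two tables together my counts are greatly inflated.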
Here is the correct count from table1:
select line_item_id, sum(is_imp) as imps
from table1
where line_item_id=5993252
group by 1;
Here is the correct count from table2:
select cs_line_item_id, sum(grossImpressions) as cs_imps
from table2
where cs_line_item_id=5993252
group by 1;
When I join the tables together, my counts become inaccurate:
select a.line_item_id,sum(a.is_imp) as imps,sum(c.grossImpressions) as cs_imps
from table1 a join table2 c
ON a.line_item_id=c.cs_line_item_id
where a.line_item_id=5993252
group by 1;
Answer 0 (score: 2)
select a.*, b.imps as table2_imps
from (select line_item_id, sum(is_imp) as imps
      from table1
      group by 1) a
join (select cs_line_item_id, sum(grossImpressions) as imps
      from table2
      group by 1) b
  on a.line_item_id = b.cs_line_item_id;
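A minimal sketch of why the direct join inflates the sums and why aggregating each table first avoids it. It runs the SQL through Python's built-in sqlite3 module; the in-memory database and the row values are made up for illustration, and only the table and column names come from the question.

```python
import sqlite3

# In-memory database with made-up rows (an assumption for illustration);
# only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table1 (line_item_id integer, is_imp integer);
    create table table2 (cs_line_item_id integer, grossImpressions integer);
    insert into table1 values (5993252, 1), (5993252, 1), (5993252, 1);
    insert into table2 values (5993252, 10), (5993252, 20);
""")

# Direct join: each of the 3 table1 rows pairs with each of the 2 table2 rows,
# so every value is summed more than once.
naive = conn.execute("""
    select a.line_item_id, sum(a.is_imp), sum(c.grossImpressions)
    from table1 a
    join table2 c on a.line_item_id = c.cs_line_item_id
    group by a.line_item_id
""").fetchall()

# Aggregate each table down to one row per id first, then join one-to-one.
fixed = conn.execute("""
    select a.line_item_id, a.imps, b.cs_imps
    from (select line_item_id, sum(is_imp) as imps
          from table1
          group by line_item_id) a
    join (select cs_line_item_id, sum(grossImpressions) as cs_imps
          from table2
          group by cs_line_item_id) b
      on a.line_item_id = b.cs_line_item_id
""").fetchall()

print(naive)  # [(5993252, 6, 90)]  inflated: 3 * 2 and 30 * 3
print(fixed)  # [(5993252, 3, 30)]  matches the per-table totals
```

With these rows the direct join multiplies each table's sum by the number of matching rows in the other table, which is the kind of inflation the question describes.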
Answer 1 (score: 1)
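You are generating a Cartesian product for each line_item_id: every matching row in table1 is paired with every matching row in table2, so each value gets summed several times. There are two relatively simple ways to fix this. One uses full join, the other uses union all: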
select line_item_id, sum(imps) as imps, sum(grossImpressions) as cs_imps
from ((select a.line_item_id, sum(a.is_imp) as imps, 0 as grossImpressions
       from table1 a
       where a.line_item_id = 5993252
       group by a.line_item_id
      ) union all
      (select c.cs_line_item_id, 0 as imps, sum(c.grossImpressions) as grossImpressions
       from table2 c
       where c.cs_line_item_id = 5993252
       group by c.cs_line_item_id
      )
     ) ac
group by line_item_id;
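You can remove the where clauses from the subqueries to get totals for every line_item_id. Note that this works even when one table or the other has no matching rows for a given line_item_id.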
For performance, you really do want to filter before the group by.
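As a rough check of the claim about ids that exist in only one table, here is a small sketch of the union all approach without the where clauses. It uses Python's built-in sqlite3 module; the ids 1, 2 and 3 and all row values are hypothetical, chosen so that id 2 appears only in table1 and id 3 only in table2.

```python
import sqlite3

# Made-up rows (an assumption for illustration): id 1 is in both tables,
# id 2 only in table1, id 3 only in table2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table1 (line_item_id integer, is_imp integer);
    create table table2 (cs_line_item_id integer, grossImpressions integer);
    insert into table1 values (1, 1), (1, 1), (2, 1);
    insert into table2 values (1, 10), (3, 7);
""")

# Each branch aggregates one table and pads the other measure with 0,
# so no row from one table is ever paired with rows from the other.
totals = conn.execute("""
    select line_item_id, sum(imps) as imps, sum(grossImpressions) as cs_imps
    from (select line_item_id, sum(is_imp) as imps, 0 as grossImpressions
          from table1
          group by line_item_id
          union all
          select cs_line_item_id, 0, sum(grossImpressions)
          from table2
          group by cs_line_item_id) ac
    group by line_item_id
    order by line_item_id
""").fetchall()

print(totals)  # [(1, 2, 10), (2, 1, 0), (3, 0, 7)]
```

An id missing from one side shows 0 for that side's measure here, because the placeholder in each branch is 0 rather than null.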