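Different output from count with and without group by

Time: 2016-08-10 13:06:08

Tags: sql vertica

When I try to get the number of IDs, I get different answers depending on whether or not I group by day.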

select cv.CONV_DAY, count(distinct cv.CLICK_ID)
from
    clickcache.click cc
right join(
        select distinct cv.CLICK_ID, cv.CONV_DAY, cv.PIXEL_ID
        from clickcache.CONVERSION cv
        where cv.CLICK_ID IS NOT NULL) cv ON cv.CLICK_ID = cc.ID
where   cc.ADV_ACCOUNT_ID = 25176
    and cv.CONV_DAY between '2016-8-01' AND '2016-08-07' 
    and AMP_CLICK_STATUS_ID = 1
    AND pixel_id IN 
                   (SELECT DISTINCT conversion_pixel_id
                FROM
                    ampx.campaign_event_funnel ef
                JOIN ampx.campaign cp ON
                    cp.id = ef.campaign_id
                    AND cp.campaign_status_id = 1
                WHERE
                    ef.account_id IN(25176)  
                    AND include_optimization = 1 )
group by 1
order by 1 asc

This yields 170, which is the correct answer and the one I want. The following query, on the other hand, returns 157.

select count(distinct cv.CLICK_ID)
from
    clickcache.click cc
right join(
        select distinct cv.CLICK_ID, cv.CONV_DAY, cv.PIXEL_ID
        from clickcache.CONVERSION cv
        where cv.CLICK_ID IS NOT NULL) cv ON cv.CLICK_ID = cc.ID
where   cc.ADV_ACCOUNT_ID = 25176
    and cv.CONV_DAY between '2016-8-01' AND '2016-08-07' 
    and AMP_CLICK_STATUS_ID = 1
    AND pixel_id IN 
                   (SELECT DISTINCT conversion_pixel_id
                FROM
                    ampx.campaign_event_funnel ef
                JOIN ampx.campaign cp ON
                    cp.id = ef.campaign_id
                    AND cp.campaign_status_id = 1
                WHERE
                    ef.account_id IN(25176)  
                    AND include_optimization = 1 )

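My question is: why am I getting this discrepancy, and how can I fix it to get the correct count?

Thanks!

1 Answer:

Answer 0: (score: 1)

Your count depends on the query being correct. Maybe you have duplicate rows?

For example: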

table1
id  name  value
1   2     3

table2
id  name  value
1   4     5
2   6     3
1   6     3

Right joining the tables on value gives this result:

select * from table1 a right join table2 b on a.value = b.value 

a.id  a.name  a.value  b.id  b.name  b.value
NULL  NULL    NULL     1     4       5
1     2       3        2     6       3
1     2       3        1     6       3
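To reproduce the example, the two tables can be set up as in the sketch below. The column types are an assumption, since the answer only lists values, and the aliases `a_id` through `b_value` are introduced here so that the joined columns have unique names.

```sql
-- Column types are assumed; the answer only shows the values.
CREATE TABLE table1 (id INT, name INT, value INT);
CREATE TABLE table2 (id INT, name INT, value INT);

INSERT INTO table1 VALUES (1, 2, 3);
INSERT INTO table2 VALUES (1, 4, 5);
INSERT INTO table2 VALUES (2, 6, 3);
INSERT INTO table2 VALUES (1, 6, 3);

-- A right join keeps every table2 row; the table1 columns are NULL
-- for the table2 row whose value (5) has no match in table1.
SELECT a.id AS a_id, a.name AS a_name, a.value AS a_value,
       b.id AS b_id, b.name AS b_name, b.value AS b_value
FROM table1 a
RIGHT JOIN table2 b ON a.value = b.value;
```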

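    -- The derived table needs an alias, and its columns need unique names.
    select count(distinct t.a_value)
    from (select a.id as a_id, a.name as a_name, a.value as a_value,
                 b.id as b_id, b.name as b_name, b.value as b_value
          from table1 a right join table2 b on a.value = b.value) t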

The result is 1.

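    -- Same derived table, now counted once per b_id group.
    select t.b_id, count(distinct t.a_value)
    from (select a.id as a_id, a.name as a_name, a.value as a_value,
                 b.id as b_id, b.name as b_name, b.value as b_value
          from table1 a right join table2 b on a.value = b.value) t
    group by t.b_id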

The result is two rows, so the per-group counts add up to 2 even though there is only one distinct value overall:
2 1
1 1

My guess is that this is the problem you are running into.
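Applied to the question, one way this could happen is a CLICK_ID that converts on more than one CONV_DAY: it is counted once in each day's group but only once in the ungrouped query. The sketch below uses made-up rows in a hypothetical `conv` CTE (not the real clickcache.CONVERSION data) to show the effect and a check for it. Whether this is what happens in the real tables is an assumption that would need to be verified.

```sql
-- Hypothetical sample data: click 11 converts on two different days.
WITH conv AS (
    SELECT 10 AS click_id, '2016-08-01' AS conv_day
    UNION ALL SELECT 11, '2016-08-01'
    UNION ALL SELECT 11, '2016-08-02'
    UNION ALL SELECT 12, '2016-08-02'
)
SELECT
    -- What the ungrouped query counts: each click once (3 here).
    (SELECT COUNT(DISTINCT click_id) FROM conv) AS distinct_clicks,
    -- What the per-day counts add up to: each click once per day (4 here).
    (SELECT COUNT(*)
     FROM (SELECT DISTINCT click_id, conv_day FROM conv) p) AS distinct_click_days,
    -- Diagnostic: how many clicks appear on more than one day (1 here).
    (SELECT COUNT(*)
     FROM (SELECT click_id
           FROM conv
           GROUP BY click_id
           HAVING COUNT(DISTINCT conv_day) > 1) m) AS clicks_on_multiple_days;
```

If the sum of the daily counts is the number wanted, counting distinct (click_id, conv_day) pairs gives it in a single query. If each click should count only once across the whole date range, the ungrouped COUNT(DISTINCT ...) is the figure to use.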