I have two dimension tables and one fact table, as follows:
drop table if exists ref;
create table ref (country_id int not null, id_ref int not null);
insert into ref values(1,1);
insert into ref values(1,2);
drop table if exists conv;
create table conv (country_id int not null, id_ref int not null,id_conv int not null,item varchar(25));
insert into conv values (1,1,1,'AA');
insert into conv values (1,2,2,'CC');
insert into conv values(1,2,3,'CA');
insert into conv values(1,2,4,'CA');
drop table if exists fact;
create table fact as
select
r.country_id,c.item,
count(distinct r.id_ref) refs,
count(distinct c.id_conv) convs
from ref r
left join conv c
on r.country_id=c.country_id
and r.id_ref=c.id_ref
group by 1,2;
The query I use to get the result:
select f.country_id, sum(f.refs) refs,sum(f.convs) convs
from fact f
group by 1;
The query above returns 1,3,4
but I expected 1,2,4
How can I get the expected result, or is my understanding wrong?
Answer 0: (score: 0)
I think you have an error here:
create table fact as
select
r.country_id,c.item,
count(distinct r.id_ref) refs,
count(distinct c.id_conv) convs
from ref r
left join conv c
on r.country_id=r.country_id
and r.id_ref=c.id_ref
group by 1,2;
Please try
left join conv c
on r.country_id=c.country_id
and r.id_ref=c.id_ref
instead of
left join conv c
on r.country_id=r.country_id
and r.id_ref=c.id_ref
(the following part looks like a mistake: r.country_id=r.country_id
is an expression that is always true)
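To illustrate why the always-true condition matters, here is a minimal sketch using Python's built-in sqlite3 module. The database engine is an assumption (the question does not name one), the row for country 2 is hypothetical data that is not in the question, and joined_rows is just a helper name made up for this example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table ref (country_id int not null, id_ref int not null);
create table conv (country_id int not null, id_ref int not null,
                   id_conv int not null, item varchar(25));
insert into ref values (1,1);
insert into ref values (2,1);  -- hypothetical second country, not in the question
insert into conv values (1,1,1,'AA');
""")

def joined_rows(country_condition):
    # Run the left join with the given country condition and
    # return (country_id, id_conv) pairs.
    sql = f"""
        select r.country_id, c.id_conv
        from ref r
        left join conv c
          on {country_condition}
         and r.id_ref = c.id_ref
        order by r.country_id
    """
    return con.execute(sql).fetchall()

# Always-true condition: only id_ref is really compared.
tautology = joined_rows("r.country_id = r.country_id")
# Corrected condition: country_id must match as well.
corrected = joined_rows("r.country_id = c.country_id")

print(tautology)  # [(1, 1), (2, 1)]
print(corrected)  # [(1, 1), (2, None)]
```

With the always-true condition, the country 2 row picks up a conversion that belongs to country 1. With the sample data from the question there is only one country, so both versions of the join return the same rows there.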
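Answer 1: (score: 0)
The expectation for this query is wrong. You are joining the two tables on country_id and id_ref, which gives 4 matching records. After grouping by country and item there are three records, each counting the distinct ref ids for that item. id_ref 2 appears under both CA and CC, so summing refs over the items counts it twice and gives 3. The actual result is correct.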
country_id  item  refs  convs
1           AA    1     1
1           CA    1     2
1           CC    1     1
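The per-item rows and the summed totals can be reproduced with a small sketch. It uses Python's sqlite3 module as a stand-in engine, which is an assumption since the question does not say which database is in use; the tables and data are the ones from the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table ref (country_id int not null, id_ref int not null);
insert into ref values (1,1);
insert into ref values (1,2);
create table conv (country_id int not null, id_ref int not null,
                   id_conv int not null, item varchar(25));
insert into conv values (1,1,1,'AA');
insert into conv values (1,2,2,'CC');
insert into conv values (1,2,3,'CA');
insert into conv values (1,2,4,'CA');
create table fact as
select r.country_id, c.item,
       count(distinct r.id_ref) refs,
       count(distinct c.id_conv) convs
from ref r
left join conv c
  on r.country_id = c.country_id
 and r.id_ref = c.id_ref
group by 1, 2;
""")

# One row per (country, item); id_ref 2 is counted once under CA and once under CC.
fact_rows = con.execute(
    "select country_id, item, refs, convs from fact order by item"
).fetchall()

# Summing the per-item distinct counts adds id_ref 2 twice.
totals = con.execute(
    "select country_id, sum(refs), sum(convs) from fact group by 1"
).fetchall()

print(fact_rows)  # [(1, 'AA', 1, 1), (1, 'CA', 1, 2), (1, 'CC', 1, 1)]
print(totals)     # [(1, 3, 4)]
```

The convs total still comes out as 4 because each id_conv belongs to exactly one item, while id_ref 2 is shared by two items.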
To get the result you expect, the query would be:
select
r.country_id,
count(distinct r.id_ref) refs,
count(distinct c.id_conv) convs
from ref r
left join conv c
on r.country_id=c.country_id
and r.id_ref=c.id_ref
group by r.country_id;
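As a quick check, the query above can be run against the sample data from the question. This is a sketch that assumes SQLite through Python's sqlite3 module, since the question does not name the database engine.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table ref (country_id int not null, id_ref int not null);
insert into ref values (1,1);
insert into ref values (1,2);
create table conv (country_id int not null, id_ref int not null,
                   id_conv int not null, item varchar(25));
insert into conv values (1,1,1,'AA');
insert into conv values (1,2,2,'CC');
insert into conv values (1,2,3,'CA');
insert into conv values (1,2,4,'CA');
""")

# Count distinct ids per country directly on the joined rows,
# without going through the per-item fact table.
result = con.execute("""
    select r.country_id,
           count(distinct r.id_ref) refs,
           count(distinct c.id_conv) convs
    from ref r
    left join conv c
      on r.country_id = c.country_id
     and r.id_ref = c.id_ref
    group by r.country_id
""").fetchall()

print(result)  # [(1, 2, 4)]
```

Because the distinct count is taken at the country level, id_ref 2 is counted once even though it has conversions for two different items.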