Star schema aggregation problem

Asked: 2014-12-31 21:56:17

Tags: mysql sql group-by aggregation star-schema

I have two dimension tables and one fact table, as follows:

drop table if exists ref;
create table ref (country_id int not null, id_ref int not null);

insert into ref values(1,1);
insert into ref values(1,2);

drop table if exists conv;
create table conv (country_id int not null, id_ref int not null,id_conv int not null,item varchar(25));

insert into conv values (1,1,1,'AA');
insert into conv values (1,2,2,'CC');
insert into conv values(1,2,3,'CA');
insert into conv values(1,2,4,'CA');

drop table if exists fact;
create table fact as
select 
r.country_id,c.item,
count(distinct r.id_ref) refs,
count(distinct c.id_conv) convs
 from ref r
left join conv c
on r.country_id=c.country_id
and r.id_ref=c.id_ref
group by 1,2;

Query to get the result:

select f.country_id, sum(f.refs) refs,sum(f.convs) convs
from fact f
group by 1;

The query above returns 1, 3, 4 (country_id, refs, convs),

but I expected 1, 2, 4.

How can I get the expected result, or is my concept wrong?

2 answers:

Answer 0: (score: 0)

I think you have an error here:

create table fact as
select 
r.country_id,c.item,
count(distinct r.id_ref) refs,
count(distinct c.id_conv) convs
 from ref r
left join conv c
on r.country_id=r.country_id
and r.id_ref=c.id_ref
group by 1,2;

Please try

left join conv c
    on r.country_id=c.country_id
    and r.id_ref=c.id_ref

instead of

left join conv c
    on r.country_id=r.country_id
    and r.id_ref=c.id_ref

(The condition r.country_id=r.country_id looks like a mistake: it compares a column with itself, so it is always true.)
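To illustrate why the always-true condition matters, here is a minimal sketch using inline rows. The second country (country_id 2) and id_conv 5 are hypothetical values, not part of the question's data. In the question's own data every row has country_id 1, so both join conditions return the same rows; the difference only shows up once a second country reuses an id_ref.

```sql
-- Hypothetical data: two countries that both use id_ref 1.
select r.country_id, r.id_ref, c.country_id as conv_country_id, c.id_conv
from (select 1 as country_id, 1 as id_ref
      union all
      select 2, 1) r
left join (select 1 as country_id, 1 as id_ref, 1 as id_conv
           union all
           select 2, 1, 5) c
  on r.country_id = r.country_id   -- always true, so country is never compared
 and r.id_ref = c.id_ref
order by r.country_id, c.id_conv;
-- Each ref row matches the conv rows of both countries (4 rows).
-- With r.country_id = c.country_id only the 2 same-country pairs remain.
```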

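Answer 1: (score: 0)

The expectation for this query is wrong. You are joining the two tables on country_id and id_ref, which gives 4 matching rows. After grouping by country and item there are three rows, each counting the distinct id_ref values for that item. Because id_ref 2 appears under both CA and CC, it is counted once in each of those groups, so summing refs gives 3. The actual result is correct.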

country_id  item  refs  convs
1           AA    1     1
1           CA    1     2
1           CC    1     1
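The overlap is easier to see in the joined rows before any grouping. The query below is a small sketch against the question's ref and conv tables; only the column list and ordering are added here.

```sql
-- List the raw join rows that the fact table is built from.
select r.country_id, r.id_ref, c.id_conv, c.item
from ref r
left join conv c
  on r.country_id = c.country_id
 and r.id_ref = c.id_ref
order by r.id_ref, c.id_conv;
-- Rows for the sample data:
-- 1, 1, 1, AA
-- 1, 2, 2, CC
-- 1, 2, 3, CA
-- 1, 2, 4, CA
```

id_ref 2 contributes to both the CC group and the CA group, which is why a distinct count per item cannot simply be summed to get a distinct count per country.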

To get the result you expect, the query would be:



-- Count distinct refs and convs per country directly from the base tables
select
r.country_id,
count(distinct r.id_ref) refs,
count(distinct c.id_conv) convs
from ref r
left join conv c
on r.country_id=c.country_id
and r.id_ref=c.id_ref
group by r.country_id;
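If the totals have to come from a stored table rather than from ref and conv directly, one possible approach is to keep the stored table at a grain where the ids survive, and compute the distinct counts at query time. This is only a sketch: fact_detail is a hypothetical table name, and whether storing this level of detail is acceptable depends on data volume.

```sql
-- fact_detail is a hypothetical name, not part of the original question.
drop table if exists fact_detail;
create table fact_detail as
select r.country_id, c.item, r.id_ref, c.id_conv
from ref r
left join conv c
  on r.country_id = c.country_id
 and r.id_ref = c.id_ref;

-- Distinct counts are taken over the ids themselves,
-- so id_ref 2 is counted once per country.
select f.country_id,
       count(distinct f.id_ref) refs,
       count(distinct f.id_conv) convs
from fact_detail f
group by f.country_id;
```

For the sample data this returns 1, 2, 4. The same table can still be grouped by country_id and item to reproduce the per-item counts.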