Counting distinct names where either of two attributes is the key

Time: 2015-01-15 10:53:49

Tags: sql postgresql

Given this table:

create temp table stats (
    name text, country text, age integer
);

insert into stats values
('eric',  'se', 1),
('eric',  'dk', 4),
('johan', 'dk', 6),
('johan', 'uk', 7),
('johan', 'de', 3),
('dan',   'de', 3),
('dan',   'de', 3),
('dan',   'de', 4);

For each (country, age) pair in the table, I want the number of distinct names that have either that country or that age.

country  age  count
se       1    1
de       3    2
de       4    3
dk       4    3
dk       6    2
uk       7    1

For example, there are 3 distinct names that have country = dk (eric, johan) or age = 4 (eric, dan).
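
As a sanity check on a single row of the expected output, the dk/4 count can be reproduced with one filter. This is only a sketch against the sample data above, with the key values hard-coded:

```sql
-- Distinct names whose row matches country = 'dk' OR age = 4.
-- Matching rows: (eric, dk, 4), (johan, dk, 6), (dan, de, 4).
select count(distinct name)
from stats
where country = 'dk' or age = 4;
-- 3 (eric, johan, dan)
```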

So my question is: what is the best way to write this query?

I have this solution, but I find it ugly!

with country as (
 select count(distinct name), country
 from stats
 group by country
),
age as (
 select count(distinct name), age
 from stats
 group by age
),
country_and_age as(
 select count(distinct name), age, country
 from stats
 group by age, country
)
select country, age, c.count + a.count - ca.count as count
from country_and_age ca
join age a using (age)
join country c using (country);

Is there a better way?

3 answers:

Answer 0 (score: 3)

You can also join the table to itself:

SELECT
  s1.country,
  s1.age,
  COUNT(distinct s2.name)
FROM stats s1
JOIN stats s2 ON s1.country=s2.country OR s1.age=s2.age
GROUP by 1, 2;
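
A possible variation, not part of the original answer: because the sample data contains duplicate rows such as ('dan', 'de', 3), the keys could be de-duplicated before the join so that each (country, age) pair is matched against stats only once. The aliases k, s and name_count are made up for this sketch, and whether it is actually faster depends on the data and the planner.

```sql
-- Sketch: same counts as the self-join, but driven by the distinct
-- (country, age) keys instead of every row of stats.
SELECT
  k.country,
  k.age,
  COUNT(DISTINCT s.name) AS name_count
FROM (SELECT DISTINCT country, age FROM stats) k
JOIN stats s ON s.country = k.country OR s.age = k.age
GROUP BY k.country, k.age
ORDER BY k.country, k.age;
```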

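Answer 1 (score: 1)

Select the distinct (country, age) pairs from stats. For each pair, count how many distinct names appear in the rows that match either the country or the age.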

select
  country, 
  age,
  (
    select count(distinct name)
    from stats s 
    where s.country = t.country 
    or s.age = t.age
  ) as cnt
from (select distinct country, age from stats) t;
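
The same idea can also be written with a LATERAL join, assuming PostgreSQL 9.3 or later. This is an equivalent formulation offered as a sketch, not a claim that it performs better; the alias c is made up here.

```sql
-- Sketch: for each distinct (country, age) pair, the lateral subquery
-- counts the distinct names matching either attribute.
SELECT t.country, t.age, c.cnt
FROM (SELECT DISTINCT country, age FROM stats) t
CROSS JOIN LATERAL (
  SELECT COUNT(DISTINCT s.name) AS cnt
  FROM stats s
  WHERE s.country = t.country
     OR s.age = t.age
) c
ORDER BY t.country, t.age;
```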

Answer 2 (score: 0)

Personally, I don't like in-line queries, so I would do it like this:

SELECT DISTINCT
        *
FROM    ( SELECT    country ,
                    age ,
                    COUNT(*) OVER ( PARTITION BY country ) AS c_cnt ,
                    COUNT(*) OVER ( PARTITION BY age ) AS a_cnt
          FROM      stats
        ) a
WHERE   c_cnt > 0
        OR a_cnt > 0

I'm not sure about the performance on Postgres, but on SQL Server the "in-line" version is about 3 times slower (73% vs 27%).
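
To check this on PostgreSQL rather than rely on the SQL Server figures, one option is to prefix each candidate query with EXPLAIN ANALYZE and compare the reported plans and execution times. The sketch below does this for the in-line (correlated subquery) form; it assumes a realistically sized stats table, since timings on the 8-row sample will not be meaningful.

```sql
-- Sketch: runs the query and prints the actual plan with timings.
EXPLAIN ANALYZE
SELECT
  t.country,
  t.age,
  (
    SELECT COUNT(DISTINCT s.name)
    FROM stats s
    WHERE s.country = t.country
       OR s.age = t.age
  ) AS cnt
FROM (SELECT DISTINCT country, age FROM stats) t;
```

Running the same command on the other candidate queries gives plans that can be compared side by side.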