I have a table that looks like this:
control=# select * from animals;
age_range | weight | species
-----------+--------+---------
0-9 | 1 | lion
0-9 | 2 | lion
10-19 | 2 | tiger
10-19 | 3 | horse
20-29 | 2 | tiger
20-29 | 2 | zebra
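I run a query that sums the weight of the animals within each age-range group, and I only want to return rows whose aggregated weight is above a certain threshold.
Summary query: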
SELECT
age_range,
SUM(animals.weight) AS weight,
COUNT(DISTINCT animals.species) AS distinct_species
FROM animals
GROUP BY age_range
HAVING SUM(animals.weight) > 3;
Summary result:
age_range | weight | distinct_species
-----------+--------+------------------
10-19 | 5 | 2
20-29 | 4 | 2
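Now, here is the problem. In addition to this summary, I want to report the number of distinct species that went into producing the summary rows above. For simplicity, call this number the "Distinct Species Total". In this simple example, only 3 species (tiger, zebra, horse) contribute to the 2 summary rows, and "lion" does not, so the "Distinct Species Total" should be 3. But I cannot figure out how to query that number. Because the summary query has to use a HAVING clause to filter the grouped and aggregated rows, querying the "Distinct Species Total" becomes a problem.
This returns the wrong number, 2, because it is a distinct count of the per-group distinct counts (both groups have the value 2) rather than a count of species: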
SELECT
COUNT(DISTINCT distinct_species) AS distinct_species_total
FROM (
SELECT
age_range,
SUM(animals.weight) AS weight,
COUNT(DISTINCT animals.species) AS distinct_species
FROM animals
GROUP BY age_range
HAVING SUM(animals.weight) > 3
) x;
And of course this returns the wrong number, 4, because it ignores the HAVING filter applied to the grouped and aggregated summary rows:
SELECT
COUNT(DISTINCT species) AS distinct_species_total
FROM animals;
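Any help pointing me in the right direction is appreciated, and will hopefully help others with a similar problem. Ultimately, I need a solution that works on Amazon Redshift.
Answer 0 (score: 1)
Join the result set back to the original animals table and count the distinct species.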
select count(distinct y.species) as distinct_species_total
from
(
-- age ranges whose total weight passes the HAVING filter
select age_range, sum(animals.weight) as weight
from animals
group by age_range
having sum(animals.weight) > 3
) x
-- join back to the raw rows to recover the species in those age ranges
join animals y on x.age_range = y.age_range;
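The same idea can also be written without a join, by keeping only the raw rows whose age_range survives the HAVING filter and counting distinct species over those rows. Below is a minimal sketch that checks this against the sample data. It uses Python's built-in sqlite3 module purely as a stand-in so it can run anywhere; the column types and the in-memory database are my assumptions, and I have not run this on Redshift, although the query uses only standard SQL (COUNT(DISTINCT ...), GROUP BY, HAVING, IN with a subquery).

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for Redshift (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE animals (age_range TEXT, weight INTEGER, species TEXT)"
)

# The six sample rows from the question.
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?)",
    [
        ("0-9", 1, "lion"),
        ("0-9", 2, "lion"),
        ("10-19", 2, "tiger"),
        ("10-19", 3, "horse"),
        ("20-29", 2, "tiger"),
        ("20-29", 2, "zebra"),
    ],
)

# The subquery finds the age ranges that pass the HAVING filter;
# the outer query counts distinct species among the raw rows in those ranges.
QUERY = """
SELECT COUNT(DISTINCT species) AS distinct_species_total
FROM animals
WHERE age_range IN (
    SELECT age_range
    FROM animals
    GROUP BY age_range
    HAVING SUM(weight) > 3
)
"""

distinct_species_total = conn.execute(QUERY).fetchone()[0]
print(distinct_species_total)  # 3 (tiger, horse, zebra)
```

The key point in either form is that the distinct count has to run over the original species values restricted to the surviving groups, not over the already aggregated per-group counts.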