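I'm trying to count the distinct IDs that appear in my data each week, broken down by version type, and I'm not sure how to structure the query correctly.
I'd like to produce a table like this: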
week  1.1  1.2  1.3  1.4
wk1     1    5    4    8
wk2     4    3    9    8
wk3     1    8    0    6
I tried the query below, but it won't run: it wants the CASE expression in the GROUP BY, and then it won't accept Count() there.
SELECT
Case when version like "1.1%" then Count(distinct ID)
when version like "1.2%" then Count(distinct ID)
when version like "1.3%" then Count(distinct ID)
when version like "1.4%" then Count(distinct ID) end,
CAST(((datediff(timestamp_pst,'2016-01-03') / 7)+1) as INT) as week_of_the_year
FROM db.table
where timestamp_pst >= "2016-01-28"
group by CAST(((datediff(timestamp_pst,'2016-01-03') / 7)+1) as INT)
order by week_of_the_year
Answer 0 (score: 1)
SELECT
COUNT(DISTINCT (CASE WHEN version like '1.1%' THEN ID END)) as '1.1'
,COUNT(DISTINCT (CASE WHEN version like '1.2%' THEN ID END)) as '1.2'
,COUNT(DISTINCT (CASE WHEN version like '1.3%' THEN ID END)) as '1.3'
,COUNT(DISTINCT (CASE WHEN version like '1.4%' THEN ID END)) as '1.4'
,CAST(((datediff(timestamp_pst,'2016-01-03') / 7)+1) as INT) as week_of_the_year
FROM aws_d3.iaanalytics_detail
where timestamp_pst >= "2016-01-28"
group by CAST(((datediff(timestamp_pst,'2016-01-03') / 7)+1) as INT)
order by week_of_the_year
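You want to use "conditional aggregation", where the CASE expression sits inside the aggregate function. Because you want COUNT(DISTINCT), you need either the DISTINCT keyword inside the aggregate or a derived table that contains only distinct values, as another answer suggests. The derived table only saves you from repeating DISTINCT, so I don't see the need to complicate the query with one.
Note that SUM(CASE WHEN blah THEN 1 ELSE 0 END) will NOT work for you, because it sums every matching occurrence rather than counting distinct values. Also, aggregate functions ignore NULLs, and a CASE expression with no ELSE clause evaluates to NULL when nothing matches, which is why COUNT(DISTINCT CASE ... END) counts only the matching IDs.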
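To make the difference between the two forms concrete, here is a minimal sketch that runs both aggregates side by side. It uses Python's built-in sqlite3 module as a stand-in for the asker's engine, and the `events` table with its precomputed `week` column is hypothetical (the real query derives the week from timestamp_pst with datediff).

```python
import sqlite3

# Hypothetical in-memory table; `week` is precomputed for brevity.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, version TEXT, week INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "1.1.0", 4),
        (1, "1.1.0", 4),  # same ID seen twice in the same week
        (2, "1.1.2", 4),
        (3, "1.2.0", 4),
        (1, "1.1.0", 5),
    ],
)

rows = conn.execute(
    """
    SELECT week,
           -- CASE yields NULL for non-matching rows, and COUNT ignores NULLs
           COUNT(DISTINCT CASE WHEN version LIKE '1.1%' THEN id END) AS distinct_ids,
           -- SUM adds 1 for every matching row, duplicates included
           SUM(CASE WHEN version LIKE '1.1%' THEN 1 ELSE 0 END) AS matching_rows
    FROM events
    GROUP BY week
    ORDER BY week
    """
).fetchall()

print(rows)  # [(4, 2, 3), (5, 1, 1)]
```

In week 4 the SUM form reports 3 because ID 1 appears twice, while the COUNT(DISTINCT ...) form reports the 2 distinct IDs.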
Answer 1 (score: 0)
You can use the COUNT() aggregate function together with a conditional CASE expression.
SELECT
week_of_the_year
, COUNT(CASE WHEN version LIKE '1.1%' THEN id END) AS v1_1
, COUNT(CASE WHEN version LIKE '1.2%' THEN id END) AS v1_2
, COUNT(CASE WHEN version LIKE '1.3%' THEN id END) AS v1_3
, COUNT(CASE WHEN version LIKE '1.4%' THEN id END) AS v1_4
FROM (
SELECT
DISTINCT
id
, version
, CAST(((datediff(timestamp_pst,'2016-01-03') / 7)+1) as INT) as week_of_the_year
FROM aws_d3.iaanalytics_detail
where timestamp_pst >= '2016-01-28'
) t
GROUP BY week_of_the_year
ORDER BY week_of_the_year
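Note that the DISTINCT part of the query happens in the derived table t. The derived table isn't strictly required, but I find it a cleaner solution: the GROUP BY clause doesn't have to repeat the week expression, which makes the query more readable. It also moves the DISTINCT step out of the aggregates.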
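One caveat worth checking with the derived-table form: SELECT DISTINCT deduplicates on the full version string, so an ID seen under two different 1.1.x versions in the same week would be counted twice. The sketch below compares it with COUNT(DISTINCT ...) applied directly. It assumes SQLite via Python's sqlite3 module rather than the original engine, and the `events` table and precomputed `week` column are hypothetical.

```python
import sqlite3

# Hypothetical in-memory table; `week` is precomputed for brevity.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, version TEXT, week INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "1.1.0", 4),
        (1, "1.1.0", 4),  # exact duplicate, removed by SELECT DISTINCT
        (1, "1.1.5", 4),  # same ID under a different 1.1.x version
        (2, "1.2.0", 4),
    ],
)

# Derived-table form: DISTINCT on (id, version, week), then a plain COUNT.
derived = conn.execute(
    """
    SELECT week,
           COUNT(CASE WHEN version LIKE '1.1%' THEN id END) AS v1_1
    FROM (SELECT DISTINCT id, version, week FROM events) t
    GROUP BY week
    """
).fetchall()

# DISTINCT inside the aggregate: each ID counts once per version prefix.
direct = conn.execute(
    """
    SELECT week,
           COUNT(DISTINCT CASE WHEN version LIKE '1.1%' THEN id END) AS v1_1
    FROM events
    GROUP BY week
    """
).fetchall()

print(derived)  # [(4, 2)]
print(direct)   # [(4, 1)]
```

If version strings in the real data never vary within a prefix for a given ID and week, the two forms agree; otherwise selecting the version prefix instead of the full string in the derived table would be one way to align them.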
Answer 2 (score: 0)
Try this:
SELECT
SUM(Case when version like "1.1%" then 1 ELSE 0 END) as '1.1',
SUM(Case when version like "1.2%" then 1 ELSE 0 END) as '1.2',
SUM(Case when version like "1.3%" then 1 ELSE 0 END) as '1.3',
SUM(Case when version like "1.4%" then 1 ELSE 0 END) as '1.4',
CAST(((datediff(timestamp_pst,'2016-01-03') / 7)+1) as INT) as week_of_the_year
FROM aws_d3.iaanalytics_detail
where timestamp_pst >= "2016-01-28"
group by CAST(((datediff(timestamp_pst,'2016-01-03') / 7)+1) as INT)
order by week_of_the_year