I came across this question and am trying to understand how the given answer actually transforms the data.
Input table
+---------+-------+---------------+
| user_id | State | Subscriptions |
+---------+-------+---------------+
| 1 | LA | 4 |
| 2 | LA | 4 |
| 3 | LA | 12 |
| 4 | LA | 12 |
| 5 | LA | 8 |
| 6 | LA | 3 |
| 7 | NY | 14 |
| 8 | NY | 15 |
| 9 | NY | 3 |
| 10 | NY | 2 |
| 11 | NY | 4 |
| 12 | NY | 12 |
| 13 | OH | 6 |
| 14 | OH | 8 |
| 15 | OH | 2 |
| 16 | OH | 3 |
+---------+-------+---------------+
Output table
+--------------------+----+----+----+
| Subscription_Range | LA | NY | OH |
+--------------------+----+----+----+
| 1 to 4 | 3 | 3 | 2 |
| 5 to 11 | 1 | 0 | 2 |
| 12 to 15 | 2 | 3 | 0 |
+--------------------+----+----+----+
The answer given by Gordon Linoff:
    select (case when subscriptions <= 4 then '1 to 4'
                 when subscriptions <= 11 then '5 to 11'
                 when subscriptions <= 15 then '12 to 15'
            end) as subscription_range,
           sum(case when state = 'LA' then 1 else 0 end) as LA,
           sum(case when state = 'NY' then 1 else 0 end) as NY,
           sum(case when state = 'OH' then 1 else 0 end) as OH
    from t
    group by (case when subscriptions <= 4 then '1 to 4'
                   when subscriptions <= 11 then '5 to 11'
                   when subscriptions <= 15 then '12 to 15'
              end)
    order by min(subscriptions);
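I want to understand, at a fundamental level, how this query is executed.
For example:
- Does it look at the Subscriptions column first? (Since that is the first thing the query checks with case.)
- Say that yields 1 to 4. What happens next?
- Presumably it then checks the state, e.g. LA, but I don't know how execution proceeds from there.
I am trying to picture what the table looks like before aggregation. Does SQL run row by row? That is, is each row picked from the database and the corresponding part of the query applied to each column? (Here, case applied to the Subscriptions column.)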
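Answer 0: (score: 1)
The first part, the case on subscriptions, only produces the value for the subscription_range alias, and the same expression is repeated in the group by. The three parts for LA, NY, and OH use a fake aggregation function to simulate a pivot table.
Without the fake aggregation function, each value would sit on a separate row. Grouping reduces all rows with the same range to a single row, which gives the desired result.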
    select (case when subscriptions <= 4 then '1 to 4'
                 when subscriptions <= 11 then '5 to 11'
                 when subscriptions <= 15 then '12 to 15'
            end) as subscription_range,
           sum(case when state = 'LA' then 1 else 0 end) as LA,
           sum(case when state = 'NY' then 1 else 0 end) as NY,
           sum(case when state = 'OH' then 1 else 0 end) as OH
    from t
    group by (case when subscriptions <= 4 then '1 to 4'
                   when subscriptions <= 11 then '5 to 11'
                   when subscriptions <= 15 then '12 to 15'
              end)
    order by min(subscriptions);
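For comparison, it may help to look at a plain group by on both the range and the state, which produces the same counts in long form, one row per pair. This is only a sketch: the inline t built from a values list is a stand-in for the question's table and is written for PostgreSQL or SQLite (other databases may need the sample rows supplied differently), and the alias n is made up for illustration.

```sql
-- Sample rows from the question, supplied inline so the statement runs on its own.
-- Assumption: PostgreSQL/SQLite-style VALUES inside a CTE.
with t (user_id, state, subscriptions) as (
    values (1, 'LA', 4),  (2, 'LA', 4),  (3, 'LA', 12), (4, 'LA', 12),
           (5, 'LA', 8),  (6, 'LA', 3),  (7, 'NY', 14), (8, 'NY', 15),
           (9, 'NY', 3),  (10, 'NY', 2), (11, 'NY', 4), (12, 'NY', 12),
           (13, 'OH', 6), (14, 'OH', 8), (15, 'OH', 2), (16, 'OH', 3)
)
-- Long form: one output row per (range, state) pair that actually occurs.
select (case when subscriptions <= 4 then '1 to 4'
             when subscriptions <= 11 then '5 to 11'
             when subscriptions <= 15 then '12 to 15'
        end) as subscription_range,
       state,
       count(*) as n   -- hypothetical alias for the per-pair count
from t
group by (case when subscriptions <= 4 then '1 to 4'
               when subscriptions <= 11 then '5 to 11'
               when subscriptions <= 15 then '12 to 15'
          end),
         state
order by min(subscriptions), state;
```

With the sample data this should return seven rows. The two pairs with no users (5 to 11 for NY, 12 to 15 for OH) are simply absent, whereas the sum(case ... else 0 end) columns in the pivoted query report them as 0 on the row for that range.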
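Answer 1: (score: 1)
If the expressions in the select clause are evaluated before aggregation, you can imagine the following table being produced for the given data set: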
+--------------------+-----+----+----+
| subscription_range | LA | NY | OH |
+--------------------+-----+----+----+
| '1 to 4' | 1 | 0 | 0 |
| '1 to 4' | 1 | 0 | 0 |
| '12 to 15' | 1 | 0 | 0 |
| '12 to 15' | 1 | 0 | 0 |
| '5 to 11' | 1 | 0 | 0 |
| '1 to 4' | 1 | 0 | 0 |
| '12 to 15' | 0 | 1 | 0 |
| '12 to 15' | 0 | 1 | 0 |
| '1 to 4' | 0 | 1 | 0 |
| '1 to 4' | 0 | 1 | 0 |
| '1 to 4' | 0 | 1 | 0 |
| '12 to 15' | 0 | 1 | 0 |
| '5 to 11' | 0 | 0 | 1 |
| '5 to 11' | 0 | 0 | 1 |
| '1 to 4' | 0 | 0 | 1 |
| '1 to 4' | 0 | 0 | 1 |
+--------------------+-----+----+----+
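Here, for each row in the data set, the first case expression yields a string, and each subsequent case expression yields 1 or 0, depending on whether the state column satisfies the test expression.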
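If you want to see this intermediate table for real, one way is to run the same expressions with the sum(...) wrappers and the group by removed. Treat this as a sketch under assumptions: the inline t with a values list is a stand-in for the question's table (PostgreSQL or SQLite syntax), and user_id is added to the output only so each row can be traced back to its input row.

```sql
-- Sample rows from the question, supplied inline so the statement runs on its own.
-- Assumption: PostgreSQL/SQLite-style VALUES inside a CTE.
with t (user_id, state, subscriptions) as (
    values (1, 'LA', 4),  (2, 'LA', 4),  (3, 'LA', 12), (4, 'LA', 12),
           (5, 'LA', 8),  (6, 'LA', 3),  (7, 'NY', 14), (8, 'NY', 15),
           (9, 'NY', 3),  (10, 'NY', 2), (11, 'NY', 4), (12, 'NY', 12),
           (13, 'OH', 6), (14, 'OH', 8), (15, 'OH', 2), (16, 'OH', 3)
)
-- No aggregates and no group by: every input row yields exactly one output row.
select user_id,
       (case when subscriptions <= 4 then '1 to 4'
             when subscriptions <= 11 then '5 to 11'
             when subscriptions <= 15 then '12 to 15'
        end) as subscription_range,
       (case when state = 'LA' then 1 else 0 end) as LA,
       (case when state = 'NY' then 1 else 0 end) as NY,
       (case when state = 'OH' then 1 else 0 end) as OH
from t
order by user_id;
```

This should return 16 rows that line up with the table above, one per user, with the extra user_id column in front.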
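On aggregation, the query finds the sets of rows that share the same subscription_range value and groups by this data, so that each subscription_range appears only once.
Then the remaining numeric data in each column is summed over each group by the sum expression that encloses each case expression, yielding: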
+--------------------+----+----+----+
| subscription_range | LA | NY | OH |
+--------------------+----+----+----+
| 1 to 4 | 3 | 3 | 2 |
| 5 to 11 | 1 | 0 | 2 |
| 12 to 15 | 2 | 3 | 0 |
+--------------------+----+----+----+
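The two steps can also be written out explicitly, which might make the mental model easier to check. This is a sketch, not a description of what the engine physically does: a database is free to evaluate the query however it likes as long as the result is the same. The names per_row, la_flag, ny_flag, and oh_flag are made up for illustration, and the inline t with a values list is a stand-in for the question's table (PostgreSQL or SQLite syntax).

```sql
-- Sample rows from the question, supplied inline so the statement runs on its own.
-- Assumption: PostgreSQL/SQLite-style VALUES inside a CTE.
with t (user_id, state, subscriptions) as (
    values (1, 'LA', 4),  (2, 'LA', 4),  (3, 'LA', 12), (4, 'LA', 12),
           (5, 'LA', 8),  (6, 'LA', 3),  (7, 'NY', 14), (8, 'NY', 15),
           (9, 'NY', 3),  (10, 'NY', 2), (11, 'NY', 4), (12, 'NY', 12),
           (13, 'OH', 6), (14, 'OH', 8), (15, 'OH', 2), (16, 'OH', 3)
),
per_row as (
    -- Step 1: row-by-row expressions, no aggregation yet.
    select subscriptions,
           (case when subscriptions <= 4 then '1 to 4'
                 when subscriptions <= 11 then '5 to 11'
                 when subscriptions <= 15 then '12 to 15'
            end) as subscription_range,
           (case when state = 'LA' then 1 else 0 end) as la_flag,
           (case when state = 'NY' then 1 else 0 end) as ny_flag,
           (case when state = 'OH' then 1 else 0 end) as oh_flag
    from t
)
-- Step 2: collapse rows that share a range and add up the 1/0 flags.
select subscription_range,
       sum(la_flag) as LA,
       sum(ny_flag) as NY,
       sum(oh_flag) as OH
from per_row
group by subscription_range
order by min(subscriptions);   -- smallest value in each group keeps the ranges in numeric order
```

With the sample data this should give the same three rows as the table above.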