Question

I came across this question and am trying to understand how the given answer actually transforms the data.

Input table

+---------+-------+---------------+
| user_id | State | Subscriptions |
+---------+-------+---------------+
|       1 | LA    |             4 |
|       2 | LA    |             4 |
|       3 | LA    |            12 |
|       4 | LA    |            12 |
|       5 | LA    |             8 |
|       6 | LA    |             3 |
|       7 | NY    |            14 |
|       8 | NY    |            15 |
|       9 | NY    |             3 |
|      10 | NY    |             2 |
|      11 | NY    |             4 |
|      12 | NY    |            12 |
|      13 | OH    |             6 |
|      14 | OH    |             8 |
|      15 | OH    |             2 |
|      16 | OH    |             3 |
+---------+-------+---------------+

Output table

+--------------------+----+----+----+
| Subscription_Range | LA | NY | OH |
+--------------------+----+----+----+
| 1 to 4             |  3 |  3 |  2 |
| 5 to 11            |  1 |  0 |  2 |
| 12 to 15           |  2 |  3 |  0 |
+--------------------+----+----+----+

The answer given by Gordon Linoff:

    select (case when subscriptions <= 4 then '1 to 4'
                 when subscriptions <= 11 then '5 to 11'
                 when subscriptions <= 15 then '12 to 15'
            end) as subscription_range,
           sum(case when state = 'LA' then 1 else 0 end) as LA,
           sum(case when state = 'NY' then 1 else 0 end) as NY,
           sum(case when state = 'OH' then 1 else 0 end) as OH
    from t
    group by (case when subscriptions <= 4 then '1 to 4'
                   when subscriptions <= 11 then '5 to 11'
                   when subscriptions <= 15 then '12 to 15'
              end)
    order by min(subscriptions);

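I want to understand, at a fundamental level, how this query is executed.

For example:

When the first row is selected, is the Subscriptions column checked first? (It is the first thing the query tests with case.)
After the check, the row is found to belong to 1 to 4. What happens next?
Is the State column checked then? It turns out to be LA, but I don't know how execution proceeds from there. I'm trying to picture what the table looks like before aggregation.

Does SQL operate row by row? That is, is each row fetched from the database and the corresponding part of the query applied to each column? (In this case, case being applied to the Subscriptions column.)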

Answer 1

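The first case expression, on subscriptions, simply produces the value for the subscription_range alias, and the same expression is repeated in the group by clause to define the groups. The three expressions for LA, NY and OH use a "fake" aggregate (a sum over a case that yields 1 or 0) to simulate a pivot table.

Without the aggregate functions, each value would sit on its own row. The group by collapses all rows that share the same range into a single row, which gives the desired layout.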

    select (case when subscriptions <= 4 then '1 to 4'
                 when subscriptions <= 11 then '5 to 11'
                 when subscriptions <= 15 then '12 to 15'
            end) as subscription_range,
           sum(case when state = 'LA' then 1 else 0 end) as LA,
           sum(case when state = 'NY' then 1 else 0 end) as NY,
           sum(case when state = 'OH' then 1 else 0 end) as OH
    from t
    group by (case when subscriptions <= 4 then '1 to 4'
                   when subscriptions <= 11 then '5 to 11'
                   when subscriptions <= 15 then '12 to 15'
              end)
    order by min(subscriptions);
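
To see the collapsing effect of the group by in practice, here is a minimal sketch that loads the question's data into an in-memory SQLite database through Python's sqlite3 module and runs the query both without and with the aggregation. SQLite is my assumption (the question does not name a database), and the names ROWS and RANGE_CASE are only for illustration.

```python
import sqlite3

# Sample rows from the question: (user_id, state, subscriptions).
ROWS = [
    (1, "LA", 4), (2, "LA", 4), (3, "LA", 12), (4, "LA", 12),
    (5, "LA", 8), (6, "LA", 3), (7, "NY", 14), (8, "NY", 15),
    (9, "NY", 3), (10, "NY", 2), (11, "NY", 4), (12, "NY", 12),
    (13, "OH", 6), (14, "OH", 8), (15, "OH", 2), (16, "OH", 3),
]

# The range expression shared by the select list and the group by clause.
RANGE_CASE = """(case when subscriptions <= 4 then '1 to 4'
                      when subscriptions <= 11 then '5 to 11'
                      when subscriptions <= 15 then '12 to 15'
                 end)"""

conn = sqlite3.connect(":memory:")  # SQLite is an assumption, not from the question
conn.execute("create table t (user_id integer, state text, subscriptions integer)")
conn.executemany("insert into t values (?, ?, ?)", ROWS)

# Without sum(...) and group by: one output row per input row.
ungrouped = conn.execute(f"""
    select {RANGE_CASE} as subscription_range,
           case when state = 'LA' then 1 else 0 end as LA,
           case when state = 'NY' then 1 else 0 end as NY,
           case when state = 'OH' then 1 else 0 end as OH
    from t
""").fetchall()

# With sum(...) and group by: rows sharing a range collapse into one row.
pivot = conn.execute(f"""
    select {RANGE_CASE} as subscription_range,
           sum(case when state = 'LA' then 1 else 0 end) as LA,
           sum(case when state = 'NY' then 1 else 0 end) as NY,
           sum(case when state = 'OH' then 1 else 0 end) as OH
    from t
    group by {RANGE_CASE}
    order by min(subscriptions)
""").fetchall()
conn.close()

print(len(ungrouped), "rows before grouping,", len(pivot), "rows after")
for row in pivot:
    print(row)
```

The ungrouped query returns 16 rows, one per user, while the grouped query returns one row per range.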

Answer 2

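If you evaluate the expressions in the select clause before any aggregation takes place, you can imagine obtaining the following table for the given dataset: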

+--------------------+-----+----+----+
| subscription_range | LA  | NY | OH |
+--------------------+-----+----+----+
| '1 to 4'           |   1 |  0 |  0 |
| '1 to 4'           |   1 |  0 |  0 |
| '12 to 15'         |   1 |  0 |  0 |
| '12 to 15'         |   1 |  0 |  0 |
| '5 to 11'          |   1 |  0 |  0 |
| '1 to 4'           |   1 |  0 |  0 |
| '12 to 15'         |   0 |  1 |  0 |
| '12 to 15'         |   0 |  1 |  0 |
| '1 to 4'           |   0 |  1 |  0 |
| '1 to 4'           |   0 |  1 |  0 |
| '1 to 4'           |   0 |  1 |  0 |
| '12 to 15'         |   0 |  1 |  0 |
| '5 to 11'          |   0 |  0 |  1 |
| '5 to 11'          |   0 |  0 |  1 |
| '1 to 4'           |   0 |  0 |  1 |
| '1 to 4'           |   0 |  0 |  1 |
+--------------------+-----+----+----+

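Here, for each row in the dataset, the first case expression produces a string, and each subsequent case expression produces a 1 or a 0 depending on whether the state column satisfies the test expression.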
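
As a rough illustration of this per-row step, the case expressions can be mirrored in plain Python. This is only a sketch of the logic, not of how any database engine schedules its work; the function names subscription_range and project_row are hypothetical.

```python
def subscription_range(subscriptions):
    """Mirror the first case expression: the first matching branch wins."""
    if subscriptions <= 4:
        return "1 to 4"
    if subscriptions <= 11:
        return "5 to 11"
    if subscriptions <= 15:
        return "12 to 15"
    return None  # a case expression without else yields NULL


def project_row(state, subscriptions):
    """Evaluate the four select-list expressions for a single input row."""
    return (
        subscription_range(subscriptions),
        1 if state == "LA" else 0,
        1 if state == "NY" else 0,
        1 if state == "OH" else 0,
    )


# A few rows from the question's table: (state, subscriptions).
sample = [("LA", 4), ("LA", 12), ("NY", 3), ("OH", 6)]
for state, subs in sample:
    print(state, subs, "->", project_row(state, subs))
```

Each expression depends only on the current row, so the order in which they are evaluated within a row does not change the result.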

During aggregation, the query identifies the rows that share the same subscription_range value and groups them by it, so that each subscription_range appears only once.

The remaining numeric values in each column are then summed within each group by the sum function wrapping each case expression, yielding:

+--------------------+----+----+----+
| subscription_range | LA | NY | OH |
+--------------------+----+----+----+
| 1 to 4             |  3 |  3 |  2 |
| 5 to 11            |  1 |  0 |  2 |
| 12 to 15           |  2 |  3 |  0 |
+--------------------+----+----+----+
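
The grouping and summing that lead to this result can be imitated with a dictionary in plain Python. Again, this is a conceptual sketch under my own naming (ROWS, STATES, groups), not a description of a real query planner. It also tracks the smallest subscriptions value per group so that the order by min(subscriptions) step can be reproduced.

```python
# (state, subscriptions) pairs from the question's input table.
ROWS = [
    ("LA", 4), ("LA", 4), ("LA", 12), ("LA", 12), ("LA", 8), ("LA", 3),
    ("NY", 14), ("NY", 15), ("NY", 3), ("NY", 2), ("NY", 4), ("NY", 12),
    ("OH", 6), ("OH", 8), ("OH", 2), ("OH", 3),
]
STATES = ("LA", "NY", "OH")


def subscription_range(subscriptions):
    """Mirror the case expression used in both select and group by."""
    if subscriptions <= 4:
        return "1 to 4"
    if subscriptions <= 11:
        return "5 to 11"
    if subscriptions <= 15:
        return "12 to 15"
    return None


# range label -> {"counts": [LA, NY, OH], "min": smallest subscriptions seen}
groups = {}
for state, subs in ROWS:
    label = subscription_range(subs)
    group = groups.setdefault(label, {"counts": [0, 0, 0], "min": subs})
    # sum(case when state = ... then 1 else 0 end): add a 1 or a 0 per column.
    for i, name in enumerate(STATES):
        group["counts"][i] += 1 if state == name else 0
    group["min"] = min(group["min"], subs)

# order by min(subscriptions): sort the groups by their smallest value.
result = [
    (label, *g["counts"])
    for label, g in sorted(groups.items(), key=lambda item: item[1]["min"])
]
for row in result:
    print(row)
```

Sorting by the group minimum is what puts '5 to 11' before '12 to 15'; sorting the labels as strings would place '12 to 15' first.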

How does an SQL query transform data?
