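I'm trying to figure out whether the query I want is feasible in SQL at all, or whether I need to pull the raw data and process it in my application.
My schema looks like this: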
applications
================
id INT
application_steps
=================
id INT
application_id INT
step_id INT
activated_at DATE
completed_at DATE
steps
=====
id INT
step_type_id INT
Ideally, given this data in application_steps:
| id | application_id | step_id | activated_at | completed_at |
| 1 | 1 | 1 | 2013-01-01 | 2013-01-02 |
| 2 | 1 | 2 | 2013-01-02 | 2013-01-02 |
| 3 | 1 | 3 | 2013-01-02 | 2013-01-10 |
| 4 | 1 | 4 | 2013-01-10 | 2013-01-11 |
| 5 | 2 | 1 | 2013-02-02 | 2013-02-02 |
| 6 | 2 | 2 | 2013-02-02 | 2013-02-07 |
| 7 | 2 | 4 | 2013-02-09 | 2013-02-11 |
I would like to get this result:
| application_id | step_1_days | step_2_days | step_3_days | step_4_days |
| 1 | 1 | 0 | 8 | 1 |
| 2 | 0 | 5 | NULL | 2 |
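Note that in reality I'd be looking at many more steps and many more applications.
As you can see, there is a has-many relationship between applications and application_steps. A given step may also go unused for a particular application. I'd like to get the time each step took (using DATEDIFF(completed_at, activated_at)), all in one row per application (column names don't matter). Is this possible?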
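Secondary question: to complicate things further, I also need a second query that joins application_steps with steps and only returns data for steps with a particular step_type_id. Assuming the first part is possible, how can I extend it to filter efficiently?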
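Note: efficiency is key here. This is for a yearly report, which in production works out to about 2,500 applications, 70 steps and 44,000 application_steps (not a huge amount of data, but potentially a lot once joins are factored in).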
Answer 0: (score: 1)
This should be a basic "pivot" aggregation:
select application_id,
max(case when step_id = 1 then datediff(completed_at, activated_at) end) as step_1_days,
max(case when step_id = 2 then datediff(completed_at, activated_at) end) as step_2_days,
max(case when step_id = 3 then datediff(completed_at, activated_at) end) as step_3_days,
max(case when step_id = 4 then datediff(completed_at, activated_at) end) as step_4_days
from application_steps s
group by application_id;
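You would have to repeat the max(case ...) column for each of the 70 steps. A step that an application never used has no matching row, so its column comes back as NULL.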
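Writing 70 near-identical columns by hand is tedious, so one option is to generate the statement in the application. The following is only a sketch in Python: the helper name build_pivot_query is made up for illustration, and it assumes the step ids are plain integers read from the steps table.

```python
def build_pivot_query(step_ids):
    """Return the pivot query text for the given step ids.

    Hypothetical helper, not part of any library. Each id is coerced to
    int so that only numbers are ever interpolated into the SQL text.
    """
    columns = []
    for step_id in step_ids:
        step_id = int(step_id)
        columns.append(
            f"max(case when step_id = {step_id} "
            f"then datediff(completed_at, activated_at) end) "
            f"as step_{step_id}_days"
        )
    select_list = ",\n       ".join(columns)
    return (
        f"select application_id,\n       {select_list}\n"
        "from application_steps\n"
        "group by application_id;"
    )


if __name__ == "__main__":
    # Print the statement for steps 1 through 4, as in the example above.
    print(build_pivot_query([1, 2, 3, 4]))
```

Calling build_pivot_query(range(1, 71)) would produce the full 70-column statement, which can then be sent to MySQL like any other query.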
To do this only for steps of a particular type, join to steps and put the filter in the join condition:
select application_id,
max(case when step_id = 1 then datediff(completed_at, activated_at) end) as step_1_days,
max(case when step_id = 2 then datediff(completed_at, activated_at) end) as step_2_days,
max(case when step_id = 3 then datediff(completed_at, activated_at) end) as step_3_days,
max(case when step_id = 4 then datediff(completed_at, activated_at) end) as step_4_days
from application_steps s join
steps
on s.step_id = steps.id and
steps.step_type_id = XXX
group by application_id;
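To sanity-check both queries against the sample data from the question, here is a small self-contained sketch using Python's built-in sqlite3 module. Several things in it are assumptions rather than part of the original setup: SQLite has no datediff, so a julianday difference stands in for it; the step_type_id values 10 and 20 are invented for the demo; and run_report is a hypothetical helper name.

```python
import sqlite3

# Sample rows from the question:
# (id, application_id, step_id, activated_at, completed_at)
APPLICATION_STEPS = [
    (1, 1, 1, "2013-01-01", "2013-01-02"),
    (2, 1, 2, "2013-01-02", "2013-01-02"),
    (3, 1, 3, "2013-01-02", "2013-01-10"),
    (4, 1, 4, "2013-01-10", "2013-01-11"),
    (5, 2, 1, "2013-02-02", "2013-02-02"),
    (6, 2, 2, "2013-02-02", "2013-02-07"),
    (7, 2, 4, "2013-02-09", "2013-02-11"),
]

# Hypothetical step types, invented for this demo: (id, step_type_id)
STEPS = [(1, 10), (2, 10), (3, 20), (4, 20)]


def days_column(step_id):
    # SQLite stand-in for MySQL's datediff(completed_at, activated_at).
    return (
        f"max(case when s.step_id = {int(step_id)} then "
        "cast(julianday(s.completed_at) - julianday(s.activated_at) as integer) end)"
    )


def run_report(step_type_id=None):
    """Run the pivot, optionally restricted to a single step type."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table application_steps (id int, application_id int, "
        "step_id int, activated_at date, completed_at date)"
    )
    conn.execute("create table steps (id int, step_type_id int)")
    conn.executemany(
        "insert into application_steps values (?, ?, ?, ?, ?)", APPLICATION_STEPS
    )
    conn.executemany("insert into steps values (?, ?)", STEPS)

    columns = ", ".join(days_column(i) for i in (1, 2, 3, 4))
    sql = (
        f"select s.application_id, {columns} "
        "from application_steps s join steps on s.step_id = steps.id"
    )
    params = ()
    if step_type_id is not None:
        # Same filter as in the answer, placed in the join condition.
        sql += " and steps.step_type_id = ?"
        params = (step_type_id,)
    sql += " group by s.application_id order by s.application_id"

    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_report())    # [(1, 1, 0, 8, 1), (2, 0, 5, None, 2)]
    print(run_report(10))  # [(1, 1, 0, None, None), (2, 0, 5, None, None)]
```

The unfiltered run reproduces the result table from the question, with None where SQL returns NULL. With the filter, columns for steps of other types simply come back NULL, so in practice you would only generate columns for the steps of the type you are reporting on.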