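I'm trying to write a fairly complex query. I have a database table of blocks. Each block has a start date, an end date, and the module it belongs to. I need to calculate the turnover, which is the difference between consecutive blocks (for block[i]):
block[i].start - block[i - 1].end
Let's take an example. I have this data: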
create table blocks (start datetime, end datetime, module integer);
insert into blocks (start, end, module)
values
('2016-04-13 09:00:00', '2016-04-13 10:00:00', 1), -- diff: null or 0
('2016-04-13 11:00:00', '2016-04-13 12:00:00', 1), -- diff: 1hour
('2016-04-13 12:30:00', '2016-04-13 14:00:00', 1), -- diff: 30minutes
-- turnoverAvg: 45min = (1h + 30min) / 2
('2016-04-13 09:00:00', '2016-04-13 10:00:00', 2), -- diff: null or 0
('2016-04-13 12:00:00', '2016-04-13 12:30:00', 2), -- diff: 2hour
('2016-04-13 13:30:00', '2016-04-13 14:30:00', 2), -- diff: 1hour
-- turnoverAvg: 90min = (2h + 1h) / 2
('2016-04-14 14:30:00', '2016-04-14 16:00:00', 2), -- diff: null or 0
('2016-04-14 17:00:00', '2016-04-14 18:00:00', 2), -- diff: 1hour
-- turnoverAvg: 60min = 1h/1
('2016-04-13 09:00:00', '2016-04-13 10:00:00', 3), -- diff: null or 0
('2016-04-13 10:00:00', '2016-04-13 11:00:00', 3), -- diff: 0
('2016-04-13 12:00:00', '2016-04-13 13:00:00', 3), -- diff: 1hour
('2016-04-13 14:00:00', '2016-04-13 15:00:00', 3), -- diff: 1hour
('2016-04-13 16:00:00', '2016-04-13 17:00:00', 3), -- diff: 1hour
-- turnoverAvg: 45min = (0 + 1h + 1h + 1h) / 4
('2016-04-13 09:00:00', '2016-04-13 10:00:00', 4), -- diff: null or 0
-- turnoverAvg: null
('2016-04-13 09:00:00', '2016-04-13 15:00:00', 5), -- diff: null or 0
('2016-04-13 19:00:00', '2016-04-13 20:00:00', 5); -- diff: 4hour
-- turnoverAvg: 240min = 4h/1
I need a query along these lines (pseudocode):
SELECT turnoverAVG (rows of each group by)
FROM blocks
GROUP BY DATE (start), module
where turnoverAVG would be a function like this (pseudocode):
function turnoverAVG(rows):
    acc = 0.0
    for(i=1; i < rows.length; i++)
        d = rows[i].start - rows[i - 1].end
        acc += d
    return acc/(rows.length - 1)
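For reference, the pseudocode can be sketched as runnable Python. This is only an illustration under some assumptions: the rows are `(start, end)` pairs already sorted by start, the helper names `turnover_avg` and `ts` are made up for this example, and a group with a single block returns `None` instead of dividing by zero.

```python
from datetime import datetime, timedelta


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string (hypothetical helper)."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


def turnover_avg(rows):
    """Average gap between consecutive blocks.

    rows: list of (start, end) datetime pairs, assumed sorted by start.
    Returns a timedelta, or None when there is no previous block to compare.
    """
    if len(rows) < 2:
        return None
    acc = timedelta(0)
    for i in range(1, len(rows)):
        # gap between this block's start and the previous block's end
        acc += rows[i][0] - rows[i - 1][1]
    return acc / (len(rows) - 1)


# Module 1 on 2016-04-13, taken from the sample data
module_1 = [
    (ts("2016-04-13 09:00:00"), ts("2016-04-13 10:00:00")),
    (ts("2016-04-13 11:00:00"), ts("2016-04-13 12:00:00")),
    (ts("2016-04-13 12:30:00"), ts("2016-04-13 14:00:00")),
]
print(turnover_avg(module_1))  # 0:45:00
```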
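I've actually tried a lot of things, but I don't know where to start... If anyone has an idea, I'd be very grateful.
Edit
The output would look something like: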
turnoverAVG, module, day
45min, 1, 2016-04-13
1:30hour, 2, 2016-04-13
1hour, 2, 2016-04-14 -- different day but same module
45min, 3, 2016-04-13
4hour, 5, 2016-04-13
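It would be fine if turnoverAVG were in minutes, but I've written it this way to make it easier to understand. As you can see, the first block is never counted, because there is no previous block to subtract from.
Answer 0 (score: 1)
Functions like this are called window functions. They are only available from MySQL 8 onwards.
Before that, you have to find another way to write the query; see e.g. this question. Most of the time you would do it with variables, although the SQL way is to use a join.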
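To illustrate the window-function route, here is a rough sketch using `LAG`. It makes several assumptions: an in-memory SQLite database (via Python's `sqlite3`, which needs SQLite 3.25 or newer for window functions) stands in for MySQL 8, the columns are renamed `start_ts` and `end_ts`, and the time arithmetic uses SQLite's `strftime('%s', ...)`. On MySQL 8 the `LAG(...) OVER (...)` part would look the same, and the difference could be taken with `TIMESTAMPDIFF(MINUTE, prev_end, start)` instead.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL 8 so the sketch can run anywhere;
# this substitution and the column names start_ts / end_ts are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blocks (start_ts TEXT, end_ts TEXT, module INTEGER)")
rows = [
    ("2016-04-13 09:00:00", "2016-04-13 10:00:00", 1),
    ("2016-04-13 11:00:00", "2016-04-13 12:00:00", 1),
    ("2016-04-13 12:30:00", "2016-04-13 14:00:00", 1),
    ("2016-04-13 09:00:00", "2016-04-13 10:00:00", 2),
    ("2016-04-13 12:00:00", "2016-04-13 12:30:00", 2),
    ("2016-04-13 13:30:00", "2016-04-13 14:30:00", 2),
    ("2016-04-14 14:30:00", "2016-04-14 16:00:00", 2),
    ("2016-04-14 17:00:00", "2016-04-14 18:00:00", 2),
    ("2016-04-13 09:00:00", "2016-04-13 10:00:00", 3),
    ("2016-04-13 10:00:00", "2016-04-13 11:00:00", 3),
    ("2016-04-13 12:00:00", "2016-04-13 13:00:00", 3),
    ("2016-04-13 14:00:00", "2016-04-13 15:00:00", 3),
    ("2016-04-13 16:00:00", "2016-04-13 17:00:00", 3),
    ("2016-04-13 09:00:00", "2016-04-13 10:00:00", 4),
    ("2016-04-13 09:00:00", "2016-04-13 15:00:00", 5),
    ("2016-04-13 19:00:00", "2016-04-13 20:00:00", 5),
]
conn.executemany("INSERT INTO blocks VALUES (?, ?, ?)", rows)

QUERY = """
WITH with_prev AS (
    SELECT module,
           DATE(start_ts) AS day,
           start_ts,
           -- end of the previous block of the same module on the same day
           LAG(end_ts) OVER (
               PARTITION BY module, DATE(start_ts)
               ORDER BY start_ts
           ) AS prev_end
    FROM blocks
)
SELECT module,
       day,
       -- strftime('%s', ...) converts to epoch seconds in SQLite
       AVG((strftime('%s', start_ts) - strftime('%s', prev_end)) / 60.0)
           AS turnover_avg
FROM with_prev
WHERE prev_end IS NOT NULL   -- the first block of a group has no gap
GROUP BY module, day
ORDER BY module, day
"""

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

With the sample data this should give 45, 90, 60, 45 and 240 minutes for the five module/day groups that have at least two blocks; module 4 drops out because its only block has no predecessor.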
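But in your specific case you don't actually need any of this: the turnover time is not only the sum of the gaps between modules (for which you would need to know the previous row), it is also the time between the start and the end of the day (for which you only need min and max) minus the time the modules were running (for which you don't need the previous row).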
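The equivalence can be checked on the sample data. The sketch below is only an illustration (the helper names `ts`, `minutes`, `gap_sum` and `span_minus_busy` are made up for this example): it computes the sum of gaps row by row, then the day span minus the total running time, for module 3 on 2016-04-13. It assumes the blocks are sorted by start and that the last block is also the one that ends last.

```python
from datetime import datetime


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string (hypothetical helper)."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


def minutes(delta):
    """Convert a timedelta to minutes."""
    return delta.total_seconds() / 60


def gap_sum(blocks):
    """Sum of start[i] - end[i - 1]; needs the previous row."""
    return sum(
        minutes(blocks[i][0] - blocks[i - 1][1]) for i in range(1, len(blocks))
    )


def span_minus_busy(blocks):
    """(max(end) - min(start)) - sum(end - start); only aggregates."""
    span = minutes(max(e for _, e in blocks) - min(s for s, _ in blocks))
    busy = sum(minutes(e - s) for s, e in blocks)
    return span - busy


# Module 3 on 2016-04-13, taken from the sample data
module_3 = [
    (ts("2016-04-13 09:00:00"), ts("2016-04-13 10:00:00")),
    (ts("2016-04-13 10:00:00"), ts("2016-04-13 11:00:00")),
    (ts("2016-04-13 12:00:00"), ts("2016-04-13 13:00:00")),
    (ts("2016-04-13 14:00:00"), ts("2016-04-13 15:00:00")),
    (ts("2016-04-13 16:00:00"), ts("2016-04-13 17:00:00")),
]
print(gap_sum(module_3), span_minus_busy(module_3))  # 180.0 180.0
print(span_minus_busy(module_3) / (len(module_3) - 1))  # 45.0
```

Both routes give 180 minutes of total turnover, and dividing by the four gaps gives the 45 minutes from the question.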
So try this:
select
module,
date(start),
case when count(module) > 1
then (TIMESTAMPDIFF(Minute,min(start),max(end)) -
sum(TIMESTAMPDIFF(Minute,start, end)))
/ (count(module) - 1)
else null
end as turnoverAVG,
-- details, just for information:
TIMESTAMPDIFF(Minute,min(start),max(end)) as total_day,
sum(TIMESTAMPDIFF(Minute,start, end)) as module_duration,
TIMESTAMPDIFF(Minute,min(start),max(end)) -
sum(TIMESTAMPDIFF(Minute,start, end)) as turnover,
count(module) as cnt
from blocks
group by date(start), module;
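The 4 additional columns are only there so you can see the different terms used in the calculation; you can remove them.
All modules need to start and end on the same day (although you can easily modify the query to support overnight modules). It also doesn't correct the time if modules overlap (but neither does your pseudocode).
It is not entirely clear whether you want to include days that contain only one module (as in the comment for module 4) or not (as in the sample output). If you want to exclude them, you can add the following at the end of the query: <{1}>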