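I am trying to write a script in Python that takes data from the Data table and populates the Max table. Given the data in the Data table, the script should populate the Max table as shown below.
max_f (f for future) is the maximum of the 4 items after the current item.
max_p (p for past) is the maximum of the 4 items before the current item.
Example for the item at 2013-08-13 13:19:
max_f is the maximum of 21, 24, 28 and 30.
max_p is the maximum of 25, 23, 27 and 26.
The first 4 max_p entries and the last 4 max_f entries should be n/a, because the Data table does not have enough values to compute the corresponding maximum.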
Data                              | Max
id  datetime           value      | id  datetime           max_f  max_p
1   13-Aug-2013 13:15  25         | 1   13-Aug-2013 13:15  27     n/a
2   13-Aug-2013 13:16  23         | 2   13-Aug-2013 13:16  27     n/a
3   13-Aug-2013 13:17  27         | 3   13-Aug-2013 13:17  26     n/a
4   13-Aug-2013 13:18  26         | 4   13-Aug-2013 13:18  28     n/a
5   13-Aug-2013 13:19  25         | 5   13-Aug-2013 13:19  30     27
6   13-Aug-2013 13:20  21         | 6   13-Aug-2013 13:20  31     27
7   13-Aug-2013 13:21  24         | 7   13-Aug-2013 13:21  31     27
8   13-Aug-2013 13:22  28         | 8   13-Aug-2013 13:22  n/a    26
9   13-Aug-2013 13:23  30         | 9   13-Aug-2013 13:23  n/a    28
10  13-Aug-2013 13:24  31         | 10  13-Aug-2013 13:24  n/a    30
11  13-Aug-2013 13:25  29         | 11  13-Aug-2013 13:25  n/a    31
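I have been trying to use SELECT queries with INTERVAL, but I am not sure whether I am approaching the problem the right way.
It would be great if someone could point me in the right direction.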
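Answer 0: (score: 1)
This will do it, although it is definitely not the most efficient kind of query. It is based on a 4-minute window before and after each row: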
INSERT INTO `max` (`datetime`, `max_f`, `max_p`)
SELECT `data`.datetime,
IF(COUNT(DISTINCT f.datetime) < 4, NULL, MAX(f.value)),
IF(COUNT(DISTINCT p.datetime) < 4, NULL, MAX(p.value))
FROM data
LEFT JOIN data f
ON f.datetime > data.datetime
AND f.datetime < DATE_ADD(data.datetime, INTERVAL 5 MINUTE)
LEFT JOIN data p
ON p.datetime < data.datetime
AND p.datetime > DATE_ADD(data.datetime, INTERVAL -5 MINUTE)
GROUP BY data.datetime
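Since the question asks for a Python script, here is a minimal sketch of how the same interval self-join could be driven from Python. It is an adaptation, not the query above verbatim: it uses the standard-library sqlite3 module with an in-memory database as a stand-in for MySQL, so IF(...) becomes CASE WHEN and DATE_ADD becomes SQLite's datetime() modifier. The function names build_max_table and demo are made up for illustration, and the sample rows assume exactly one row per minute.

```python
import sqlite3

# Sample values from the question, one row per minute starting at 13:15.
VALUES = [25, 23, 27, 26, 25, 21, 24, 28, 30, 31, 29]


def build_max_table(conn):
    """Fill "max" from data using 4-minute look-ahead / look-behind windows.

    SQLite adaptation of the MySQL query (assumption: SQLite stands in for MySQL).
    """
    conn.execute(
        """
        INSERT INTO "max" (datetime, max_f, max_p)
        SELECT d.datetime,
               CASE WHEN COUNT(DISTINCT f.datetime) < 4 THEN NULL ELSE MAX(f.value) END,
               CASE WHEN COUNT(DISTINCT p.datetime) < 4 THEN NULL ELSE MAX(p.value) END
        FROM data d
        LEFT JOIN data f
               ON f.datetime > d.datetime
              AND f.datetime < datetime(d.datetime, '+5 minutes')
        LEFT JOIN data p
               ON p.datetime < d.datetime
              AND p.datetime > datetime(d.datetime, '-5 minutes')
        GROUP BY d.datetime
        """
    )


def demo():
    """Load the sample data into an in-memory database and return (max_f, max_p) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE data (id INTEGER PRIMARY KEY, datetime TEXT, value INTEGER)'
    )
    conn.execute(
        'CREATE TABLE "max" (id INTEGER PRIMARY KEY, datetime TEXT, '
        'max_f INTEGER, max_p INTEGER)'
    )
    # ISO-formatted timestamps compare correctly as plain strings in SQLite.
    rows = [(f"2013-08-13 13:{15 + i:02d}:00", v) for i, v in enumerate(VALUES)]
    conn.executemany("INSERT INTO data (datetime, value) VALUES (?, ?)", rows)
    build_max_table(conn)
    result = conn.execute(
        'SELECT max_f, max_p FROM "max" ORDER BY datetime'
    ).fetchall()
    conn.close()
    return result
```

Calling demo() returns one (max_f, max_p) tuple per row, with None where the question's table shows n/a. Against a real MySQL server you would keep the original query text and run it through whichever MySQL driver you already use.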
Answer 1: (score: 0)
If your id values really are consecutive, you can do the following:
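select d.*,
       -- max_p: maximum of the 4 rows before the current one (NULL if fewer than 4 exist)
       (case when sum(dnear.id < d.id) = 4
             then max(case when dnear.id < d.id then dnear.value end)
        end) as max_p,
       -- max_f: maximum of the 4 rows after the current one (NULL if fewer than 4 exist)
       (case when sum(dnear.id > d.id) = 4
             then max(case when dnear.id > d.id then dnear.value end)
        end) as max_f
from data d left outer join
     data dnear
     on dnear.id between d.id - 4 and d.id + 4
group by d.id;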
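The same positional idea can be sketched in plain Python once the values have been fetched in id order. This is only an illustration under that assumption (consecutive ids, rows already sorted); window_maxima is a hypothetical helper name, not something from the question.

```python
def window_maxima(values, n=4):
    """Return a list of (max_f, max_p) pairs, one per input value.

    max_p is the maximum of the n values before the current one,
    max_f is the maximum of the n values after it.
    None is used when fewer than n neighbours exist on that side.
    """
    result = []
    for i in range(len(values)):
        before = values[max(0, i - n):i]   # up to n items before position i
        after = values[i + 1:i + 1 + n]    # up to n items after position i
        max_p = max(before) if len(before) == n else None
        max_f = max(after) if len(after) == n else None
        result.append((max_f, max_p))
    return result
```

For the item at 13:19 (the fifth row), window_maxima([25, 23, 27, 26, 25, 21, 24, 28, 30, 31, 29])[4] gives (30, 27), matching the example in the question.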
Answer 2: (score: 0)
Simple subqueries:
-- Windows cover the 4 minutes before/after each row (assumes one row per minute).
SELECT d.Id,
       d.datetime,
       (SELECT IF(COUNT(*) < 4, NULL, MAX(p.Value)) FROM Data p
         WHERE p.datetime < d.datetime AND p.datetime >= d.datetime - INTERVAL 4 MINUTE) AS MaxP,
       (SELECT IF(COUNT(*) < 4, NULL, MAX(f.Value)) FROM Data f
         WHERE f.datetime > d.datetime AND f.datetime <= d.datetime + INTERVAL 4 MINUTE) AS MaxF
FROM Data d
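Answer 3: (score: 0)
If the specification is to take the maximum from the four rows before and the four rows after, regardless of whether rows for particular minutes are missing, then this query will return the result set: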
SELECT d.id
, IF( ( SELECT 4 AS count_f
FROM `Data` f
WHERE f.datetime > d.datetime
ORDER BY f.datetime ASC
LIMIT 3,1
)
, GREATEST(
( SELECT f1.value FROM `Data` f1
WHERE f1.datetime > d.datetime
ORDER BY f1.datetime ASC LIMIT 0,1
)
, ( SELECT f2.value FROM `Data` f2
WHERE f2.datetime > d.datetime
ORDER BY f2.datetime ASC LIMIT 1,1
)
, ( SELECT f3.value FROM `Data` f3
WHERE f3.datetime > d.datetime
ORDER BY f3.datetime ASC LIMIT 2,1
)
, ( SELECT f4.value FROM `Data` f4
WHERE f4.datetime > d.datetime
ORDER BY f4.datetime ASC LIMIT 3,1
)
)
, 'n/a'
) AS max_f
, IF( ( SELECT 4 AS count_p
FROM `Data` p
WHERE p.datetime < d.datetime
ORDER BY p.datetime DESC
LIMIT 3,1
)
, GREATEST(
( SELECT p1.value FROM `Data` p1
WHERE p1.datetime < d.datetime
ORDER BY p1.datetime DESC LIMIT 0,1
)
, ( SELECT p2.value FROM `Data` p2
WHERE p2.datetime < d.datetime
ORDER BY p2.datetime DESC LIMIT 1,1
)
, ( SELECT p3.value FROM `Data` p3
WHERE p3.datetime < d.datetime
ORDER BY p3.datetime DESC LIMIT 2,1
)
, ( SELECT p4.value FROM `Data` p4
WHERE p4.datetime < d.datetime
ORDER BY p4.datetime DESC LIMIT 3,1
)
)
, 'n/a'
) AS max_p
-- , d.id
-- , d.datetime
-- , d.value
FROM `Data` d
ORDER BY d.id
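But... because of the correlated subqueries, this is going to be a very expensive ("light-dimming") query on large sets. Those subqueries will depend on a suitable index with datetime as the leading column.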
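If the correlated subqueries turn out to be too slow, one alternative (not part of the query above) is to read the rows once, ordered by datetime, and compute both columns in Python in a single pass per direction. The sketch below uses a monotonic deque for the sliding-window maximum; it assumes the values fit in memory and are already sorted by datetime, and the names trailing_max and max_f_and_max_p are made up for illustration. Like the query, it works on the nearest four rows regardless of gaps in the minutes.

```python
from collections import deque


def trailing_max(values, n=4):
    """out[i] is the max of the n items strictly before i, or None if fewer than n exist."""
    out = []
    window = deque()  # indices of candidates; their values are strictly decreasing
    for i, v in enumerate(values):
        # Drop indices that have slid out of the window [i - n, i - 1].
        while window and window[0] < i - n:
            window.popleft()
        # The front of the deque holds the largest value in the window.
        out.append(values[window[0]] if i >= n else None)
        # Add the current item, discarding older candidates it dominates.
        while window and values[window[-1]] <= v:
            window.pop()
        window.append(i)
    return out


def max_f_and_max_p(values, n=4):
    """Return (max_f, max_p) as two lists aligned with the input values."""
    max_p = trailing_max(values, n)
    # Looking ahead is the same as looking behind on the reversed sequence.
    max_f = trailing_max(values[::-1], n)[::-1]
    return max_f, max_p
```

Each value enters and leaves the deque at most once, so the work grows linearly with the number of rows instead of requiring several subquery lookups per row.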