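I am trying to write a query whose result should skip the first and the last row. To do that, I followed a hint to use window functions, which led me to the query below: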
SELECT lag(timestamp_min) OVER (ORDER BY timestamp_min) AS timestamp_min,
       lag(type) OVER (ORDER BY timestamp_min) AS type,
       lag(sum_first_medium) OVER (ORDER BY timestamp_min)
FROM (SELECT to_timestamp(
                 floor(
                     (extract('epoch' FROM TIMESTAMP) / 300)
                 ) * 300
             ) AS timestamp_min,
             type,
             floor(sum(medium[1])) AS sum_first_medium
      FROM default_dataset
      WHERE type = 'ap_clients.wlan0'
        AND timestamp > current_timestamp - INTERVAL '85 minutes'
        AND organization_id = '9fc02db4-c3df-4890-93ac-8dd575ca5638'
      GROUP BY timestamp_min, type) lagme
OFFSET 2;
The problem is that this query returns nothing:
ws_controller_hist=> SELECT lag(timestamp_min) OVER (ORDER BY timestamp_min) AS timestamp_min, lag(type) OVER (ORDER BY timestamp_min) AS type, lag(sum_first_medium) OVER (ORDER BY timestamp_min) FROM (SELECT to_timestamp(floor((extract('epoch' FROM TIMESTAMP) / 300)) * 300) AS timestamp_min, type, floor(sum(medium[1])) AS sum_first_medium FROM default_dataset WHERE type = 'ap_clients.wlan0' AND timestamp > current_timestamp - INTERVAL '85 minutes' AND organization_id = '9fc02db4-c3df-4890-93ac-8dd575ca5638' GROUP BY timestamp_min, type) lagme OFFSET 2;
timestamp_min | type | lag
---------------+------+-----
(0 rows)
But I do have data of type 'ap_clients.wlan0':
ws_controller_hist=> select * from default_dataset where type ='ap_clients.wlan0' order by timestamp desc limit 3;
id | timestamp | agregation_period | medium | maximum | minimum | sum | type | device_id | network_id | organiza
tion_id | labels
--------------------------------------+------------------------+-------------------+--------+---------+---------+-----+------------------+--------------------------------------+------------+-------------------
-------------------+----------------
b3661dca-a459-43cd-a3c4-7609e36c18d5 | 2018-01-02 10:21:08+00 | 300 | {0} | {0} | {0} | {0} | ap_clients.wlan0 | 9f3f6261-a2c3-45cd-9dc4-f9523ace0b50 | | ce4b69af-bdce-4f1b
-ba71-dd03544205d5 | {time,clients}
abbca52d-f3f5-4a99-bd2f-41602964506e | 2018-01-02 10:16:08+00 | 300 | {0} | {0} | {0} | {0} | ap_clients.wlan0 | 9f3f6261-a2c3-45cd-9dc4-f9523ace0b50 | | ce4b69af-bdce-4f1b
-ba71-dd03544205d5 | {time,clients}
24e00926-bc6d-4025-8a6c-a8de9efacdad | 2018-01-02 10:11:08+00 | 300 | {0} | {0} | {0} | {0} | ap_clients.wlan0 | 9f3f6261-a2c3-45cd-9dc4-f9523ace0b50 | | ce4b69af-bdce-4f1b
-ba71-dd03544205d5 | {time,clients}
(3 rows)
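I need a query that retrieves the sum of all the medium values over the last hour, grouped into 5-minute intervals.
My first attempt at solving this was to skip the first record with OFFSET 1 and, to skip the last one, to exclude it through a subquery on my id field ordered by timestamp DESC.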
ws_controller_hist=>
SELECT to_timestamp(floor((extract('epoch' FROM TIMESTAMP) / 300)) * 300)
AS timestamp_min,
TYPE,
floor(sum(medium[1]))
FROM default_dataset
WHERE TYPE LIKE 'ap_clients.wlan0'
AND TIMESTAMP > CURRENT_TIMESTAMP - interval '85 minutes'
AND organization_id = '9fc02db4-c3df-4890-93ac-8dd575ca5638'
AND id NOT IN
(SELECT id
FROM default_dataset
ORDER BY TIMESTAMP DESC
LIMIT 1)
GROUP BY timestamp_min,
TYPE
ORDER BY timestamp_min ASC
OFFSET 1;
timestamp_min | type | floor
------------------------+------------------+-------
2017-12-19 14:20:00+00 | ap_clients.wlan0 | 38
2017-12-19 14:25:00+00 | ap_clients.wlan0 | 37
2017-12-19 14:30:00+00 | ap_clients.wlan0 | 39
2017-12-19 14:35:00+00 | ap_clients.wlan0 | 42
2017-12-19 14:40:00+00 | ap_clients.wlan0 | 43
2017-12-19 14:45:00+00 | ap_clients.wlan0 | 44
2017-12-19 14:50:00+00 | ap_clients.wlan0 | 45
2017-12-19 14:55:00+00 | ap_clients.wlan0 | 45
2017-12-19 15:00:00+00 | ap_clients.wlan0 | 43
2017-12-19 15:05:00+00 | ap_clients.wlan0 | 43
2017-12-19 15:10:00+00 | ap_clients.wlan0 | 50
2017-12-19 15:15:00+00 | ap_clients.wlan0 | 52
2017-12-19 15:20:00+00 | ap_clients.wlan0 | 50
2017-12-19 15:25:00+00 | ap_clients.wlan0 | 53
2017-12-19 15:30:00+00 | ap_clients.wlan0 | 49
2017-12-19 15:35:00+00 | ap_clients.wlan0 | 39
2017-12-19 15:40:00+00 | ap_clients.wlan0 | 16
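But this query does not skip the last record: I get exactly the same rows as I do without the subquery "AND id NOT IN (SELECT id FROM default_dataset ORDER BY timestamp DESC LIMIT 1)".
If I query the rows of type 'ap_clients.wlan0' directly, I get: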
ws_controller_hist=> select * from default_dataset where organization_id='ce4b69af-bdce-4f1b-ba71-dd03544205d5' and type ='ap_clients.wlan0' order by timestamp desc limit 5;
id | timestamp | agregation_period | medium | maximum | minimum | sum | type | device_id | network_id | organiza
tion_id | labels
--------------------------------------+------------------------+-------------------+--------+---------+---------+-----+------------------+--------------------------------------+------------+-------------------
-------------------+----------------
b3661dca-a459-43cd-a3c4-7609e36c18d5 | 2018-01-02 10:21:08+00 | 300 | {0} | {0} | {0} | {0} | ap_clients.wlan0 | 9f3f6261-a2c3-45cd-9dc4-f9523ace0b50 | | ce4b69af-bdce-4f1b
-ba71-dd03544205d5 | {time,clients}
abbca52d-f3f5-4a99-bd2f-41602964506e | 2018-01-02 10:16:08+00 | 300 | {0} | {0} | {0} | {0} | ap_clients.wlan0 | 9f3f6261-a2c3-45cd-9dc4-f9523ace0b50 | | ce4b69af-bdce-4f1b
-ba71-dd03544205d5 | {time,clients}
24e00926-bc6d-4025-8a6c-a8de9efacdad | 2018-01-02 10:11:08+00 | 300 | {0} | {0} | {0} | {0} | ap_clients.wlan0 | 9f3f6261-a2c3-45cd-9dc4-f9523ace0b50 | | ce4b69af-bdce-4f1b
-ba71-dd03544205d5 | {time,clients}
e67baf28-6d5b-43a5-85e2-fcf2d04a0b2e | 2018-01-02 10:06:08+00 | 300 | {0} | {0} | {0} | {0} | ap_clients.wlan0 | 9f3f6261-a2c3-45cd-9dc4-f9523ace0b50 | | ce4b69af-bdce-4f1b
-ba71-dd03544205d5 | {time,clients}
c7ce16ce-9cda-423f-b32b-f4d6dce859e6 | 2018-01-02 10:01:08+00 | 300 | {0} | {0} | {0} | {0} | ap_clients.wlan0 | 9f3f6261-a2c3-45cd-9dc4-f9523ace0b50 | | ce4b69af-bdce-4f1b
-ba71-dd03544205d5 | {time,clients}
What should I do?
Answer 0 (score: 1)
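A simple solution is to use the lag and lead window functions with an argument that can never be NULL. That way lag returns NULL only for the first row and lead returns NULL only for the last row, so you can simply filter for the rows where both are NOT NULL: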
SELECT
    t2.timestamp_min,
    t2.type,
    t2.sum_first_medium
FROM (
    SELECT
        t1.*,
        lead(1) OVER(ORDER BY t1.timestamp_min) AS lead,
        lag(1) OVER(ORDER BY t1.timestamp_min) AS lag
    FROM (
        SELECT
            to_timestamp(
                floor(
                    (extract('epoch' FROM TIMESTAMP) / 300)
                ) * 300
            ) AS timestamp_min,
            type,
            floor(sum(medium[1])) AS sum_first_medium
        FROM default_dataset
        WHERE
            type = 'ap_clients.wlan0'
            AND timestamp > current_timestamp - INTERVAL '85 minutes'
            AND organization_id = '9fc02db4-c3df-4890-93ac-8dd575ca5638'
        GROUP BY timestamp_min, type
    ) t1
) t2
WHERE
    t2.lag IS NOT NULL -- Only first row will return NULL, skip it
    AND t2.lead IS NOT NULL -- Only last row will return NULL, skip it
ORDER BY t2.timestamp_min;
Note that I used lead(1) and lag(1) only because 1 is a non-NULL expression. You can use any non-NULL expression, or even a column, as long as it is guaranteed to be NOT NULL.
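To see the lag/lead filter in isolation, here is a minimal sketch for PostgreSQL that runs without the real table. The buckets CTE and the has_prev / has_next aliases are hypothetical names made up for this illustration; the five sample rows are copied from the output shown in the question.

```sql
-- Hypothetical stand-in for the aggregated 5-minute buckets.
WITH buckets (timestamp_min, sum_first_medium) AS (
    VALUES
        (TIMESTAMPTZ '2017-12-19 14:20:00+00', 38),
        (TIMESTAMPTZ '2017-12-19 14:25:00+00', 37),
        (TIMESTAMPTZ '2017-12-19 14:30:00+00', 39),
        (TIMESTAMPTZ '2017-12-19 14:35:00+00', 42),
        (TIMESTAMPTZ '2017-12-19 14:40:00+00', 43)
),
flagged AS (
    SELECT
        b.*,
        -- NULL only on the first row: it has no predecessor.
        lag(1)  OVER (ORDER BY b.timestamp_min) AS has_prev,
        -- NULL only on the last row: it has no successor.
        lead(1) OVER (ORDER BY b.timestamp_min) AS has_next
    FROM buckets b
)
SELECT timestamp_min, sum_first_medium
FROM flagged
WHERE has_prev IS NOT NULL
  AND has_next IS NOT NULL
ORDER BY timestamp_min;
```

With this sample data the 14:20 and 14:40 buckets are dropped and the three middle buckets remain. Both window functions share the same ORDER BY, which is why a single sort is enough.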
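Another possible solution is to apply two row_number() calls, one with ORDER BY timestamp_min ASC and the other with ORDER BY timestamp_min DESC, and then keep only the rows where both are <> 1. But that requires sorting the data set twice (once for ASC and once for DESC), whereas the lag/lead solution needs only one sort (although it may be harder to understand).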
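The row_number() alternative could look roughly like the sketch below. It reuses the inner aggregation from the query above unchanged; the aliases rn_asc and rn_desc are names assumed for this example, and the query has not been run against the asker's data.

```sql
SELECT
    t2.timestamp_min,
    t2.type,
    t2.sum_first_medium
FROM (
    SELECT
        t1.*,
        -- 1 for the earliest bucket
        row_number() OVER (ORDER BY t1.timestamp_min ASC)  AS rn_asc,
        -- 1 for the latest bucket
        row_number() OVER (ORDER BY t1.timestamp_min DESC) AS rn_desc
    FROM (
        SELECT
            to_timestamp(
                floor(
                    (extract('epoch' FROM timestamp) / 300)
                ) * 300
            ) AS timestamp_min,
            type,
            floor(sum(medium[1])) AS sum_first_medium
        FROM default_dataset
        WHERE
            type = 'ap_clients.wlan0'
            AND timestamp > current_timestamp - INTERVAL '85 minutes'
            AND organization_id = '9fc02db4-c3df-4890-93ac-8dd575ca5638'
        GROUP BY timestamp_min, type
    ) t1
) t2
WHERE
    t2.rn_asc <> 1      -- skip the first row
    AND t2.rn_desc <> 1 -- skip the last row
ORDER BY t2.timestamp_min;
```

If the aggregation yields only one or two buckets, both variants return no rows, since every row is then either the first or the last.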