Currently, I have a PostgreSQL database (and a SQL Server database with almost the same structure), with some data like the example below:
+----+---------+-----+
| ID | Name | Val |
+----+---------+-----+
| 01 | Point A | 0 |
| 02 | Point B | 050 |
| 03 | Point C | 075 |
| 04 | Point D | 100 |
| 05 | Point E | 200 |
| 06 | Point F | 220 |
| 07 | Point G | 310 |
| 08 | Point H | 350 |
| 09 | Point I | 420 |
| 10 | Point J | 550 |
+----+---------+-----+
ID = PK (auto increment);
Name = unique;
Val = unique;
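For reference, a table matching this description could be set up roughly as follows. This is only a sketch: the table name `points` is the one used by the queries in the answer, while the column types and the use of explicit ids (instead of an auto-increment column, whose syntax differs between PostgreSQL and SQL Server) are assumptions.

```sql
-- Assumed schema; types are guesses based on the sample data.
CREATE TABLE points (
    id   INTEGER     PRIMARY KEY,      -- auto increment in the real table
    name VARCHAR(50) NOT NULL UNIQUE,
    val  INTEGER     NOT NULL UNIQUE
);

-- Sample rows from the table above.
INSERT INTO points (id, name, val) VALUES
    (1,  'Point A',   0),
    (2,  'Point B',  50),
    (3,  'Point C',  75),
    (4,  'Point D', 100),
    (5,  'Point E', 200),
    (6,  'Point F', 220),
    (7,  'Point G', 310),
    (8,  'Point H', 350),
    (9,  'Point I', 420),
    (10, 'Point J', 550);
```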
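Now, suppose I only have Point F (220), and I want to find the lowest and the biggest values reachable from it such that the distance between each pair of neighbouring values is less than 100.
So, my result must return Point E (200) as the lowest value and Point I (420) as the biggest value.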
Step-by-step explanation (because English is not my main language):
Finding the lowest value:
Initial value = Point F (220);
Look for the lower closest value of Point F (220): Point E (200);
200(E) < 220(F) = True; 220(F) - 200(E) < 100 = True;
Lowest value until now = Point E (200)
Repeat
Look for the lower closest value of Point E (200): Point D (100);
100(D) < 200(E) = True; 200(E) - 100(D) < 100 = False;
Lowest value = Point E (200); Break;
Finding the biggest value:
Initial value = Point F (220);
Look for the biggest closest value of Point F (220): Point G (310);
310(G) > 220(F) = True; 310(G) - 220(F) < 100 = True;
Biggest value until now = Point G (310)
Repeat
Look for the biggest closest value of Point G (310): Point H (350);
350(H) > 310(G) = True; 350(H) - 310(G) < 100 = True;
Biggest value until now = Point H (350)
Repeat
Look for the biggest closest value of Point H (350): Point I (420);
420(I) > 350(H) = True; 420(I) - 350(H) < 100 = True;
Biggest value until now = Point I (420)
Repeat
Look for the biggest closest value of Point I (420): Point J (550);
550(J) > 420(I) = True; 550(J) - 420(I) < 100 = False;
Biggest value = Point I (420); Break;
Answer 0 (score: 2)
This can be done using window functions, with some work.
Going step by step, you can start from a table (let's call it point_and_prev_next) defined by this SELECT:
SELECT
id, name, val,
lag(val) OVER(ORDER BY id) AS prev_val,
lead(val) OVER(ORDER BY id) AS next_val
FROM
points
which produces:
| id | name | val | prev_val | next_val |
|----|---------|-----|----------|----------|
| 1 | Point A | 0 | (null) | 50 |
| 2 | Point B | 50 | 0 | 75 |
| 3 | Point C | 75 | 50 | 100 |
| 4 | Point D | 100 | 75 | 200 |
| 5 | Point E | 200 | 100 | 220 |
| 6 | Point F | 220 | 200 | 310 |
| 7 | Point G | 310 | 220 | 350 |
| 8 | Point H | 350 | 310 | 420 |
| 9 | Point I | 420 | 350 | 550 |
| 10 | Point J | 550 | 420 | (null) |
The lag and lead window functions fetch the previous and the next value from the table (ordered by id, and without any partitioning).
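Next, we build a second table, point_and_dist_prev_next, which uses val, prev_val and next_val to compute the distance to the previous point and the distance to the next point. It is computed with the following SELECT: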
SELECT
id, name, val, (val-prev_val) AS dist_to_prev, (next_val-val) AS dist_to_next
FROM
point_and_prev_next
This is the result after executing it:
| id | name | val | dist_to_prev | dist_to_next |
|----|---------|-----|--------------|--------------|
| 1 | Point A | 0 | (null) | 50 |
| 2 | Point B | 50 | 50 | 25 |
| 3 | Point C | 75 | 25 | 25 |
| 4 | Point D | 100 | 25 | 100 |
| 5 | Point E | 200 | 100 | 20 |
| 6 | Point F | 220 | 20 | 90 |
| 7 | Point G | 310 | 90 | 40 |
| 8 | Point H | 350 | 40 | 70 |
| 9 | Point I | 420 | 70 | 130 |
| 10 | Point J | 550 | 130 | (null) |
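At this point (starting from point "F"), we can get the first "wrong point" going up (the closest point at or before "F" that fails the test "distance to previous" < 100) with the following query: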
SELECT
max(id) AS first_wrong_up
FROM
point_and_dist_prev_next
WHERE
dist_to_prev >= 100
AND id <= 6 -- 6 = Point F
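This just looks for the point closest to our reference point ("F") that FAILS the condition "distance to the previous point < 100". Because the chain of small gaps stops at that point, it is also the lowest value we are looking for.
The result is: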
| first_wrong_up |
|----------------|
| 5 |
The first "wrong point" going down is computed in the same way, using dist_to_next and min(id) instead.
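As a sketch of that mirrored step, the query below computes the distances inline (so it does not depend on point_and_dist_prev_next existing as a real table) and then picks the closest point at or after "F" whose distance to the next point is not below 100. It assumes the points table and sample data from the question.

```sql
WITH point_and_dist_prev_next AS
(
    SELECT
        id, name, val,
        val - lag(val) OVER(ORDER BY id) AS dist_to_prev,
        lead(val) OVER(ORDER BY id) - val AS dist_to_next
    FROM
        points
)
SELECT
    min(id) AS first_wrong_down
FROM
    point_and_dist_prev_next
WHERE
    dist_to_next >= 100
    AND id >= 6   -- 6 = Point F
;
```

With the sample data this should return 9 (Point I), since Point I is the first point from "F" onwards whose next neighbour is 100 or more away.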
All these queries can be put together using Common Table Expressions (also known as WITH queries), and you get:
WITH point_and_dist_prev_next AS
(
SELECT
id, name, val,
val - lag(val) OVER(ORDER BY id) AS dist_to_prev,
lead(val) OVER(ORDER BY id)- val AS dist_to_next
FROM
points
),
first_wrong_up AS
(
SELECT
max(id) AS first_wrong_up
FROM
point_and_dist_prev_next
WHERE
dist_to_prev >= 100
AND id <= 6 -- 6 = Point F
),
first_wrong_down AS
(
SELECT
min(id) AS first_wrong_down
FROM
point_and_dist_prev_next
WHERE
dist_to_next >= 100
AND id >= 6 -- 6 = Point F
)
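SELECT
-- Aliases are placed outside the subqueries so that the output columns
-- are named the same way on both PostgreSQL and SQL Server.
(SELECT name
FROM first_wrong_up
JOIN points ON id = first_wrong_up) AS "lowest value",
(SELECT name
FROM first_wrong_down
JOIN points ON id = first_wrong_down) AS "biggest value" ;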
which gives the following result:
| lowest value | biggest value |
|--------------|---------------|
| Point E | Point I |
You can check it at SQLFiddle.
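One edge case is worth noting: if the chain of small gaps reaches the first or the last row of the table, dist_to_prev (or dist_to_next) is NULL there, no row passes the >= 100 filter, and the corresponding result comes back as NULL. A possible way around it, shown here only as a sketch, is to treat a missing neighbour as a break as well. The example starts from Point B (id 2) and uses hypothetical column aliases lowest_value and biggest_value.

```sql
WITH point_and_dist_prev_next AS
(
    SELECT
        id, name, val,
        val - lag(val) OVER(ORDER BY id) AS dist_to_prev,
        lead(val) OVER(ORDER BY id) - val AS dist_to_next
    FROM
        points
)
SELECT
    (SELECT name
       FROM points
      WHERE id = (SELECT max(id)
                    FROM point_and_dist_prev_next
                   -- NULL means "no previous point": also a stopping point
                   WHERE (dist_to_prev >= 100 OR dist_to_prev IS NULL)
                     AND id <= 2)) AS lowest_value,   -- 2 = Point B
    (SELECT name
       FROM points
      WHERE id = (SELECT min(id)
                    FROM point_and_dist_prev_next
                   -- NULL means "no next point": also a stopping point
                   WHERE (dist_to_next >= 100 OR dist_to_next IS NULL)
                     AND id >= 2)) AS biggest_value
;
```

With the sample data this should give Point A as the lowest value and Point D as the biggest value, which matches walking the steps from the question by hand starting at Point B.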
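Note: this assumes that val always increases together with the id column. If it does not, you must order the window functions by the val column instead (and compare against val rather than id in the WHERE clauses).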
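If ordering by val is needed, one alternative worth considering is the "gaps and islands" technique, which is a different approach from the queries above: flag every row that starts a new group (no previous row, or a gap of 100 or more), turn the flags into a group number with a running sum, and then take the minimum and maximum of the group containing the starting point. The sketch below assumes the same points table; the names flagged, grouped, starts_group and grp are made up for illustration, and the starting point is looked up by name instead of by id.

```sql
WITH flagged AS
(
    SELECT
        name, val,
        -- 1 when this row starts a new group: either there is no previous
        -- row (the comparison with NULL is not true) or the gap is >= 100
        CASE WHEN val - lag(val) OVER(ORDER BY val) < 100
             THEN 0 ELSE 1 END AS starts_group
    FROM
        points
),
grouped AS
(
    SELECT
        name, val,
        -- running count of group starts = group number of each row
        sum(starts_group) OVER(ORDER BY val ROWS UNBOUNDED PRECEDING) AS grp
    FROM
        flagged
)
SELECT
    min(val) AS lowest_val,
    max(val) AS biggest_val
FROM
    grouped
WHERE
    grp = (SELECT grp FROM grouped WHERE name = 'Point F')
;
```

With the sample data, Points E to I end up in the same group, so this should return 200 and 420. It returns values rather than names; joining back to points on val (which is unique) would recover the names. A side benefit of this variant is that the first and last rows need no special handling.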