How to find the lowest and biggest values within a maximum distance between points (SQL)

Time: 2017-02-02 16:14:51

Tags: postgresql

I currently have a PostgreSQL database (and a SQL Server database with almost the same structure) with some data, as in the example below:

+----+---------+-----+
| ID | Name    | Val |
+----+---------+-----+
| 01 | Point A |   0 |
| 02 | Point B | 050 |
| 03 | Point C | 075 |
| 04 | Point D | 100 |
| 05 | Point E | 200 |
| 06 | Point F | 220 |
| 07 | Point G | 310 |
| 08 | Point H | 350 |
| 09 | Point I | 420 |
| 10 | Point J | 550 |
+----+---------+-----+

ID = PK (auto increment);
Name = unique;
Val = unique;
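
To experiment with this data, the table could be created roughly as follows. This is only a minimal sketch for PostgreSQL: the table name points is the one used by the queries in the answer, while the column types are assumptions that the question does not state.

```sql
-- Hypothetical schema: the column types are assumptions.
-- Only the names points, id, name and val come from the queries in the answer.
CREATE TABLE points (
    id   serial  PRIMARY KEY,       -- auto-increment primary key
    name text    NOT NULL UNIQUE,
    val  integer NOT NULL UNIQUE
);

-- Rows are inserted in ascending val order, so id and val grow together.
INSERT INTO points (name, val) VALUES
    ('Point A',   0),
    ('Point B',  50),
    ('Point C',  75),
    ('Point D', 100),
    ('Point E', 200),
    ('Point F', 220),
    ('Point G', 310),
    ('Point H', 350),
    ('Point I', 420),
    ('Point J', 550);
```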

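Now, suppose I only have Point F (220), and I want to find the lowest and the biggest values for which the distance between consecutive points stays below 100.

So, my result must return:

  • Lowest: Point E (200)
  • Biggest: Point I (420)

Step-by-step explanation (since English is not my primary language):

  • Finding the lowest value: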

    Initial value = Point F (220);
    Look for the lower closest value of Point F (220): Point E (200);
    200(E) < 220(F) = True; 220(F) - 200(E) < 100 = True;
    Lowest value until now = Point E (200)
    
    Repeat
    
    Look for the lower closest value of Point E (200): Point D (100);
    100(D) < 200(E) = True; 200(E) - 100(D) < 100 = False;
    Lowest value = Point E (200); Break;
    
  • Finding the biggest value:

    Initial value = Point F (220);
    Look for the biggest closest value of Point F (220): Point G (310);
    310(G) > 220(F) = True; 310(G) - 220(F) < 100 = True;
    Biggest value until now = Point G (310)
    
    Repeat
    
    Look for the biggest closest value of Point G (310): Point H (350);
    350(H) > 310(G) = True; 350(H) - 310(G) < 100 = True;
    Biggest value until now = Point H (350)
    
    Repeat
    
    Look for the biggest closest value of Point H (350): Point I (420);
    420(I) > 350(H) = True; 420(I) - 350(H) < 100 = True;
    Biggest value until now = Point I (420)
    
    Repeat
    
    Look for the biggest closest value of Point I (420): Point J (550);
    550(J) > 420(I) = True; 550(J) - 420(I) < 100 = False;
    Biggest value Point I (420); Break;
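
The two loops above can be written almost literally as a recursive query. The following is a sketch for PostgreSQL, not a tested production query: it assumes a table named points with the columns id, name and val, relies on val being unique, and the CTE names neighbours, down and up are made up for illustration.

```sql
WITH RECURSIVE neighbours AS (
    -- For every point, remember the closest lower and closest higher value.
    SELECT name, val,
           lag(val)  OVER (ORDER BY val) AS prev_val,
           lead(val) OVER (ORDER BY val) AS next_val
    FROM points
),
down AS (
    -- Initial value: the reference point.
    SELECT name, val, prev_val FROM neighbours WHERE name = 'Point F'
    UNION ALL
    -- Repeat: step to the closest lower point while the step is < 100.
    SELECT n.name, n.val, n.prev_val
    FROM down d
    JOIN neighbours n ON n.val = d.prev_val
    WHERE d.val - n.val < 100
),
up AS (
    SELECT name, val, next_val FROM neighbours WHERE name = 'Point F'
    UNION ALL
    -- Repeat: step to the closest higher point while the step is < 100.
    SELECT n.name, n.val, n.next_val
    FROM up u
    JOIN neighbours n ON n.val = u.next_val
    WHERE n.val - u.val < 100
)
SELECT
    (SELECT name FROM down ORDER BY val ASC  LIMIT 1) AS "lowest value",
    (SELECT name FROM up   ORDER BY val DESC LIMIT 1) AS "biggest value";
```

With the sample data, the down walk visits F and E and stops at D (200 - 100 is not below 100), and the up walk visits F, G, H and I and stops at J, so the query should return Point E and Point I. The recursion ends by itself when it reaches the first or last row, because prev_val or next_val is NULL there and the join finds no match.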
    

1 answer:

Answer 0: (score: 2)

This can be done using window functions, with some work.

Going step by step, you can start from a table (let's call it point_and_prev_next) defined by this select:

SELECT
    id, name, val, 
    lag(val) OVER(ORDER BY id) AS prev_val, 
    lead(val) OVER(ORDER BY id) AS next_val
FROM
    points 

Which produces:

| id |    name | val | prev_val | next_val |
|----|---------|-----|----------|----------|
|  1 | Point A |   0 |   (null) |       50 |
|  2 | Point B |  50 |        0 |       75 |
|  3 | Point C |  75 |       50 |      100 |
|  4 | Point D | 100 |       75 |      200 |
|  5 | Point E | 200 |      100 |      220 |
|  6 | Point F | 220 |      200 |      310 |
|  7 | Point G | 310 |      220 |      350 |
|  8 | Point H | 350 |      310 |      420 |
|  9 | Point I | 420 |      350 |      550 |
| 10 | Point J | 550 |      420 |   (null) |

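The lag and lead window functions are used to get the previous and the next values from the table (ordered by id, and not partitioned by anything).

Next, we make a second table, point_and_dist_prev_next, which uses val, prev_val and next_val to compute the distance to the previous point and the distance to the next point. It would be computed with the following SELECT: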

SELECT
    id, name, val, (val-prev_val) AS dist_to_prev, (next_val-val) AS dist_to_next
FROM
    point_and_prev_next

This is what you get after executing it:

| id |    name | val | dist_to_prev | dist_to_next |
|----|---------|-----|--------------|--------------|
|  1 | Point A |   0 |       (null) |           50 |
|  2 | Point B |  50 |           50 |           25 |
|  3 | Point C |  75 |           25 |           25 |
|  4 | Point D | 100 |           25 |          100 |
|  5 | Point E | 200 |          100 |           20 |
|  6 | Point F | 220 |           20 |           90 |
|  7 | Point G | 310 |           90 |           40 |
|  8 | Point H | 350 |           40 |           70 |
|  9 | Point I | 420 |           70 |          130 |
| 10 | Point J | 550 |          130 |       (null) |

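And, at this point (starting from point "F"), we can get the first "wrong point" going up (the first one that fails "distance to previous" < 100) by means of the following query: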

SELECT
      max(id) AS first_wrong_up
FROM
    point_and_dist_prev_next
WHERE
    dist_to_prev >= 100
    AND id <= 6     -- 6 = Point F

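This just looks for the point closest to our reference one ("F") that FAILS to have a distance to the previous point < 100. Since only the step below that point is too large, the point itself is still within reach of "F" and is the lowest value we are looking for.

The result is: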

| first_wrong_up |
|----------------|
|              5 |

The first "wrong point" going down is computed in the same manner.

All these queries can be put together using Common Table Expressions (also called WITH queries), and you get:

WITH point_and_dist_prev_next AS
(
    SELECT
        id, name, val, 
        val - lag(val) OVER(ORDER BY id) AS dist_to_prev, 
        lead(val) OVER(ORDER BY id)- val AS dist_to_next
    FROM
        points 
),
first_wrong_up AS
(
    SELECT
        max(id) AS first_wrong_up
    FROM
        point_and_dist_prev_next
    WHERE
        dist_to_prev >= 100
        AND id <= 6     -- 6 = Point F
),
first_wrong_down AS
(
    SELECT
        min(id) AS first_wrong_down
    FROM
        point_and_dist_prev_next
    WHERE
        dist_to_next >= 100
        AND id >= 6     -- 6 = Point F
)
SELECT
    (SELECT name AS "lowest value"
       FROM first_wrong_up
       JOIN points ON id = first_wrong_up),
    (SELECT name AS "biggest value"
       FROM first_wrong_down
       JOIN points ON id = first_wrong_down) ;

Which gives the following result:

| lowest value | biggest value |
|--------------|---------------|
|      Point E |       Point I |
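
One edge case may be worth guarding against: if no gap of 100 or more exists on one side of the reference point, max(id) or min(id) finds no row and the corresponding column comes back as NULL. A possible variant, sketched here under the same assumptions as the query above (table points, reference id 6), falls back to the first or last row of the table in that case. The names bounds, lowest_id and biggest_id are introduced only for this illustration.

```sql
WITH point_and_dist_prev_next AS (
    SELECT id,
           val - lag(val)  OVER (ORDER BY id) AS dist_to_prev,
           lead(val) OVER (ORDER BY id) - val AS dist_to_next
    FROM points
),
bounds AS (
    -- A single pass over all rows. The CASE expressions pick the "wrong
    -- points"; when there is none, COALESCE falls back to the first
    -- (min(id)) or last (max(id)) row of the whole table.
    SELECT
        COALESCE(max(CASE WHEN dist_to_prev >= 100 AND id <= 6 THEN id END),
                 min(id)) AS lowest_id,     -- 6 = Point F
        COALESCE(min(CASE WHEN dist_to_next >= 100 AND id >= 6 THEN id END),
                 max(id)) AS biggest_id     -- 6 = Point F
    FROM point_and_dist_prev_next
)
SELECT lo.name AS "lowest value", hi.name AS "biggest value"
FROM bounds b
JOIN points lo ON lo.id = b.lowest_id
JOIN points hi ON hi.id = b.biggest_id;
```

For id 6 this should still give Point E and Point I. Replacing both occurrences of 6 with 2 (Point B) exercises the fallback: no gap of 100 or more lies below B, so the lowest value becomes Point A, while the biggest value is Point D.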

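You can check it at SQLFiddle.

NOTE: This assumes that the id column is always increasing. If it is not, the val column has to be used instead (obviously, assuming it always keeps growing).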
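
As a sketch of what that could look like, the variant below orders by val instead of id and looks the reference point up by name rather than by a hard-coded id. It is written for PostgreSQL (LIMIT would have to become TOP on SQL Server), and the CTE names ref and dist are assumptions made for this example.

```sql
WITH ref AS (
    -- The reference point, found by name instead of by id.
    SELECT val AS ref_val FROM points WHERE name = 'Point F'
),
dist AS (
    -- Same distances as before, but neighbours are defined by val order.
    SELECT name, val,
           val - lag(val)  OVER (ORDER BY val) AS dist_to_prev,
           lead(val) OVER (ORDER BY val) - val AS dist_to_next
    FROM points
)
SELECT
    (SELECT d.name
       FROM dist d CROSS JOIN ref r
      WHERE d.dist_to_prev >= 100 AND d.val <= r.ref_val
      ORDER BY d.val DESC
      LIMIT 1) AS "lowest value",
    (SELECT d.name
       FROM dist d CROSS JOIN ref r
      WHERE d.dist_to_next >= 100 AND d.val >= r.ref_val
      ORDER BY d.val ASC
      LIMIT 1) AS "biggest value";
```

Like the query above, it returns NULL for a side on which no gap of 100 or more exists.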