How do I efficiently find the bounding box of a collection of points?

Time: 2017-09-21 04:26:55

Tags: python profiling bounding-box

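I have several points stored in an array. I need to find the bounds of those points, i.e. the rectangle that encloses all of them. I know how to solve this in plain Python.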

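I would like to know whether there is a better way to solve the problem than naive max and min over the array, or whether there is a built-in method for it.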

points = [[1, 3], [2, 4], [4, 1], [3, 3], [1, 6]]
b = bounds(points) # the function I am looking for
# now b = [[1, 1], [4, 6]]

2 answers:

Answer 0: (score: 9)

My approach to getting performance is to push things down to the C level whenever possible:

def bounding_box(points):
    x_coordinates, y_coordinates = zip(*points)

    return [(min(x_coordinates), min(y_coordinates)), (max(x_coordinates), max(y_coordinates))]

By my (crude) measurement, this runs about 1.5 times faster than @ReblochonMasque's bounding_box_naive(). And it is clearly more elegant. ;-)
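As a quick check, the zip-based function can be run on the points from the question. The sketch below also adds a guard for empty input. That guard is an assumption rather than part of the answer (the original function fails while unpacking an empty zip), and the name bounding_box_checked is hypothetical.

```python
def bounding_box_checked(points):
    """Return [(min_x, min_y), (max_x, max_y)] for a list of 2-D points."""
    if not points:
        # Assumption: empty input is treated as an error; the answer does not say.
        raise ValueError("need at least one point")
    # zip(*points) transposes the points into a tuple of x values and a tuple of y values.
    x_coordinates, y_coordinates = zip(*points)
    return [(min(x_coordinates), min(y_coordinates)),
            (max(x_coordinates), max(y_coordinates))]


points = [[1, 3], [2, 4], [4, 1], [3, 3], [1, 6]]
print(bounding_box_checked(points))  # [(1, 1), (4, 6)]
```

Note that the result holds tuples, while the question's expected output uses nested lists; the values are the same.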

Answer 1: (score: 1)

You cannot do better than O(n), because you must traverse all the points to determine the max and min of x and y.

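You can, however, reduce the constant factor by traversing the list only once. It is unclear whether that gives a better execution time, and if it does, it would only be for large collections of points.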

  

[EDIT]: in fact it does not; the "naive" approach is the more efficient one.

Here is the "naive" approach (it is the faster of the two):

def bounding_box_naive(points):
    """returns a list containing the bottom left and the top right
    points in the sequence
    Here, we use min and max four times over the collection of points
    """
    bot_left_x = min(point[0] for point in points)
    bot_left_y = min(point[1] for point in points)
    top_right_x = max(point[0] for point in points)
    top_right_y = max(point[1] for point in points)

    return [(bot_left_x, bot_left_y), (top_right_x, top_right_y)]

and the (maybe?) less naive one:

def bounding_box(points):
    """returns a list containing the bottom left and the top right
    points in the sequence
    Here, we traverse the collection of points only once,
    to find the min and max for x and y
    """
    bot_left_x, bot_left_y = float('inf'), float('inf')
    top_right_x, top_right_y = float('-inf'), float('-inf')
    for x, y in points:
        bot_left_x = min(bot_left_x, x)
        bot_left_y = min(bot_left_y, y)
        top_right_x = max(top_right_x, x)
        top_right_y = max(top_right_y, y)

    return [(bot_left_x, bot_left_y), (top_right_x, top_right_y)]

Profiling results:

import random
# 1,000,000 points as written; presumably the count was varied for each size below
points = [(random.randrange(-1000, 1000), random.randrange(-1000, 1000))  for _ in range(1000000)]

%timeit bounding_box_naive(points)
%timeit bounding_box(points)

size = 1,000 points

1000 loops, best of 3: 573 µs per loop
1000 loops, best of 3: 1.46 ms per loop

size = 10,000 points

100 loops, best of 3: 5.7 ms per loop
100 loops, best of 3: 14.7 ms per loop

size = 100,000 points

10 loops, best of 3: 66.8 ms per loop
10 loops, best of 3: 141 ms per loop

size = 1,000,000 points

1 loop, best of 3: 664 ms per loop
1 loop, best of 3: 1.47 s per loop

Clearly, the first approach (the "naive" one, not the "less naive" one) is faster, by a factor of roughly 2 to 2.5 in these measurements.
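The %timeit lines only work inside IPython. For anyone who wants to repeat the comparison from a plain script, here is a minimal sketch using the standard timeit module. The helper names make_points and compare, the renamed bounding_box_single_pass and bounding_box_zip, the fixed seed, and the chosen sizes are all assumptions for illustration. Timings depend on the machine and Python version, so no expected numbers are given.

```python
import random
import timeit


def bounding_box_naive(points):
    """Four separate min/max passes over the points."""
    return [(min(p[0] for p in points), min(p[1] for p in points)),
            (max(p[0] for p in points), max(p[1] for p in points))]


def bounding_box_single_pass(points):
    """One explicit loop, calling min/max for every point."""
    min_x = min_y = float('inf')
    max_x = max_y = float('-inf')
    for x, y in points:
        min_x, min_y = min(min_x, x), min(min_y, y)
        max_x, max_y = max(max_x, x), max(max_y, y)
    return [(min_x, min_y), (max_x, max_y)]


def bounding_box_zip(points):
    """Transpose with zip, then let min/max run over plain tuples."""
    xs, ys = zip(*points)
    return [(min(xs), min(ys)), (max(xs), max(ys))]


def make_points(n, seed=0):
    """Generate n reproducible random integer points (hypothetical helper)."""
    rng = random.Random(seed)
    return [(rng.randrange(-1000, 1000), rng.randrange(-1000, 1000))
            for _ in range(n)]


def compare(sizes=(1000, 10000), repeats=3, number=5):
    """Return {(function name, size): best seconds per call}."""
    results = {}
    for n in sizes:
        points = make_points(n)
        expected = bounding_box_naive(points)
        for func in (bounding_box_naive, bounding_box_single_pass, bounding_box_zip):
            # All variants must agree before their timings are worth comparing.
            assert func(points) == expected
            timings = timeit.repeat(lambda: func(points), repeat=repeats, number=number)
            results[(func.__name__, n)] = min(timings) / number
    return results


if __name__ == "__main__":
    for (name, n), seconds in compare().items():
        print(f"{name:28s} n={n:>9,d}  {seconds * 1e3:.3f} ms per call")
```

Taking the minimum over the repeats mirrors the "best of 3" figure that %timeit reports above.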