Question

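I am writing a method that takes an array of points as input and, for each point in the array, finds the closest point other than itself. I currently do this by brute force (checking every point against every other point). My current implementation does not sort the array, but I can sort it by p.x values using the CompareByX method. I am concerned about the running time of the algorithm, which gets very long for large values of n. I am not very knowledgeable about this subject and know very little about the different types of data structures, so any simple help would be great!

My current code is: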

import java.util.*;
import java.lang.*;
import java.io.*;

class My2dPoint {
  double x;
  double y;

  public My2dPoint(double x1, double y1) {
    x=x1;
    y=y1;
  }

}


class CompareByX implements Comparator<My2dPoint> {
    public int compare(My2dPoint p1, My2dPoint p2) {
        if (p1.x < p2.x) return -1;
        if (p1.x == p2.x) return 0;
        return 1;
    }
}

    /* An object of the above comparator class is used by java.util.Arrays.sort() in main to sort an array of points by x-coordinates */

class Auxiliaries {

    public static double distSquared(My2dPoint p1, My2dPoint p2) {
        double result;
        result = (p1.x-p2.x)*(p1.x-p2.x) + (p1.y-p2.y)*(p1.y-p2.y);
        return result;
    }

}

public class HW3 {
    public static void main (String argv []) throws IOException {
        int range = 1000000; // Range of x and y coordinates in points

        System.out.println("Enter the number of points");

        InputStreamReader reader1 = new InputStreamReader(System.in);
        BufferedReader buffer1 = new BufferedReader(reader1);
        String npoints = buffer1.readLine();
        int numpoints = Integer.parseInt(npoints);

        // numpoints is now the number of points we wish to generate

        My2dPoint inputpoints [] = new My2dPoint [numpoints];

        // array to hold points

        int closest [] = new int [numpoints];

        // array to record soln; closest[i] is index of point closest to i'th

        int px, py;
        double dx, dy, dist;
        int i,j;
        double currbest;
        int closestPointIndex;
        long tStart, tEnd;

        for (i = 0; i < numpoints; i++) {

          px = (int) ( range * Math.random());
          dx = (double) px;
          py = (int) (range * Math.random());
          dy = (double) py;
          inputpoints[i] = new My2dPoint(dx, dy);

        }

        // array inputpoints has now been filled



        tStart = System.currentTimeMillis();

        // find closest [0]


        closest[0] = 1;
        currbest = Auxiliaries.distSquared(inputpoints[0],inputpoints[1]);
        for (j = 2; j < numpoints; j++) {
           dist = Auxiliaries.distSquared(inputpoints[0],inputpoints[j]);
           if (dist < currbest) {
               closest[0] = j;
               currbest = dist;
           }
        }

        // now find closest[i] for every other i 

        for (i = 1; i < numpoints; i++) {
            closest[i] = 0;
            currbest = Auxiliaries.distSquared(inputpoints[i],inputpoints[0]);
            for (j = 1; j < i; j++) {
                dist = Auxiliaries.distSquared(inputpoints[i],inputpoints[j]);
                if (dist < currbest) {
                    closest[i] = j;
                    currbest = dist;
                }
            }

            for (j = i+1; j < numpoints; j++) {
                dist = Auxiliaries.distSquared(inputpoints[i],inputpoints[j]);
                if (dist < currbest) {
                    closest[i] = j;
                    currbest = dist;
                }
            }
        }

        tEnd = System.currentTimeMillis();
        System.out.println("Time taken in Milliseconds: " + (tEnd - tStart));
    }
}

Answer 1

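Brute-force nearest-neighbor search is only feasible for a small number of points.

You may want to look into kd-trees, or spatial data structures in general.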

Here is a demo for kd-Tree. This is what wikipedia says.

Answer 2

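I would definitely sort by x first. Then I would use the x distance between points as a quick rejection test: once you have the distance to one neighbor, any closer neighbor must also be closer in x. This avoids all the distSquared computations for points outside that x range. Each time you find a closer neighbor, you also tighten the range of x you need to search.

Also, if P2 is the nearest neighbor of P1, then I would use P1 as the initial guess for the nearest neighbor of P2.

Edit: On second thought, I would sort along whichever dimension has the largest range.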
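A minimal sketch of the sort-then-reject idea, in case it helps. The class name SweepNearest and the parallel xs/ys arrays are my own assumptions for illustration, not part of the question's code. It always sorts by x and does not use the "P1 as initial guess for P2" trick. The worst case is still quadratic (for example, when all points share the same x), but for scattered points most candidates are rejected by the x test.

```java
import java.util.Arrays;

class SweepNearest {
    /** Returns closest[i] = original index of the point nearest to point i (-1 if there is none). */
    static int[] nearest(double[] xs, double[] ys) {
        int n = xs.length;
        int[] closest = new int[n];
        Arrays.fill(closest, -1);

        // order[k] = original index of the k-th point when sorted by x
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Double.compare(xs[a], xs[b]));

        for (int k = 0; k < n; k++) {
            int i = order[k];
            double best = Double.POSITIVE_INFINITY; // best squared distance so far
            // step = -1 walks left in the sorted order, step = +1 walks right
            for (int step = -1; step <= 1; step += 2) {
                for (int m = k + step; m >= 0 && m < n; m += step) {
                    int j = order[m];
                    double dx = xs[i] - xs[j];
                    // Quick rejection: everything further along is at least this far away in x
                    if (dx * dx >= best) break;
                    double dy = ys[i] - ys[j];
                    double d = dx * dx + dy * dy;
                    if (d < best) {
                        best = d;
                        closest[i] = j;
                    }
                }
            }
        }
        return closest;
    }
}
```

The result is indexed by the original (unsorted) positions, so it can be dropped in where the question fills its closest array.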

Answer 3

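There are some fairly standard ways of improving this kind of search, and how complicated you want to get depends on how many points you are searching.

A fairly common and easy approach is to sort the points by X or Y. Then, for each point, you look for nearby points by going both forward and backward in the array. Remember how far away the nearest one you have found is; once the difference in X (or Y) is greater than that distance, there cannot be any closer point left to find.

You can also partition your space using a tree. Wikipedia has a page that gives some possible algorithms. Sometimes the cost of setting them up is greater than what you save. That is something you have to decide based on how many points you are searching.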
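For reference, here is a rough sketch of one such tree, a 2D kd-tree stored implicitly in a sorted index array. Everything here (the class name KdTreeSketch, the double[][] point layout, the nearestOther method) is a hypothetical illustration rather than any library's API. The build re-sorts each subrange, so it is simple rather than fast.

```java
import java.util.Arrays;

class KdTreeSketch {
    private final double[][] pts; // pts[i] = {x, y}
    private final Integer[] idx;  // point indices; the middle of each range is that subtree's root
    private int bestIdx;          // query state (not thread-safe)
    private double bestD;         // best squared distance found so far

    KdTreeSketch(double[][] pts) {
        this.pts = pts;
        idx = new Integer[pts.length];
        for (int i = 0; i < pts.length; i++) idx[i] = i;
        build(0, pts.length, 0);
    }

    // Sort the range [lo, hi) along the current axis, then recurse on both halves with the other axis.
    private void build(int lo, int hi, int axis) {
        if (hi - lo <= 1) return;
        final int a = axis;
        Arrays.sort(idx, lo, hi, (p, q) -> Double.compare(pts[p][a], pts[q][a]));
        int mid = (lo + hi) >>> 1;
        build(lo, mid, 1 - axis);
        build(mid + 1, hi, 1 - axis);
    }

    /** Index of the point nearest to point self, excluding self; -1 if there is no other point. */
    int nearestOther(int self) {
        bestIdx = -1;
        bestD = Double.POSITIVE_INFINITY;
        search(0, pts.length, 0, self);
        return bestIdx;
    }

    private void search(int lo, int hi, int axis, int self) {
        if (lo >= hi) return;
        int mid = (lo + hi) >>> 1;
        int node = idx[mid];
        if (node != self) {
            double dx = pts[self][0] - pts[node][0];
            double dy = pts[self][1] - pts[node][1];
            double d = dx * dx + dy * dy;
            if (d < bestD) {
                bestD = d;
                bestIdx = node;
            }
        }
        // Signed distance from the query point to this node's splitting line
        double diff = pts[self][axis] - pts[node][axis];
        if (diff < 0) {
            search(lo, mid, 1 - axis, self);                               // near side first
            if (diff * diff < bestD) search(mid + 1, hi, 1 - axis, self);  // far side only if it can still win
        } else {
            search(mid + 1, hi, 1 - axis, self);
            if (diff * diff < bestD) search(lo, mid, 1 - axis, self);
        }
    }
}
```

Build the tree once, then call nearestOther(i) for each i. Whether this beats the sorted-array approach depends on how many points you have, as noted above.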

Answer 4

Use a kd-tree, or use a good library for nearest-neighbor search. Weka includes one.

Answer 5

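Another possibility, simpler than creating a kd-tree, is to use a neighborhood matrix.

First, place all your points into a 2D square matrix. Then you can run a full or partial spatial sort, so the points become ordered inside the matrix.

Points with small Y move to the top rows of the matrix, and likewise points with large Y move to the bottom rows. The same happens with points with small X coordinates, which should move to the columns on the left. Symmetrically, points with large X values go to the columns on the right.

After the spatial sort (there are many ways to achieve this, with either serial or parallel algorithms), you can look up the nearest points to a given point P by visiting just the cells adjacent to the one where P is actually stored in the neighborhood matrix.

You can read more details about this idea in the following paper (you can find PDF copies of it online): Supermassive Crowd Simulation on GPU based on Emergent Behavior.

The sorting step gives you interesting choices. You can use the odd-even transposition sort described in the paper, which is very simple to implement (even in CUDA). If you run just one pass of it, you get a partial sort, which can already be useful if your matrix is nearly sorted. That is, if your points move slowly, it will save you a lot of computation.

If you need a full sort, you can run such an odd-even transposition pass several times (as described on the following Wikipedia page):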

http://en.wikipedia.org/wiki/Odd%E2%80%93even_sort

If the changes are small, one or two odd-even passes will be enough to get the array sorted again.
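To make the pass structure concrete, here is a small sketch of odd-even transposition sort on a plain 1D array. This is only the sorting building block, not the paper's 2D neighborhood matrix, and the class and method names are made up for illustration.

```java
class OddEvenSort {
    /**
     * One odd-even transposition pass: first compare pairs (0,1), (2,3), ...,
     * then pairs (1,2), (3,4), .... Returns true if anything was swapped.
     */
    static boolean pass(double[] a) {
        boolean swapped = false;
        for (int start = 0; start <= 1; start++) {
            // Pairs within one phase do not overlap, so each phase could run in parallel
            for (int i = start; i + 1 < a.length; i += 2) {
                if (a[i] > a[i + 1]) {
                    double tmp = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = tmp;
                    swapped = true;
                }
            }
        }
        return swapped;
    }

    /** Full sort: repeat passes until one of them changes nothing. */
    static void sort(double[] a) {
        while (pass(a)) {
            // keep going until a pass makes no swaps
        }
    }
}
```

Calling pass once on a nearly sorted array is the cheap "partial sort" case; sort is the repeat-until-stable version.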

Answer 6

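If your points are relatively close together, you can sort by distance from some reference point (I think it can be any point, but it may have to be one for which all the points lie in the same quadrant when that point is treated as the origin).

Say the point of interest is point A, at distance D from the reference point.

From the points within a relatively small number n of indices of point A in the sorted list, pick the closest one (a larger n may give a better initial guess, but will take longer). If that point is at straight-line distance g from point A, then you know the nearest point is at most g away from point A. So you only need to consider the points in the list whose distance from the reference point lies between D - g and D + g.

Drawing a diagram may help in understanding this. If anyone cares, I will add one.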
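Here is a sketch of how that window could be coded. It is a variant, not exactly the steps above: instead of fixing g from an initial guess, it walks outward from A in the sorted list and shrinks g whenever a closer point turns up, stopping once a candidate's distance from the reference point falls outside D - g to D + g. The bound follows from the triangle inequality, so as far as I can tell any reference point works. The class name PivotNearest and the parameters rx, ry (the reference point) are assumptions of mine.

```java
import java.util.Arrays;

class PivotNearest {
    /** closest[i] = original index of the point nearest to point i (-1 if none); (rx, ry) is the reference point. */
    static int[] nearest(double[] xs, double[] ys, double rx, double ry) {
        int n = xs.length;
        int[] closest = new int[n];
        Arrays.fill(closest, -1);

        double[] key = new double[n];   // key[i] = distance from point i to the reference point
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            key[i] = Math.hypot(xs[i] - rx, ys[i] - ry);
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(key[a], key[b]));

        for (int k = 0; k < n; k++) {
            int i = order[k];
            double g = Double.POSITIVE_INFINITY; // distance to the best neighbor found so far
            // step = -1 walks toward smaller keys, step = +1 toward larger keys
            for (int step = -1; step <= 1; step += 2) {
                for (int m = k + step; m >= 0 && m < n; m += step) {
                    int j = order[m];
                    // |D_j - D_i| <= dist(i, j), so outside the window nothing can be closer than g
                    if (Math.abs(key[j] - key[i]) >= g) break;
                    double d = Math.hypot(xs[i] - xs[j], ys[i] - ys[j]);
                    if (d < g) {
                        g = d;
                        closest[i] = j;
                    }
                }
            }
        }
        return closest;
    }
}
```

This uses actual distances rather than squared ones, because the window test compares differences of distances.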

Find the closest point to every point (nearest neighbor)
