Question

I have two sets of points (source and target). The goal is to find, for each source point, a unique 1:1 nearest neighbor among the target points.

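My attempt works as expected, but it is extremely slow. I have tested it on a few thousand points, but in the real scenario the point count will be in the millions. I am not an expert in the STL. Any suggestions on how I can optimize it?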

std::vector<UT_Vector3> targetPosVector;
for (auto i = 0; i < targetNumPoints; i++)
{
    auto pos = target->getPos3(i);

    targetPosVector.push_back(pos);
}

std::vector<int> uniqueNeighborVector;
for (auto ptoff = 0; ptoff < sourceNumPoints; ptoff++)
{
    std::vector<std::pair<int, fpreal>> nearpointVector; // neighbor vector in form of "(idx, dist)"

    auto pos = source->getPos3(ptoff);
    for (auto j = 0; j < targetNumPoints; j++)
    {
        fpreal dist = pos.distance(targetPosVector[j]);

        std::pair<int, fpreal> neighbor {j, dist};
        nearpointVector.push_back(neighbor);
    }
    std::sort(nearpointVector.begin(), nearpointVector.end(), [](const std::pair<int, fpreal> &left,
                                                                 const std::pair<int, fpreal> &right)
                                                                { return left.second < right.second; });

    std::vector<int> neighborVector;
    for (auto i : nearpointVector)
    {
        neighborVector.push_back(i.first);
    }

    // trying to imitate Python's next() function
    // uniqueNeighborList[]
    // uneighbor = next(i for i in neighborVector if i not in uniqueNeighborVector)
    // uniqueNeighborVector = set(uniqueNeighborList.append(uneighbor))
    for (auto i : neighborVector)
    {     
        if (std::find(uniqueNeighborVector.begin(), uniqueNeighborVector.end(), i) == uniqueNeighborVector.end())
        {
            int uneighbor = i; // later on binds to the geometry attribute

            uniqueNeighborVector.push_back(i);

            break;
        }
    }
}

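where:

source and target are detail geometry data
distance is a function that computes the distance between two vectors
getPos3 gets the position of a point as a 3-float vector
fpreal is a 64-bit float
UT_Vector3 is a 3-float vector
sourceNumPoints and targetNumPoints are the number of points in the source and target geometry, respectively.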

Answer 1

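As mentioned in the comments, quadratic complexity is the downfall when you try to compute this for millions of points. The posted code is in fact slightly worse than quadratic, because it sorts all target distances for every source point, which costs O(|source| |target| log |target|). Even if you optimize the code, the complexity stays at least quadratic as long as the approach does not change.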
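To make the point concrete, here is a minimal sketch of what a tuned version of the same brute-force approach could look like. It replaces the per-point sort with a single minimum scan and the std::find lookup with a flag array. Point, squaredDistance and greedyUniqueNearest are hypothetical names used only for illustration (Point stands in for UT_Vector3), and returning -1 when no free target is left is an assumption, since the original code does not handle that case. It is faster by a constant and a log factor, but still O(|source| |target|).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::array<double, 3>;  // hypothetical stand-in for UT_Vector3

// Squared Euclidean distance; comparing squared values avoids the sqrt.
double squaredDistance(const Point& a, const Point& b)
{
    double sum = 0.0;
    for (std::size_t k = 0; k < 3; ++k) {
        const double d = a[k] - b[k];
        sum += d * d;
    }
    return sum;
}

// For each source point, in order, pick the nearest target not yet taken.
// Stores -1 for a source point when every target has already been used.
std::vector<int> greedyUniqueNearest(const std::vector<Point>& source,
                                     const std::vector<Point>& target)
{
    assert(target.size() <= static_cast<std::size_t>(std::numeric_limits<int>::max()));

    std::vector<char> taken(target.size(), 0);  // O(1) "already used" test
    std::vector<int> result;
    result.reserve(source.size());

    for (const Point& s : source) {
        int best = -1;
        double bestDist = std::numeric_limits<double>::infinity();
        // One linear scan for the minimum instead of sorting all distances.
        for (std::size_t j = 0; j < target.size(); ++j) {
            if (taken[j]) continue;
            const double d = squaredDistance(s, target[j]);
            if (d < bestDist) {
                bestDist = d;
                best = static_cast<int>(j);
            }
        }
        if (best >= 0) taken[static_cast<std::size_t>(best)] = 1;
        result.push_back(best);
    }
    return result;
}
```

With a million points on each side this is still on the order of 10^12 distance evaluations, which is why a different data structure is needed.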

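This sounds like a classic nearest-neighbor (NN) problem in R3. A well-known approach is to use k-d trees, which allow O(n log n) construction time and, on average, O(log n) query time using linear space. Implementations can be found in various libraries: nanoflann, kdtree (these are from a quick search; I am sure there are also more elaborate libraries that include k-d trees).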

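Short recap: I would use a 3-d tree and build it on the target point set. Then take each source point (one by one) and find its nearest neighbor in the 3-d tree in O(log n) time, which results in O(|source| log |target|) time and O(|target|) space.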
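The recap above could be sketched roughly as follows. This is a hand-rolled 3-d tree for illustration only, not the API of nanoflann or any other library; KdTree3, nearestForEach and Point (a stand-in for UT_Vector3) are hypothetical names. Note that it returns the plain nearest neighbor: the 1:1 uniqueness requirement from the question is not handled here and would need extra bookkeeping on top.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

using Point = std::array<double, 3>;  // hypothetical stand-in for UT_Vector3

class KdTree3
{
public:
    explicit KdTree3(const std::vector<Point>& points)
        : pts_(points), order_(points.size())
    {
        assert(points.size() <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
        std::iota(order_.begin(), order_.end(), 0);
        build(0, order_.size(), 0);
    }

    // Index of the stored point closest to q, or -1 if the tree is empty.
    int nearest(const Point& q) const
    {
        int best = -1;
        double bestDist = std::numeric_limits<double>::infinity();
        search(0, order_.size(), 0, q, best, bestDist);
        return best;
    }

private:
    // Put the median along `axis` in the middle of order_[lo, hi),
    // then recurse into both halves using the next axis.
    void build(std::size_t lo, std::size_t hi, int axis)
    {
        if (hi - lo <= 1) return;
        const std::size_t mid = lo + (hi - lo) / 2;
        std::nth_element(order_.begin() + lo, order_.begin() + mid, order_.begin() + hi,
                         [&](int a, int b) { return pts_[a][axis] < pts_[b][axis]; });
        build(lo, mid, (axis + 1) % 3);
        build(mid + 1, hi, (axis + 1) % 3);
    }

    void search(std::size_t lo, std::size_t hi, int axis, const Point& q,
                int& best, double& bestDist) const
    {
        if (lo >= hi) return;
        const std::size_t mid = lo + (hi - lo) / 2;
        const Point& p = pts_[order_[mid]];

        double d = 0.0;  // squared distance from q to the node's point
        for (int k = 0; k < 3; ++k) d += (p[k] - q[k]) * (p[k] - q[k]);
        if (d < bestDist) { bestDist = d; best = order_[mid]; }

        const double delta = q[axis] - p[axis];
        const int next = (axis + 1) % 3;
        // Visit the half containing q first; visit the other half only if
        // the splitting plane is closer than the best match found so far.
        if (delta < 0) {
            search(lo, mid, next, q, best, bestDist);
            if (delta * delta < bestDist) search(mid + 1, hi, next, q, best, bestDist);
        } else {
            search(mid + 1, hi, next, q, best, bestDist);
            if (delta * delta < bestDist) search(lo, mid, next, q, best, bestDist);
        }
    }

    std::vector<Point> pts_;   // copy of the target positions
    std::vector<int> order_;   // point indices arranged as an implicit tree
};

// Build the tree once on target, then query it once per source point.
std::vector<int> nearestForEach(const std::vector<Point>& source,
                                const std::vector<Point>& target)
{
    const KdTree3 tree(target);
    std::vector<int> result;
    result.reserve(source.size());
    for (const Point& s : source) result.push_back(tree.nearest(s));
    return result;
}
```

The tree is stored implicitly in an index array, so the only extra memory is the copy of the target positions plus one int per target point. In practice a tested library is likely the better choice; the sketch is only meant to show where the O(|source| log |target|) behavior comes from.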

STL: sorting and searching large amounts of data
