Question

In my main.cpp I have this excerpt:

Ptr<FastFeatureDetector> fastDetector = FastFeatureDetector::create(80, true);

while (true) {
    Mat image = // get grayscale image 1280x720

    timer.start();
    fastDetector->detect(image, keypoints);
    myfile << "FAST\t" << timer.end() << endl; // timer.end() is how many seconds elapsed since last timer.start()


    keypoints.clear();

    timer.start();
    for (int i = 3; i < image.rows - 3; i++)
    {
        for (int j = 3; j < image.cols - 3; j++)
        {
            if (inspectPoint(image.data, image.cols, i, j)) {
                // this block is never entered
                KeyPoint keypoint(j, i, 3); // KeyPoint takes (x, y, size): x is the column, y is the row
                keypoints.push_back(keypoint);
            }
        }
    }
    myfile << "Custom\t" << timer.end() << endl;
    myfile << endl;
    myfile.flush();
    ...
}

myfile contains:

FAST    0.000515495
Custom  0.00221361

FAST    0.000485697
Custom  0.00217653

FAST    0.000490001
Custom  0.00219044

FAST    0.000484373
Custom  0.00216329

FAST    0.000561184
Custom  0.00233214

Given those timings, you would expect inspectPoint() to be a function that does some real work. Here it is:

bool inspectPoint(const uchar* img, int cols, int i, int j) {
    uchar p = img[i * cols + j];
    uchar pt = img[(i - 3)*cols + j];
    uchar pr = img[i*cols + j + 3];
    uchar pb = img[(i + 3)*cols + j];
    uchar pl = img[i*cols + j - 3];

    return cols < pt - pr + pb - pl + i; // just random check so that the optimizer doesn't skip any calculations
}
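
For anyone who wants to reproduce the custom-loop measurement without OpenCV, here is a minimal sketch. The `Timer` class and `countCustom()` are hypothetical stand-ins (the original `timer` object is not shown), and `uchar` is spelled `std::uint8_t`. The image is assumed to be stored contiguously, one byte per pixel.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the `timer` object used in the excerpt.
class Timer {
public:
    void start() { begin_ = std::chrono::steady_clock::now(); }
    // Seconds elapsed since the last start().
    double end() const {
        return std::chrono::duration<double>(
                   std::chrono::steady_clock::now() - begin_).count();
    }
private:
    std::chrono::steady_clock::time_point begin_ = std::chrono::steady_clock::now();
};

// Same check as in the question.
bool inspectPoint(const std::uint8_t* img, int cols, int i, int j) {
    std::uint8_t pt = img[(i - 3) * cols + j];
    std::uint8_t pr = img[i * cols + j + 3];
    std::uint8_t pb = img[(i + 3) * cols + j];
    std::uint8_t pl = img[i * cols + j - 3];
    return cols < pt - pr + pb - pl + i;
}

// Runs the nested loop from the question over a rows x cols buffer
// and returns how many points passed the check.
std::size_t countCustom(const std::vector<std::uint8_t>& img, int rows, int cols) {
    std::size_t hits = 0;
    for (int i = 3; i < rows - 3; i++) {
        for (int j = 3; j < cols - 3; j++) {
            if (inspectPoint(img.data(), cols, i, j)) {
                ++hits;
            }
        }
    }
    return hits;
}
```

For a 1280x720 image the check can never succeed: `pt - pr + pb - pl` is at most 510 and `i` is at most 716, so the right-hand side never exceeds 1226, which is below `cols` (1280). That matches the "never entered" comment in the excerpt.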

I'm using Visual Studio 2013 with optimization set to "Full Optimization (/Ox)".

As far as I know, the FAST algorithm visits every pixel? I would not expect it to be able to process each pixel faster than inspectPoint() does.

How is the FAST detector so fast? Or rather, why is the nested loop so slow?

Answer 1

From a quick look at the source code, it appears that fastFeatureDetector is extensively optimized with SSE and OpenCL: github.com/Itseez/opencv/blob/master/modules/features2d/src/

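Neither SSE nor OpenCL is tied to one particular CPU model. SSE exploits the CPU's ability to apply a single instruction (one calculation) to multiple pieces of data at once; with 8-bit grayscale pixels, one 128-bit SSE register holds 16 values. Depending on the CPU architecture, this can speed things up by 2x or by more than 4x. OpenCL can make use of the GPU, which can also bring significant performance gains for some image-processing operations.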
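
To illustrate the "single instruction, multiple data" idea, here is a minimal sketch that counts pixels brighter than a threshold, first one pixel at a time and then 16 pixels at a time with SSE2 intrinsics. This is not OpenCV's code; the function names are made up for the example, and it falls back to the scalar loop when SSE2 is not available.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics (x86 / x86-64)
#define SKETCH_HAVE_SSE2 1
#endif

// Scalar reference: one comparison per pixel.
std::size_t countBrighterScalar(const std::uint8_t* px, std::size_t n,
                                std::uint8_t threshold) {
    std::size_t count = 0;
    for (std::size_t k = 0; k < n; ++k) {
        if (px[k] > threshold) ++count;
    }
    return count;
}

// SIMD version: 16 pixels per loop iteration when SSE2 is available.
std::size_t countBrighterSimd(const std::uint8_t* px, std::size_t n,
                              std::uint8_t threshold) {
    std::size_t count = 0;
    std::size_t k = 0;
#ifdef SKETCH_HAVE_SSE2
    const __m128i thr = _mm_set1_epi8(static_cast<char>(threshold));
    const __m128i zero = _mm_setzero_si128();
    for (; k + 16 <= n; k += 16) {
        // Unaligned load of 16 consecutive pixels.
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(px + k));
        // Unsigned saturating subtract: a lane is non-zero exactly when px > threshold.
        __m128i diff = _mm_subs_epu8(v, thr);
        // One mask bit per lane, set where the lane is zero (not brighter).
        int notBrighter = _mm_movemask_epi8(_mm_cmpeq_epi8(diff, zero));
        unsigned brighter = ~static_cast<unsigned>(notBrighter) & 0xFFFFu;
        // Count the set bits (one per brighter pixel).
        for (; brighter != 0; brighter &= brighter - 1) ++count;
    }
#endif
    // Remaining pixels (or all of them without SSE2).
    for (; k < n; ++k) {
        if (px[k] > threshold) ++count;
    }
    return count;
}
```

Both functions return the same count; the SIMD version simply issues far fewer instructions per pixel, which is the kind of gain the answer describes.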
