Question

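I noticed that my Boost MultiArrays perform very badly compared to STL vectors. I asked this question earlier, and the most-liked answer stated that

1) Boost is nearly as fast as a native array

2) You need to change the order in which you access the data elements to get the best performance out of Boost MultiArray. You also need to run in Release mode, not Debug.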
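To make the access-order advice in point 2 concrete, here is a minimal sketch. It assumes the default row-major (C) storage order of `boost::multi_array`, in which the last index is the contiguous one. `rowMajorOffset` and `innerLoopStride` are hypothetical helper names used only for illustration; they are not part of Boost.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: linear offset of element [x][y] in a 2-D array
// stored in row-major (C) order with extents [xSize][ySize].
inline std::size_t rowMajorOffset(std::size_t x, std::size_t y,
                                  std::size_t xSize, std::size_t ySize)
{
    assert(x < xSize && y < ySize);
    return x * ySize + y;
}

// Hypothetical helper: distance, in elements, between two consecutive
// inner-loop iterations, for either choice of inner-loop index.
inline std::size_t innerLoopStride(bool innerLoopOverY,
                                   std::size_t xSize, std::size_t ySize)
{
    const std::size_t first = rowMajorOffset(0, 0, xSize, ySize);
    const std::size_t next = innerLoopOverY
        ? rowMajorOffset(0, 1, xSize, ySize)   // y advances in the inner loop
        : rowMajorOffset(1, 0, xSize, ySize);  // x advances in the inner loop
    return next - first;
}
```

Under that assumption, with 400 x 400 extents an inner loop over `y` moves one element per iteration, while an inner loop over `x` jumps 400 elements each time.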

Well, I did all of that, and the performance of my MultiArrays is still very poor.

I am posting my code here:

A) With default ordering

#include <windows.h>
#define _SCL_SECURE_NO_WARNINGS
#define BOOST_DISABLE_ASSERTS 
#include <boost/multi_array.hpp>
#include <stdio.h>
#include <conio.h>
#include <iostream>

int main(int argc, char* argv[])
{
    const int X_SIZE = 400;
    const int Y_SIZE = 400;
    const int ITERATIONS = 500;
    unsigned int startTime = 0;
    unsigned int endTime = 0;

    // Create the boost array
    typedef boost::multi_array<double, 2> ImageArrayType;
    ImageArrayType boostMatrix(boost::extents[X_SIZE][Y_SIZE]);

    // Create the native array
    double *nativeMatrix = new double [X_SIZE * Y_SIZE]();  // value-initialized so the loop never reads indeterminate values

    //------------------Measure boost----------------------------------------------
    startTime = ::GetTickCount();
    for (int i = 0; i < ITERATIONS; ++i)
    {
        for (int y = 0; y < Y_SIZE; ++y)
        {
            for (int x = 0; x < X_SIZE; ++x)
            {
                boostMatrix[x][y] *= 2.345;
            }
        }
    }
    endTime = ::GetTickCount();
    printf("[Boost] Elapsed time: %6.3f seconds\n", (endTime - startTime) / 1000.0);

    //------------------Measure native-----------------------------------------------
    startTime = ::GetTickCount();
    for (int i = 0; i < ITERATIONS; ++i)
    {
        for (int y = 0; y < Y_SIZE; ++y)
        {
            for (int x = 0; x < X_SIZE; ++x)
            {
                nativeMatrix[x + (y * X_SIZE)] *= 2.345;
            }
        }
    }
    endTime = ::GetTickCount();
    printf("[Native]Elapsed time: %6.3f seconds\n", (endTime - startTime) / 1000.0);

    return 0;
}

B) With inverted ordering

#include <windows.h>
#define _SCL_SECURE_NO_WARNINGS
#define BOOST_DISABLE_ASSERTS 
#include <boost/multi_array.hpp>
#include <stdio.h>
#include <conio.h>
#include <iostream>

int main(int argc, char* argv[])
{
    const int X_SIZE = 400;
    const int Y_SIZE = 400;
    const int ITERATIONS = 500;
    unsigned int startTime = 0;
    unsigned int endTime = 0;

    // Create the boost array
    typedef boost::multi_array<double, 2> ImageArrayType;
    ImageArrayType boostMatrix(boost::extents[X_SIZE][Y_SIZE]);

    // Create the native array
    double *nativeMatrix = new double [X_SIZE * Y_SIZE]();  // value-initialized so the loop never reads indeterminate values

    //------------------Measure boost----------------------------------------------
    startTime = ::GetTickCount();
    for (int i = 0; i < ITERATIONS; ++i)
    {
        for (int x = 0; x < X_SIZE; ++x)
        {
            for (int y = 0; y < Y_SIZE; ++y)
            {
                boostMatrix[x][y] *= 2.345;
            }
        }
    }
    endTime = ::GetTickCount();
    printf("[Boost] Elapsed time: %6.3f seconds\n", (endTime - startTime) / 1000.0);

    //------------------Measure native-----------------------------------------------
    startTime = ::GetTickCount();
    for (int i = 0; i < ITERATIONS; ++i)
    {
        for (int x = 0; x < X_SIZE; ++x)
        {
            for (int y = 0; y < Y_SIZE; ++y)
            {
                nativeMatrix[x + (y * X_SIZE)] *= 2.345;
            }
        }
    }
    endTime = ::GetTickCount();
    printf("[Native]Elapsed time: %6.3f seconds\n", (endTime - startTime) / 1000.0);

    return 0;
}

In every possible permutation, my benchmarks come out roughly the same:

1) For native code: 0.15s

2) For Boost MultiArray: 1.0s

I am using Visual Studio 2010.
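One caveat about figures like these: `GetTickCount` is documented as having a resolution that is typically in the range of 10 to 16 milliseconds, and a compiler is free to drop a loop whose results are never used. The following is a minimal sketch of a timing harness that avoids both issues. It assumes a C++11 standard library that provides `<chrono>`; `secondsTaken` and `scaleAndSum` are hypothetical names, not part of the code above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: run `work` once and return the elapsed time in
// seconds, measured with std::chrono::steady_clock.
template <typename Work>
double secondsTaken(Work work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical workload mirroring the native loop: scale every element
// `iterations` times, then return a checksum so the result is used.
inline double scaleAndSum(std::vector<double>& data, int iterations, double factor)
{
    assert(iterations >= 0);
    for (int i = 0; i < iterations; ++i)
        for (std::size_t k = 0; k < data.size(); ++k)
            data[k] *= factor;

    double sum = 0.0;
    for (std::size_t k = 0; k < data.size(); ++k)
        sum += data[k];
    return sum;
}
```

A call such as `secondsTaken([&] { checksum = scaleAndSum(data, 500, 2.345); })` would time the whole workload, and printing `checksum` afterwards keeps the computation observable.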

My question is: given that I am using Visual Studio, how can I get good performance out of Boost MultiArrays?

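UPDATE:

I switched to Visual Studio 2013 and enabled the /Qvec-report:2 compiler switch. Interestingly, when I compiled, I started getting info messages saying that the compiler was unable to vectorize. Here is a sample message; it looks almost like a warning. I got several such messages even for the simplest code.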

--- Analyzing function: void __cdecl `vector constructor iterator'(void * __ptr64,unsigned __int64,int,void * __ptr64 (__cdecl*)(void * __ptr64))
1> D:\Workspace\test\Scrap\Scrap\Source.cpp : info C5002: loop not vectorized due to reason '1301'

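I think this is a major clue as to why Boost MultiArrays run slowly under my Visual Studio while they perform well with GCC. Given this additional information, can you think of a way to solve the problem?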
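For context, Microsoft's list of vectorizer messages describes reason code 1301 as "Loop stride is not +1". The function named in the message is a compiler-generated `vector constructor iterator` rather than the benchmark loop, so it may not be the loop that matters here. Still, one experiment that might be worth trying is a single flat loop over the contiguous storage, sketched below. `scaleContiguous` is a hypothetical helper, and the sketch assumes the array's elements are laid out contiguously.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: multiply `count` contiguous doubles by `factor`.
// The loop has a single induction variable that advances by exactly +1.
inline void scaleContiguous(double* data, std::size_t count, double factor)
{
    assert(data != nullptr || count == 0);
    for (std::size_t k = 0; k < count; ++k)
        data[k] *= factor;
}
```

For a `boost::multi_array<double, 2>` in its default storage order, a call such as `scaleContiguous(boostMatrix.data(), boostMatrix.num_elements(), 2.345)` would cover the whole array. Whether Visual Studio then vectorizes the loop is something the vectorizer report would have to confirm.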

@Admins: Please unmark my question as previously answered. I have made a major edit.

Boost MultiArrays performance is poor

0 answers: