Question

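I am currently reading 17 images (24-bit, 1200 x 1600). Reading the 17 images takes about 0.078 seconds. However, I want to convert each memory block of size 5760000 into a black-and-white image of size 1920000 so that I can run my Laplacian edge detection on it. Right now I am using the following method: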

images.resize(rows * cols);
images.reserve(rows * cols);

for(int z = 0; z < rows * cols; z ++){
    pix.blue = (int) *(pChar + z * 3);
    pix.green = (int) *(pChar + z * 3 + 1);
    pix.red = (int) *(pChar + z * 3 + 2);
    pix.black_white = pix.blue * .11 + pix.green * .59 + pix.red *.3;
    images.at(z).black_white = pix.blue * .11 + pix.green * .59 + pix.red *.3;
}

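However, reading the pChar memory block and writing into the vector of size 1920000 takes a total of 2.262 seconds for the 17 images. Is there a faster way to approach this?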

I have also tried the different code below, but pChar2 keeps showing up as a bad ptr in VS2010's debug mode :( (the data_grey, pChar, and pChar2 variables are all unsigned char*)

pChar = (unsigned char*) malloc (sizeof(char)*3*rows*cols);
pChar2 = (unsigned char*) malloc (sizeof(char) * rows * cols);
fread(pChar, sizeof(char), 3*rows*cols, bmpInput);
images.at(i).data = pChar;

for(int z = 0; z < rows * cols; z ++){
    pix.blue = (int) *(pChar + z * 3);
    pix.green = (int) *(pChar + z * 3 + 1);
    pix.red = (int) *(pChar + z * 3 + 2);
    pix.black_white = pix.blue * .11 + pix.green * .59 + pix.red *.3;
    pChar2 += (unsigned char) pix.black_white;
}
    images.at(i).data_grey = pChar2;

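My guess is that I am writing to the pChar2 memory block the wrong way. But this second method is much faster, so I would like to know how to fix it. Ideally I would end up with a black-and-white memory block in images.at(i).data_grey. I mainly want to do this because it was much faster than the vector, but am I doing something wrong in the vector-based code that makes it comparatively slow? (Personally I find vectors easier to use, but if I really need the speed I will work with memory blocks, if they are supposed to be faster.)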

Answer 1

I think all you need to do is change

pChar2 += (unsigned char) pix.black_white;

to

pChar2[z] = (unsigned char) pix.black_white;

assuming you are trying to do what I think you are (assign a value to the memory that pChar2 points to, then move on to the next 8-bit slot?)
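To spell out why the original line fails: `pChar2 += value` adds the pixel value to the pointer itself, so nothing is stored and the pointer drifts away from the start of the allocation, which would explain the bad ptr shown in the debugger. A minimal sketch of the indexed write follows. The function name `bgr_to_grey` is hypothetical (it is not from the original post), and the sketch assumes the input is tightly packed BGR with no row padding and that the output buffer holds at least `pixels` bytes.

```cpp
#include <cstddef>

// Hypothetical helper, not from the original post.
// Assumes `bgr` holds pixels * 3 bytes in blue, green, red order with no
// row padding, and that `grey` has room for `pixels` bytes.
void bgr_to_grey(const unsigned char* bgr, unsigned char* grey, std::size_t pixels)
{
    for (std::size_t z = 0; z < pixels; ++z) {
        const unsigned char blue  = bgr[z * 3];
        const unsigned char green = bgr[z * 3 + 1];
        const unsigned char red   = bgr[z * 3 + 2];
        // Store into slot z; the base pointer itself is never modified,
        // so it can still be handed to free() or stored afterwards.
        grey[z] = static_cast<unsigned char>(blue * .11 + green * .59 + red * .3);
    }
}
```

With the question's variables this would be called as `bgr_to_grey(pChar, pChar2, rows * cols);` and `pChar2` would still point at the first grey pixel when it is assigned to `data_grey`.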

Answer 2

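Do not use the bounds-checking element accessor at(). It checks your index on every call, because it has to throw an exception if you pass an out-of-bounds index. That should never happen here, since you sized the vector up front.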

So you should use the non-bounds-checking operator[] instead.
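The difference between the two accessors can be sketched as follows. Both helper names (`at_throws`, `fill_unchecked`) are made up for illustration, and the element type `unsigned char` is an assumption; the question stores a struct per pixel.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reports whether at() rejected the index.
bool at_throws(const std::vector<unsigned char>& v, std::size_t index)
{
    try {
        (void)v.at(index);          // checked: compares index against size()
        return false;
    } catch (const std::out_of_range&) {
        return true;                // at() throws for index >= size()
    }
}

// Hypothetical helper: the loop bound already guarantees a valid index,
// so the unchecked operator[] is sufficient here.
void fill_unchecked(std::vector<unsigned char>& v, unsigned char value)
{
    for (std::size_t i = 0; i < v.size(); ++i)
        v[i] = value;
}
```

Debug builds typically add extra checks of their own, so it is probably worth timing a release build before concluding how much the accessor choice matters.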

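You can also write directly into the vector as if it were a C array. Some purists may be unhappy about this, but I think it is fine to do with vectors: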

images.resize(rows*cols);
unsigned char *bwPixel = &images[0];

for( ... )
{
    // ...
    *bwPixel++ = (unsigned char) (pix.blue * .11 + pix.green * .59 + pix.red *.3);
}
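Filled in, the snippet above might look like the sketch below. It assumes the grey image is stored as a `std::vector<unsigned char>` (the question's `images` vector holds a struct instead), that the input is tightly packed BGR, and the function name `to_grey` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not from the original post.
// Assumes `bgr` holds rows * cols * 3 bytes in blue, green, red order.
std::vector<unsigned char> to_grey(const unsigned char* bgr,
                                   std::size_t rows, std::size_t cols)
{
    std::vector<unsigned char> grey(rows * cols);
    if (grey.empty())
        return grey;                    // &grey[0] is only valid when non-empty

    unsigned char* bwPixel = &grey[0];  // vector storage is contiguous
    const unsigned char* src = bgr;
    for (std::size_t z = 0; z < rows * cols; ++z, src += 3) {
        // src[0] = blue, src[1] = green, src[2] = red
        *bwPixel++ = static_cast<unsigned char>(src[0] * .11 + src[1] * .59 + src[2] * .3);
    }
    return grey;
}
```

This keeps the vector's automatic memory management while the inner loop touches memory the same way the raw-pointer version does.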

Improving program speed: vector speed vs. memory block speed
