Question

Let's start with some code:

QByteArray OpenGLWidget::modifyImage(QByteArray imageArray, const int width, const int height){
    if (vertFlip){
        /* Each pixel consists of four unsigned chars: Red Green Blue Alpha.
         * The field is normally 640*480, which means the whole picture is in fact 640*4 uChars wide.
         * The whole ByteArray is one-dimensional, which means that index 640*4 is the red of the first pixel of the second row.
         * This function is EXTREMELY SLOW
         */
        QByteArray tempArray = imageArray;
        for (int h = 0; h < height; ++h){
            for (int w = 0; w < width/2; ++w){
                for (int i = 0; i < 4; ++i){
                    imageArray.data()[h*width*4 + 4*w + i] = tempArray.data()[h*width*4 + (4*width - 4*w) + i ];
                    imageArray.data()[h*width*4 + (4*width - 4*w) + i] = tempArray.data()[h*width*4 + 4*w + i];
                }
            }
        }
    }
    return imageArray;
}

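This is the code I currently use to vertically flip a 640*480 image (the image is not actually guaranteed to be 640*480, but it mostly is). The color encoding is RGBA, which means the total array size is 640*480*4. I receive the images at 30 FPS, and I would like to show them on screen at the same FPS.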

On an older CPU (Athlon x2) this code is simply too much: the CPU struggles to keep up with 30 FPS. So the question is: can I do this more efficiently?

I am also using OpenGL. Does it have a gimmick I am not aware of that can flip the image with relatively low CPU/GPU usage?

Answer 1

According to this question, you can flip an image in OpenGL by scaling it by (1,-1,1). This question explains how to apply transformations and scaling.
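To see why a negative scale amounts to a flip, here is a minimal sketch in plain C++ with no OpenGL calls. The `Vertex` struct and the `scaled` and `flipped_quad` helpers are made up for illustration; in a real renderer the same factors would go into the model or texture matrix instead. It assumes the quad is centered on the origin, so the mirrored quad stays in place.

```cpp
#include <array>

// Hypothetical vertex type for illustration: a position plus a texture coordinate.
struct Vertex {
    float x, y, z;  // position
    float u, v;     // texture coordinate
};

// Apply a per-axis scale to the position only, as a scale matrix would.
Vertex scaled(Vertex p, float sx, float sy, float sz) {
    p.x *= sx;
    p.y *= sy;
    p.z *= sz;
    return p;
}

// Build a textured quad centered on the origin and scale it by (1,-1,1).
std::array<Vertex, 4> flipped_quad() {
    std::array<Vertex, 4> quad = {{
        {-1.0f, -1.0f, 0.0f, 0.0f, 0.0f},  // bottom-left,  texcoord (0,0)
        { 1.0f, -1.0f, 0.0f, 1.0f, 0.0f},  // bottom-right, texcoord (1,0)
        { 1.0f,  1.0f, 0.0f, 1.0f, 1.0f},  // top-right,    texcoord (1,1)
        {-1.0f,  1.0f, 0.0f, 0.0f, 1.0f},  // top-left,     texcoord (0,1)
    }};
    for (Vertex& p : quad)
        p = scaled(p, 1.0f, -1.0f, 1.0f);  // positions mirror top to bottom
    return quad;
}
```

The texture coordinates stay attached to their vertices while the positions are mirrored, so the image appears upside down without any pixel being touched. Note that a negative scale reverses the winding order of the quad, so back-face culling may need to be disabled or the front-face winding swapped.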

Answer 2

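You could at least improve it by working block-wise, making use of the cache architecture. In your example, one of the accesses (either the read or the write) will be a cache miss.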

Answer 3

For starters, it can help to "capture" each scanline if you are using two loops to walk over the pixels of an image, like so:

for (int y = 0; y < height; ++y)
{
    // Capture scanline.
    char* scanline = imageArray.data() + y*width*4;

    for (int x = 0; x < width/2; ++x)
    {
        const int flipped_x = width - x-1;
        for (int i = 0; i < 4; ++i)
            std::swap(scanline[x*4 + i], scanline[flipped_x*4 + i]);  // std::swap from <utility>
    }
}

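Another thing to note is that I am using swap instead of a temporary image. That tends to be more efficient, since you can swap using registers instead of loading pixels from a copy of the entire image.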

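It also generally helps to use a 32-bit integer instead of working one byte at a time if you are going to do anything like this. If you are working with pixels stored as 8-bit types but know that each pixel is 32 bits, as in your case, you can generally get away with casting the data to uint32_t*, e.g.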

for (int y = 0; y < height; ++y)
{
    uint32_t* scanline = (uint32_t*)imageArray.data() + y*width;
    std::reverse(scanline, scanline + width);
}

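At this point you might parallelize the y loop. Flipping an image horizontally (it should be "horizontally", if I understood your original code correctly) this way is a little tricky for the access patterns, but you should be able to get quite a decent boost from the techniques above.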
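Put together as a self-contained sketch, assuming the pixels are already held as one 32-bit value each in a `std::vector<std::uint32_t>` rather than in a `QByteArray` (the function names are mine, not from the question). `mirror_rows` is the left-to-right mirror that the original loop performs. `flip_rows` is a true top-to-bottom flip, in case that is what is actually wanted, and swaps whole scanlines instead of individual pixels.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirror every scanline left to right, in place.
// Assumes pixels.size() == width * height, one uint32_t per RGBA pixel.
void mirror_rows(std::vector<std::uint32_t>& pixels,
                 std::size_t width, std::size_t height) {
    for (std::size_t y = 0; y < height; ++y) {
        std::uint32_t* scanline = pixels.data() + y * width;
        std::reverse(scanline, scanline + width);
    }
}

// Flip the image top to bottom, in place, by swapping whole scanlines.
// Same size assumption as above.
void flip_rows(std::vector<std::uint32_t>& pixels,
               std::size_t width, std::size_t height) {
    for (std::size_t y = 0; y < height / 2; ++y) {
        std::uint32_t* top = pixels.data() + y * width;
        std::uint32_t* bottom = pixels.data() + (height - 1 - y) * width;
        std::swap_ranges(top, top + width, bottom);
    }
}
```

Both functions touch each pixel once and need no copy of the whole image. Each iteration of the y loop works on its own scanline (or pair of scanlines), so the iterations are independent of one another.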

"I am also using OpenGL. Does it have a gimmick I am not aware of that can flip the image with relatively low CPU/GPU usage?"

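Naturally, the fastest way to flip images is to not touch their pixels at all and to save the flip for the final part of the pipeline, when you render the result. For this you can render the texture in OGL with negative scaling instead of modifying the pixels of the texture.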

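Another thing that is really useful in video and image processing is to represent every image you process like this, for all of your image operations: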

struct Image32
{
     uint32_t* pixels;
     int32_t width;
     int32_t height;
     int32_t x_stride;
     int32_t y_stride;
};

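The stride fields are what you use to get from one scanline (row) of the image to the next vertically, and from one column to the next horizontally. With this representation you can use negative values for the strides and offset the pixels pointer accordingly. You can also use the stride fields to, for example, render only every other scanline of an image for fast interactive half-res scanline previews, using y_stride=width*2 and height/=2. You can quarter-res an image by setting the x stride to 2 and the y stride to 2*width and then halving the width and height. And you can render a cropped image, without making your blit functions accept a boatload of parameters, just by modifying these fields and leaving the y stride as it was, so that it still steps from one row of the cropped section to the next: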

// Using the stride representation of Image32, this can now
// blit a cropped source, a horizontally flipped source, 
// a vertically flipped source, a source flipped both ways,
// a half-res source, a quarter-res source, a quarter-res
// source that is horizontally flipped and cropped, etc,
// and all without modifying the source image in advance
// or having to accept all kinds of extra drawing parameters.
void blit(int dst_x, int dst_y, Image32 dst, Image32 src);

// We don't have to do things like this (and I think I lost
// some capabilities with this version below but it hurts my 
// brain too much to think about what capabilities were lost):
void blit_gross(int dst_x, int dst_y, int dst_w, int dst_h, uint32_t* dst, 
                int src_x, int src_y, int src_w, int src_h, 
                const uint32_t* src, bool flip_x, bool flip_y);

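By using negative stride values and passing the image to an image operation (e.g. a blit operation), the result naturally comes out flipped without the image ever having to be flipped. The flip effectively happens at draw time, so to speak, just as when using OGL with a negative scaling transformation matrix.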
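As a rough sketch of how this could look, assuming strides are counted in pixels (not bytes) and that images are non-empty: the helpers `make_image`, `flipped_x`, `flipped_y` and `cropped` are hypothetical names of mine, and this `blit` does no clipping, so it is only a starting point and not a full implementation of the declaration above.

```cpp
#include <cstdint>

struct Image32 {
    std::uint32_t* pixels;
    std::int32_t width;
    std::int32_t height;
    std::int32_t x_stride;  // pixels to step to reach the next column
    std::int32_t y_stride;  // pixels to step to reach the next row
};

// View of a tightly packed buffer: x_stride is 1, y_stride is the width.
Image32 make_image(std::uint32_t* pixels, std::int32_t width, std::int32_t height) {
    return Image32{pixels, width, height, 1, width};
}

// Same pixels read right to left: start at the last column, step backwards.
Image32 flipped_x(Image32 img) {
    img.pixels += (img.width - 1) * img.x_stride;
    img.x_stride = -img.x_stride;
    return img;
}

// Same pixels read bottom to top: start at the last row, step backwards.
Image32 flipped_y(Image32 img) {
    img.pixels += (img.height - 1) * img.y_stride;
    img.y_stride = -img.y_stride;
    return img;
}

// Sub-rectangle of img; the strides stay the same.
// Assumes the rectangle lies entirely inside img.
Image32 cropped(Image32 img, std::int32_t x, std::int32_t y,
                std::int32_t w, std::int32_t h) {
    img.pixels += y * img.y_stride + x * img.x_stride;
    img.width = w;
    img.height = h;
    return img;
}

// Copy src into dst with its top-left corner at (dst_x, dst_y).
// No clipping: assumes src fits entirely inside dst.
void blit(int dst_x, int dst_y, Image32 dst, Image32 src) {
    for (std::int32_t y = 0; y < src.height; ++y) {
        const std::uint32_t* s = src.pixels + y * src.y_stride;
        std::uint32_t* d = dst.pixels + (dst_y + y) * dst.y_stride
                                      + dst_x * dst.x_stride;
        for (std::int32_t x = 0; x < src.width; ++x)
            d[x * dst.x_stride] = s[x * src.x_stride];
    }
}
```

With these, `blit(0, 0, dst, flipped_x(src))` draws a mirrored copy and `blit(0, 0, dst, flipped_x(flipped_y(src)))` draws one flipped both ways, while the source buffer itself is never modified.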

Vertically flipping a char array: is there a more efficient way?
