Question

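I'm working on a small C project involving threads and mutexes. The program applies filters to bmp images. The goal of the project is to implement a program that can handle this command line: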

$ ./filter -f filter1[,filter2[,...]] -t numThreads1[,numThreads2[,...]] input-folder output-folder

where -f lists the filters I want to apply ("red", "blue", "green", "grayscale" and "blur") and -t is the number of threads allocated to each filter.

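So far everything works fine except for blur, where I'm stuck on a data race (or at least I think that's what it is). The blur filter works like this: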

/* Add a Gaussian blur to an image using
* this 3X3 matrix as weights matrix:
*   0.0  0.2  0.0
*   0.2  0.2  0.2
*   0.0  0.2  0.0
*
* If we consider the red component in this image
* (every element has a value between 0 and 255)
*
*   1  2  5  2  0  3
*      -------
*   3 |2  5  1| 6  0       0.0*2 + 0.2*5 + 0.0*1 +
*     |       |
*   4 |3  6  2| 1  4   ->  0.2*3 + 0.2*6 + 0.2*2 +   ->  3.2
*     |       |
*   0 |4  0  3| 4  2       0.0*4 + 0.2*0 + 0.0*3
*      -------
*   9  6  5  0  3  9
* 
* The new value of the pixel (3, 4) is round(3.2) = 3.
*
* If a pixel is outside the image, we increment the central pixel weight by 0.2
* So the new value of pixel (0, 0) is:
*   0.2 * 0 + 0.2 * 9 + 0.2 * 6 + 0.2 * 9 + 0.2 * 9 = 6.6 -> 7
*/
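
The two worked values in the comment above (3.2 -> 3 and 6.6 -> 7) can be checked with a small single-channel sketch. The helper name blur_channel and the array example_grid are hypothetical, and the sketch assumes rows are stored bottom row first, so that the bottom-left corner is pixel (0, 0), which is the only layout consistent with the 6.6 example.

```c
#include <assert.h>

/* Hypothetical helper: blur one channel of a w-by-h grid stored row by row.
 * Each of the five taps (center, left, right, below, above) has weight 0.2;
 * a tap that falls outside the grid is replaced by the center value, which
 * is the same as adding 0.2 to the center weight. */
int blur_channel(const unsigned char *grid, int w, int h, int x, int y)
{
    assert(x >= 0 && x < w && y >= 0 && y < h);
    int center = grid[y * w + x];
    int left   = (x > 0)     ? grid[y * w + x - 1]   : center;
    int right  = (x < w - 1) ? grid[y * w + x + 1]   : center;
    int below  = (y > 0)     ? grid[(y - 1) * w + x] : center;
    int above  = (y < h - 1) ? grid[(y + 1) * w + x] : center;
    int sum = center + left + right + below + above;
    /* sum / 5 rounded to the nearest integer, computed without floats */
    return (2 * sum + 5) / 10;
}

/* The red-channel grid from the comment, stored bottom row first
 * (assumption: row 0 is the bottom row). */
static const unsigned char example_grid[5 * 6] = {
    9, 6, 5, 0, 3, 9,
    0, 4, 0, 3, 4, 2,
    4, 3, 6, 2, 1, 4,
    3, 2, 5, 1, 6, 0,
    1, 2, 5, 2, 0, 3,
};
```

With this layout, blur_channel(example_grid, 6, 5, 2, 2) evaluates the boxed 3x3 window and gives 3, and blur_channel(example_grid, 6, 5, 0, 0) gives 7.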

The problem is that when I run the program with this blur filter on a "chessboard" image:

$ ./filter -f blur -t 8 chess.bmp chessBlur.bmp

I expect to get this image, but I get this (the "broken" rows change randomly from run to run).

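I'm using a mutex to lock and unlock the critical section, but as you can see a data race still occurs. Just a couple of words about my filter: I give each thread one row at a time, starting from the bottom and moving up. My filter_blur code is: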

int filter_blur(struct image *img, int nThread)
{
    int error = 0;
    int mod = img->height%nThread;
    if (mod > 0)
        mod = 1;

    pthread_t threads[nThread];
    pthread_mutex_t mutex;
    args arguments[nThread];

    struct image* img2 = (struct image*)malloc(sizeof(struct image));
    memcpy(img2,img,sizeof(struct image));

    error=pthread_mutex_init( &mutex, NULL);
    if(error!=0)
        err(error,"pthread_mutex_init");

    int i = 0;
    for (i=0; i<nThread; i++) {
        arguments[i].img2 = img2;
        arguments[i].mutex = &mutex;
    }

    int j = 0;
    for (i=0; i<(img->height)/nThread + mod; i++) {
        for (j=0; j<nThread; j++) {

            arguments[j].img = img; arguments[j].line = i*nThread + j;

            error=pthread_create(&threads[j],NULL,threadBlur,(void*)&arguments[j]);
            if(error!=0)
                err(error,"pthread_create");
        }
        for (j=0; j<nThread; j++) {
            error=pthread_join(threads[j],NULL);
            if(error!=0)
                err(error,"pthread_join");
        }
    }
    free(img2);
    return 0;
}

void* threadBlur(void* argument) {

    // unpacking arguments
    args* image = (args*)argument;
    struct image* img = image->img;
    struct image* img2 = image->img2;
    pthread_mutex_t* mutex = image->mutex;

    int error;
    int line = image->line;
    if (line < img->height) {
        int i;

        error=pthread_mutex_lock(mutex);
        if(error!=0)
            fprintf(stderr,"pthread_mutex_lock");

        for (i=0; i<img->width; i++) {
            img->pixels[line * img->width +i] = blur(img2,i,line);
        }

        error=pthread_mutex_unlock(mutex);
        if(error!=0)
            fprintf(stderr,"pthread_mutex_unlock");
    }
    pthread_exit(NULL);
}

struct pixel blur(struct image* img2, int x, int y) {
    double red = 0;
    double green = 0;
    double blue = 0;

    red=(double)img2->pixels[y * img2->width + x].r/5.0;
    green=(double)img2->pixels[y * img2->width + x].g/5.0;
    blue=(double)img2->pixels[y * img2->width + x].b/5.0;

    if (x != 0) {
        red+=(double)img2->pixels[y * img2->width + x - 1].r/5.0;
        green+=(double)img2->pixels[y * img2->width + x - 1].g/5.0;
        blue+=(double)img2->pixels[y * img2->width + x - 1].b/5.0;
    } else {
        red+=(double)img2->pixels[y * img2->width + x].r/5.0;
        green+=(double)img2->pixels[y * img2->width + x].g/5.0;
        blue+=(double)img2->pixels[y * img2->width + x].b/5.0;
    }

    if (x != img2->width - 1) {
        red+=(double)img2->pixels[y * img2->width + x + 1].r/5.0;
        green+=(double)img2->pixels[y * img2->width + x + 1].g/5.0;
        blue+=(double)img2->pixels[y * img2->width + x + 1].b/5.0;
    } else {
        red+=(double)img2->pixels[y * img2->width + x].r/5.0;
        green+=(double)img2->pixels[y * img2->width + x].g/5.0;
        blue+=(double)img2->pixels[y * img2->width + x].b/5.0;
    }

    if (y != 0) {
        red+=(double)img2->pixels[(y - 1) * img2->width + x].r/5.0;
        green+=(double)img2->pixels[(y - 1) * img2->width + x].g/5.0;
        blue+=(double)img2->pixels[(y - 1) * img2->width + x].b/5.0;
    } else {
        red+=(double)img2->pixels[y * img2->width + x].r/5.0;
        green+=(double)img2->pixels[y * img2->width + x].g/5.0;
        blue+=(double)img2->pixels[y * img2->width + x].b/5.0;
    }

    if (y != img2->height - 1) {
        red+=(double)img2->pixels[(y + 1) * img2->width + x].r/5.0;
        green+=(double)img2->pixels[(y + 1) * img2->width + x].g/5.0;
        blue+=(double)img2->pixels[(y + 1) * img2->width + x].b/5.0;
    } else {
        red+=(double)img2->pixels[y * img2->width + x].r/5.0;
        green+=(double)img2->pixels[y * img2->width + x].g/5.0;
        blue+=(double)img2->pixels[y * img2->width + x].b/5.0;
    }

    struct pixel pix = {(unsigned char)round(blue),(unsigned char)round(green),(unsigned char)round(red)};
    return pix;
}

Edit 1:

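As @job correctly guessed, the problem was caused by the memcpy of my struct (the struct was copied, but the pointer inside it still pointed to the original struct's elements). I also removed the mutexes (they were only there because I thought they might solve my problem; sorry, my bad). Now my project works like a charm (even though we could still discuss processing speed and whether threads are needed at all). As I said, this is a university project for my C course. The goal is to parallelize our filters, so threads are required.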

Thanks!
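
The shallow-copy problem described in the edit can be reproduced in isolation. The sketch below assumes struct layouts that the post never shows (struct pixel with b, g, r bytes and struct image with width, height and a pixels pointer), and shallow_copy_aliases is a hypothetical name. It shows that a memcpy of the struct leaves both images sharing one pixel buffer, so a row written through img is immediately visible through the copy that blur reads from.

```c
#include <assert.h>
#include <string.h>

/* Assumed layouts: the question never shows these definitions. */
struct pixel { unsigned char b, g, r; };
struct image { int width; int height; struct pixel *pixels; };

/* Returns 1 if a memcpy'd copy of *img still shares the original pixel
 * buffer, i.e. a write through the original is visible through the copy. */
int shallow_copy_aliases(struct image *img)
{
    assert(img != NULL && img->pixels != NULL);
    struct image copy;
    memcpy(&copy, img, sizeof copy);   /* copies the pointer, not the pixels */

    unsigned char saved = img->pixels[0].r;
    img->pixels[0].r = (unsigned char)(saved + 1);  /* write via the original */
    int aliased = (copy.pixels == img->pixels) &&
                  (copy.pixels[0].r == img->pixels[0].r);
    img->pixels[0].r = saved;          /* restore the original value */
    return aliased;
}
```

For any image with a valid pixel buffer this returns 1. In the blur filter that means a row may be computed from neighbouring rows that were already blurred, depending on which thread ran first, which would explain why the broken rows differ between runs.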

Answer 1

Well, this isn't so much an answer as a few observations about your code:

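You don't seem to actually access any particular memory cell from more than one thread anywhere in the program, so the mutex doesn't appear to be needed.
Or perhaps the threads do access the same memory segments. In that case, your program would probably be more efficient with a single thread doing all the calculations. You should benchmark that case and compare it with the threaded version.
At least to me, there is no obvious reason why multithreading is needed. If you did those floating-point calculations in a single thread, they would most likely finish before the OS even managed to spawn a second thread. The workload is insignificant compared with the thread-creation overhead.
Your current multithreaded design is flawed in that all the work is done inside mutex-protected code. No actual work can be done outside the mutex, so even if you create 1000 threads, only one can execute at a time while the others wait their turn.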
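
To illustrate the last two observations, here is a minimal sketch of a design that needs no mutex: every thread reads from a shared read-only source buffer and writes only its own contiguous band of rows in a separate destination buffer. Threads are also created once per call rather than once per row, which reduces creation overhead. All names (band_job, blur_at, blur_band, blur_parallel) and the struct pixel layout are assumptions, not the asker's code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Assumed layout; the question does not show its pixel struct. */
struct pixel { unsigned char b, g, r; };

/* Hypothetical per-thread job: blur rows [row_begin, row_end). */
struct band_job {
    const struct pixel *src;  /* read-only snapshot shared by all threads */
    struct pixel *dst;        /* output; each thread writes only its own rows */
    int width, height;
    int row_begin, row_end;
};

/* Five-tap average; out-of-image taps are replaced by the center pixel. */
static struct pixel blur_at(const struct pixel *src, int w, int h, int x, int y)
{
    const struct pixel *c = &src[y * w + x];
    const struct pixel *tap[5] = {
        c,
        x > 0     ? c - 1 : c,
        x < w - 1 ? c + 1 : c,
        y > 0     ? c - w : c,
        y < h - 1 ? c + w : c,
    };
    int b = 0, g = 0, r = 0;
    for (int i = 0; i < 5; i++) { b += tap[i]->b; g += tap[i]->g; r += tap[i]->r; }
    /* (2 * sum + 5) / 10 is sum / 5 rounded to the nearest integer */
    struct pixel out = { (unsigned char)((2 * b + 5) / 10),
                         (unsigned char)((2 * g + 5) / 10),
                         (unsigned char)((2 * r + 5) / 10) };
    return out;
}

static void *blur_band(void *arg)
{
    const struct band_job *job = arg;
    for (int y = job->row_begin; y < job->row_end; y++)
        for (int x = 0; x < job->width; x++)
            job->dst[y * job->width + x] =
                blur_at(job->src, job->width, job->height, x, y);
    return NULL;
}

/* Blurs src into dst with n_threads threads, one contiguous band each.
 * Returns 0 on success, nonzero on failure. */
int blur_parallel(const struct pixel *src, struct pixel *dst,
                  int width, int height, int n_threads)
{
    assert(n_threads > 0 && src != dst);
    pthread_t *threads = malloc(sizeof *threads * n_threads);
    struct band_job *jobs = malloc(sizeof *jobs * n_threads);
    if (!threads || !jobs) { free(threads); free(jobs); return -1; }

    int started = 0, err = 0;
    for (int t = 0; t < n_threads; t++) {
        jobs[t] = (struct band_job){ src, dst, width, height,
                                     (int)((long)height * t / n_threads),
                                     (int)((long)height * (t + 1) / n_threads) };
        err = pthread_create(&threads[t], NULL, blur_band, &jobs[t]);
        if (err != 0) break;
        started++;
    }
    for (int t = 0; t < started; t++) {   /* join only what was started */
        int jerr = pthread_join(threads[t], NULL);
        if (err == 0) err = jerr;
    }
    free(threads);
    free(jobs);
    return err;
}
```

Because no two threads write the same element and src is never modified, the result should not depend on the thread count, which also makes it straightforward to benchmark n_threads = 1 against larger values.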

Answer 2

First of all, thank you very much for your help! Thanks to your answers, I managed to fix my code :-)

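Since a fair number of comments pointed out that my mutexes were useless, I also came to see them as more of a bottleneck for my program's performance than a solution to my problem. I had only added them because I hoped they would magically solve my problem (miracles do sometimes happen in programming). Now they are gone (they should never have been there), and the code is faster!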

Back to the original problem! To apply my blur filter, I need a read-only copy of my image. To get this copy, I used memcpy like this:

struct image* img2 = (struct image*)malloc(sizeof(struct image));
memcpy(img2,img,sizeof(struct image));

But as @jop pointed out, even though I was copying img, the pixels pointer inside the newly allocated img2 still pointed to the original array. So instead of copying img, I now copy img->pixels: struct pixel* pixels = (struct pixel*)malloc(sizeof(struct pixel)*img->width*img->height); memcpy(pixels,img->pixels,sizeof(struct pixel)*img->width*img->height);. I modified my code:

{{1}}
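
As an illustration of the fix described above, a deep copy could look roughly like the following. This is a minimal sketch and not the author's actual code: the struct layouts are assumed, and image_deep_copy and image_free are hypothetical helper names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layouts: the post never shows these definitions. */
struct pixel { unsigned char b, g, r; };
struct image { int width; int height; struct pixel *pixels; };

/* Hypothetical helper: returns a heap-allocated copy of img that owns its
 * own pixel buffer, or NULL if an allocation fails. */
struct image *image_deep_copy(const struct image *img)
{
    assert(img != NULL && img->pixels != NULL);
    assert(img->width > 0 && img->height > 0);
    size_t count = (size_t)img->width * (size_t)img->height;

    struct image *copy = malloc(sizeof *copy);
    if (copy == NULL)
        return NULL;
    *copy = *img;                       /* width and height; pointer fixed below */
    copy->pixels = malloc(count * sizeof *copy->pixels);
    if (copy->pixels == NULL) {
        free(copy);
        return NULL;
    }
    memcpy(copy->pixels, img->pixels, count * sizeof *copy->pixels);
    return copy;
}

/* Releases a copy made by image_deep_copy, including its pixel buffer. */
void image_free(struct image *img)
{
    if (img != NULL) {
        free(img->pixels);
        free(img);
    }
}
```

With a copy like this, blur can read from the copy while the threads write into img. Note that the copy's pixel buffer has to be freed as well, which a plain free(img2) would not do.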

Voilà, problem solved! Thank you all!

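Some comments also discussed whether threads are needed at all. Well, in this case I had no choice, since the goal of the project was to write parallelized image filters. So yes, threads are required!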

Data race when using a mutex
