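Incrementing a global counter variable from within threads without having to wait for each individual thread

Time: 2018-02-22 17:05:16

Tags: c multithreading pthreads mutex pthread-join

My goal is to write a program that evaluates the performance gain from increasing the number of threads the program can use. I measure performance by computing pi with the Monte Carlo method. Each thread should create one random coordinate (x, y) and check whether it lies inside the circle. If it does, the inCircle counter should be incremented. Pi is then computed as 4 * inCircle / trys. With pthread_join, there is no performance gain on a problem that should benefit from multiple threads. Is there a way to let multiple threads increment one counter without having to wait for each individual thread?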

#include <stdio.h>
#include <string.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <time.h>
#include <stdbool.h>
#include <pthread.h>

#define nPoints 10000000
#define NUM_THREADS 16

int inCircle = 0;
int count = 0;
double x,y;
pthread_mutex_t mutex;

bool isInCircle(double x, double y){
    if(x*x+y*y<=1){
        return true;
    }
    else{
        return false;
    }
}

void *piSlave(){
    int myCount = 0;
    time_t now;
    time(&now);
    srand((unsigned int)now);
    for(int i = 1; i <= nPoints/NUM_THREADS; i++) {
        x = (double)rand() / (double)RAND_MAX;
        y = (double)rand() / (double)RAND_MAX;
        if(isInCircle(x,y)){
            myCount++;
        }
     }
    pthread_mutex_lock(&mutex);
    inCircle += myCount;
    pthread_mutex_unlock(&mutex);
    pthread_exit(0);
}
double piMaster()
{
    pthread_t threads[NUM_THREADS];
    int rc;
    long t;

    for(t=0; t<NUM_THREADS; t++){
        printf("Creating thread %ld\n", t);
        rc = pthread_create(&threads[t], NULL, piSlave, (void *)t);
        if (rc){
            printf("ERROR; return code from pthread_create() is %d\n", rc);
            exit(-1);
        }
    //pthread_join(threads[t], NULL);

    }
    //wait(NULL);
    return 4.0*inCircle/nPoints;
}

int main()
{
    printf("%f\n",piMaster());
    return(0);
}

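1 Answer:

Answer 0 (score: 0)

There are a few problems with the code.

Waiting for the threads to terminate

The piMaster() function should wait for the threads it creates; otherwise it reads inCircle before the threads have finished. We can do this by calling pthread_join() in a separate loop after all threads have been created (joining inside the creation loop would run the threads one at a time):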

for (t = 0; t < NUM_THREADS; t++)
    pthread_join(threads[t], NULL);
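
As a minimal sketch of the create-then-join pattern, and assuming nothing beyond POSIX threads: each thread can also hand its partial result back through the second argument of pthread_join(), so the main thread does the summing and no shared counter is involved. The names worker and sum_partial_counts are hypothetical and not part of the code in the question.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical worker: its "partial count" is simply its index + 1,
   handed back through the thread's return value. */
static void *worker(void *arg)
{
    intptr_t tid = (intptr_t)arg;
    return (void *)(tid + 1);
}

/* Starts n_threads workers, then joins them and sums their results.
   Returns -1 if memory or a thread could not be obtained. */
long sum_partial_counts(int n_threads)
{
    pthread_t *threads = malloc((size_t)n_threads * sizeof *threads);
    long total = 0;
    int started = 0;

    if (threads == NULL)
        return -1;

    /* First loop: start everything so the threads run concurrently. */
    for (int t = 0; t < n_threads; t++) {
        if (pthread_create(&threads[t], NULL, worker, (void *)(intptr_t)t) != 0)
            break;
        started++;
    }

    /* Second loop: wait for each started thread and collect its result. */
    for (int t = 0; t < started; t++) {
        void *result;
        pthread_join(threads[t], &result);
        total += (long)(intptr_t)result;
    }

    free(threads);
    return started == n_threads ? total : -1;
}
```

Calling sum_partial_counts(4) from main() adds up the four partial results 1, 2, 3 and 4. In the pi program the worker would return its myCount instead.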

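Avoiding locks

We can simply increment the inCircle counter atomically at the end of the loop, so no locking is needed. The variable must be declared with the _Atomic keyword, as described in the Atomic operations C reference:
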
_Atomic long inCircle = 0;
void *piSlave(void *arg)
{
    [...]
    inCircle += myCount;
    [...]
}

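This generates the proper CPU instructions to increment the variable atomically. For example, on the x86 architecture we can confirm the lock prefix in the disassembly: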

29      inCircle += myCount;
   0x0000000100000bdb <+155>:   lock add %rbx,0x46d(%rip)        # 0x100001050 <inCircle>
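
The same operation can be spelled out with the C11 <stdatomic.h> functions. The following is only a sketch under assumed names (hits, bump, count_atomically and MAX_THREADS are hypothetical). It deliberately performs one atomic add per increment, which is slower than the single add per thread used above, to show that no update is lost even under contention.

```c
#include <pthread.h>
#include <stdatomic.h>

#define MAX_THREADS 64          /* hypothetical upper bound for this sketch */

static atomic_long hits;        /* shared counter, updated without a mutex */

struct job {
    long increments;            /* how many times each thread adds 1 */
};

static void *bump(void *arg)
{
    const struct job *job = arg;

    for (long i = 0; i < job->increments; i++)
        atomic_fetch_add(&hits, 1);     /* atomic read-modify-write */
    return NULL;
}

/* Runs n_threads threads that each add per_thread to the shared counter
   and returns the final value of the counter. */
long count_atomically(int n_threads, long per_thread)
{
    pthread_t threads[MAX_THREADS];
    struct job job = { per_thread };
    int started = 0;

    if (n_threads > MAX_THREADS)
        n_threads = MAX_THREADS;

    atomic_store(&hits, 0);
    for (int t = 0; t < n_threads; t++) {
        if (pthread_create(&threads[t], NULL, bump, &job) != 0)
            break;
        started++;
    }
    for (int t = 0; t < started; t++)
        pthread_join(threads[t], NULL);

    return atomic_load(&hits);
}
```

If all threads start, count_atomically(8, 10000) returns 80000. With a plain long and an ordinary hits++ the result would typically come out lower because of lost updates.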

Avoiding the slow and thread-unsafe rand()

Instead, we can simply scan the whole circle in a loop, as described on the Approximations of Pi Wikipedia page:

for (long x = -RADIUS; x <= RADIUS; x++)
    for (long y = -RADIUS; y <= RADIUS; y++)
        myCount += isInCircle(x, y);
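
If random sampling is to be kept instead of the grid scan, one option is to give every thread its own generator state so that nothing is shared while sampling. The sketch below swaps in a small xorshift64 generator for rand(); that generator, the seeding constant and all names (mc_task, next_unit, mc_worker, monte_carlo_pi, MC_MAX_THREADS) are assumptions made for illustration and do not come from the original code.

```c
#include <pthread.h>
#include <stdint.h>

#define MC_MAX_THREADS 64       /* hypothetical upper bound for this sketch */

struct mc_task {
    uint64_t seed;              /* per-thread seed, must be nonzero */
    long points;                /* number of samples to draw */
    long hits;                  /* samples that fell inside the circle */
};

/* xorshift64 step; returns a double in [0, 1). */
static double next_unit(uint64_t *state)
{
    uint64_t x = *state;

    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return (double)(x >> 11) * (1.0 / 9007199254740992.0);  /* divide by 2^53 */
}

static void *mc_worker(void *arg)
{
    struct mc_task *task = arg;
    uint64_t state = task->seed;    /* local copy: no sharing inside the loop */
    long hits = 0;

    for (long i = 0; i < task->points; i++) {
        double x = next_unit(&state);
        double y = next_unit(&state);
        if (x * x + y * y <= 1.0)
            hits++;
    }
    task->hits = hits;              /* written once, read after the join */
    return NULL;
}

/* Estimates pi from n_threads * points_per_thread random samples. */
double monte_carlo_pi(int n_threads, long points_per_thread)
{
    pthread_t threads[MC_MAX_THREADS];
    struct mc_task tasks[MC_MAX_THREADS];
    long hits = 0, points = 0;
    int started = 0;

    if (n_threads < 1)
        n_threads = 1;
    if (n_threads > MC_MAX_THREADS)
        n_threads = MC_MAX_THREADS;

    for (int t = 0; t < n_threads; t++) {
        /* Odd constant times a nonzero index gives distinct nonzero seeds. */
        tasks[t].seed = 0x9E3779B97F4A7C15ULL * (uint64_t)(t + 1);
        tasks[t].points = points_per_thread;
        tasks[t].hits = 0;
        if (pthread_create(&threads[t], NULL, mc_worker, &tasks[t]) != 0)
            break;
        started++;
    }
    for (int t = 0; t < started; t++) {
        pthread_join(threads[t], NULL);
        hits += tasks[t].hits;
        points += tasks[t].points;
    }
    return points > 0 ? 4.0 * (double)hits / (double)points : 0.0;
}
```

A call such as monte_carlo_pi(4, 250000) draws one million samples in total. Because the seeds are fixed, repeated runs give the same estimate, which makes timing comparisons between thread counts easier to reproduce.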

So here is the code with the changes above:

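#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RADIUS 10000L
#define NUM_THREADS 10

/* Shared counter; _Atomic makes += a single atomic read-modify-write. */
_Atomic long inCircle = 0;

/* static inline avoids the "undefined reference" link error that a plain
   C99 inline definition can cause when the call is not inlined. */
static inline long isInCircle(long x, long y)
{
    return x * x + y * y <= RADIUS * RADIUS ? 1 : 0;
}

void *piSlave(void *arg)
{
    long myCount = 0;
    long tid = (long)arg;

    /* Each thread scans every NUM_THREADS-th column, starting at its own offset. */
    for (long x = -RADIUS + tid; x <= RADIUS; x += NUM_THREADS)
        for (long y = -RADIUS; y <= RADIUS; y++)
            myCount += isInCircle(x, y);

    printf("\tthread %ld count: %ld\n", tid, myCount);
    inCircle += myCount;    /* one atomic add per thread */

    return NULL;
}

double piMaster(void)
{
    pthread_t threads[NUM_THREADS];
    long t;
    int rc;

    for (t = 0; t < NUM_THREADS; t++) {
        printf("Creating thread %ld...\n", t);
        /* pthread_create() returns the error number instead of setting errno. */
        rc = pthread_create(&threads[t], NULL, piSlave, (void *)t);
        if (rc) {
            fprintf(stderr, "Error creating pthread: %s\n", strerror(rc));
            exit(EXIT_FAILURE);
        }
    }
    /* Join only after all threads have been started so they run in parallel. */
    for (t = 0; t < NUM_THREADS; t++)
        pthread_join(threads[t], NULL);

    /* The number of grid points inside the circle is about pi * RADIUS^2. */
    return (double)inCircle / (RADIUS * RADIUS);
}

int main(void)
{
    printf("Result: %f\n", piMaster());
    return 0;
}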

Here is the output:

Creating thread 0...
Creating thread 1...
Creating thread 2...
Creating thread 3...
Creating thread 4...
Creating thread 5...
Creating thread 6...
Creating thread 7...
Creating thread 8...
Creating thread 9...
    thread 7 count: 31415974
    thread 5 count: 31416052
    thread 1 count: 31415808
    thread 3 count: 31415974
    thread 0 count: 31415549
    thread 4 count: 31416048
    thread 2 count: 31415896
    thread 9 count: 31415808
    thread 8 count: 31415896
    thread 6 count: 31416048
Result: 3.141591