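My goal is to create a program that evaluates the performance gain from increasing the number of threads the program can use. I evaluate performance by computing pi with the Monte Carlo method. Each thread should create 1 random coordinate (x, y) and check whether that coordinate lies inside the circle. If it does, the inCircle counter should be incremented. Pi is then computed as 4 * inCircle / trys. With pthread_join there is no performance gain, even though this is a problem that should benefit from multiple threads. Is there a way to let multiple threads increment one counter without having to wait for each individual thread?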
#include <stdio.h>
#include <string.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <time.h>
#include <stdbool.h>
#include <pthread.h>

#define nPoints 10000000
#define NUM_THREADS 16

int inCircle = 0;
int count = 0;
double x,y;
pthread_mutex_t mutex;

bool isInCircle(double x, double y){
    if(x*x+y*y<=1){
        return true;
    }
    else{
        return false;
    }
}

void *piSlave(){
    int myCount = 0;
    time_t now;
    time(&now);
    srand((unsigned int)now);
    for(int i = 1; i <= nPoints/NUM_THREADS; i++) {
        x = (double)rand() / (double)RAND_MAX;
        y = (double)rand() / (double)RAND_MAX;
        if(isInCircle(x,y)){
            myCount++;
        }
    }
    pthread_mutex_lock(&mutex);
    inCircle += myCount;
    pthread_mutex_unlock(&mutex);
    pthread_exit(0);
}

double piMaster()
{
    pthread_t threads[NUM_THREADS];
    int rc;
    long t;
    for(t=0; t<NUM_THREADS; t++){
        printf("Creating thread %ld\n", t);
        rc = pthread_create(&threads[t], NULL, piSlave, (void *)t);
        if (rc){
            printf("ERROR; return code from pthread_create() is %d\n", rc);
            exit(-1);
        }
        //pthread_join(threads[t], NULL);
    }
    //wait(NULL);
    return 4.0*inCircle/nPoints;
}

int main()
{
    printf("%f\n",piMaster());
    return(0);
}
Answer 0 (score: 0)
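There are a few problems with the code.
The piMaster() function should wait for the threads it creates. We can do that simply by calling pthread_join() in a separate loop, after all threads have been created:
for (t = 0; t < NUM_THREADS; t++)
    pthread_join(threads[t], NULL);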
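Why the join has to sit in its own loop can be sketched as follows: joining inside the creation loop would make each thread finish before the next one starts, whereas starting all threads first lets them run concurrently. This is only an illustration under stated assumptions, not the code from the question: struct job, sum_range, sum_in_parallel and MAX_THREADS are hypothetical names, and the work is a plain integer sum. Each thread writes into its own result slot, so nothing is shared until the threads have been joined.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64  /* assumed upper bound for this sketch */

/* Hypothetical per-thread work description and result slot. */
struct job {
    long begin;   /* first value to add (inclusive) */
    long end;     /* end of the range (exclusive) */
    long result;  /* written only by the owning thread */
};

/* Worker: sums the integers in [begin, end) into its own slot. */
static void *sum_range(void *arg)
{
    struct job *job = arg;
    long sum = 0;
    for (long i = job->begin; i < job->end; i++)
        sum += i;
    job->result = sum;
    return NULL;
}

/* Sums 0..n-1 using nthreads threads; returns -1 on error. */
long sum_in_parallel(long n, int nthreads)
{
    pthread_t threads[MAX_THREADS];
    struct job jobs[MAX_THREADS];
    int started = 0;
    long total = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS || n < 0)
        return -1;

    /* First loop: start every thread, so they all run concurrently. */
    for (int t = 0; t < nthreads; t++) {
        jobs[t].begin = n * t / nthreads;
        jobs[t].end = n * (t + 1) / nthreads;
        jobs[t].result = 0;
        if (pthread_create(&threads[t], NULL, sum_range, &jobs[t]) != 0)
            break;
        started++;
    }

    /* Second loop: only now wait for them, then combine the results. */
    for (int t = 0; t < started; t++) {
        pthread_join(threads[t], NULL);
        total += jobs[t].result;
    }

    return started == nthreads ? total : -1;
}
```

Calling sum_in_parallel(1000, 4) from main() splits the range into four quarters and adds the partial sums after the joins. Because the main thread reads each slot only after joining its thread, this variant needs neither a mutex nor an atomic counter.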
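We can simply increment the inCircle counter atomically once the loop has finished, so no locking is needed. The variable must be declared with the _Atomic keyword, as described in the Atomic operations C reference:
_Atomic long inCircle = 0;
void *piSlave(void *arg)
{
    [...]
    inCircle += myCount;
    [...]
}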
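This generates the proper CPU instructions to increment the variable atomically. For example, on the x86 architecture we can confirm the lock prefix in the disassembly: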
29 inCircle += myCount;
0x0000000100000bdb <+155>: lock add %rbx,0x46d(%rip) # 0x100001050 <inCircle>
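The same idea can also be written with the explicit C11 <stdatomic.h> interface, where atomic_fetch_add() performs the read-modify-write that += does on an _Atomic variable. The sketch below is a minimal illustration under assumptions: hits, add_hits, count_with_atomics and SKETCH_THREADS are names made up here, and the local counting loop is only a stand-in for real work. As in the answer, each thread touches the shared counter exactly once.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_THREADS 8   /* assumed thread count for this sketch */

/* Shared counter; atomic_long is the C11 typedef for _Atomic long. */
static atomic_long hits;

/* Worker: counts locally, then publishes with one atomic addition. */
static void *add_hits(void *arg)
{
    long per_thread = *(const long *)arg;
    long local = 0;
    for (long i = 0; i < per_thread; i++)
        local++;                       /* stand-in for the real test */
    atomic_fetch_add(&hits, local);    /* same effect as hits += local */
    return NULL;
}

/* Runs the workers and returns the combined count, or -1 on error. */
long count_with_atomics(long per_thread)
{
    pthread_t threads[SKETCH_THREADS];
    int started = 0;

    atomic_store(&hits, 0);
    for (int t = 0; t < SKETCH_THREADS; t++) {
        if (pthread_create(&threads[t], NULL, add_hits, &per_thread) != 0)
            break;
        started++;
    }
    for (int t = 0; t < started; t++)
        pthread_join(threads[t], NULL);

    return started == SKETCH_THREADS ? atomic_load(&hits) : -1;
}
```

With one atomic addition per thread the cost of the atomic instruction is negligible. Incrementing the shared counter inside the inner loop would also be correct, but every increment would then contend for the same cache line.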
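rand() is not required to be thread-safe, and in the question every thread also seeds it with the same time value. Instead, we can scan the whole circle in a loop, as described on the Approximations of Pi Wikipedia page:
for (long x = -RADIUS; x <= RADIUS; x++)
    for (long y = -RADIUS; y <= RADIUS; y++)
        myCount += isInCircle(x, y);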
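If the Monte Carlo approach from the question is to be kept, one possible way around rand() is to give every thread its own generator state. The sketch below is an assumption of mine, not part of the answer: it swaps in a small xorshift64 generator written inline, and struct mc_job, next_unit, mc_worker, estimate_pi and MC_THREADS are hypothetical names. It also keeps x and y as locals, whereas the question declares them at file scope, where all threads share them.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MC_THREADS 4   /* assumed thread count for this sketch */

/* Per-thread input and output; no field is shared between threads. */
struct mc_job {
    uint64_t seed;    /* initial generator state, must be nonzero */
    long points;      /* number of random points to draw */
    long hits;        /* points that fell inside the quarter circle */
};

/* One xorshift64 step; returns a double in [0, 1). */
static double next_unit(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return (double)(x >> 11) * (1.0 / 9007199254740992.0); /* divide by 2^53 */
}

static void *mc_worker(void *arg)
{
    struct mc_job *job = arg;
    uint64_t state = job->seed;   /* local copy: the hot loop stays off shared memory */
    long hits = 0;
    for (long i = 0; i < job->points; i++) {
        double x = next_unit(&state);   /* locals, not globals */
        double y = next_unit(&state);
        if (x * x + y * y <= 1.0)
            hits++;
    }
    job->hits = hits;
    return NULL;
}

/* Estimates pi from MC_THREADS * points_per_thread samples; -1.0 on error. */
double estimate_pi(long points_per_thread)
{
    pthread_t threads[MC_THREADS];
    struct mc_job jobs[MC_THREADS];
    int started = 0;
    long hits = 0;

    if (points_per_thread < 1)
        return -1.0;

    for (int t = 0; t < MC_THREADS; t++) {
        /* Distinct nonzero seed per thread (arbitrary odd constant). */
        jobs[t].seed = 0x9E3779B97F4A7C15ULL * (uint64_t)(t + 1);
        jobs[t].points = points_per_thread;
        jobs[t].hits = 0;
        if (pthread_create(&threads[t], NULL, mc_worker, &jobs[t]) != 0)
            break;
        started++;
    }
    for (int t = 0; t < started; t++) {
        pthread_join(threads[t], NULL);
        hits += jobs[t].hits;
    }
    if (started != MC_THREADS)
        return -1.0;
    return 4.0 * (double)hits / ((double)MC_THREADS * (double)points_per_thread);
}
```

The seeds are fixed, so repeated runs give the same estimate; a real program would probably mix in a time or entropy source per thread. The generator is meant for illustration only and makes no claim about statistical quality beyond this kind of rough estimate.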
So here is the code with the changes above applied:
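#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RADIUS 10000L
#define NUM_THREADS 10

/* Shared total; _Atomic makes each += an atomic read-modify-write. */
_Atomic long inCircle = 0;

/* Returns 1 if the lattice point (x, y) lies inside or on the circle. */
static inline long isInCircle(long x, long y)
{
    return x * x + y * y <= RADIUS * RADIUS ? 1 : 0;
}

void *piSlave(void *arg)
{
    long myCount = 0;
    long tid = (long)arg;

    /* Thread tid handles every NUM_THREADS-th column, starting at -RADIUS + tid. */
    for (long x = -RADIUS + tid; x <= RADIUS; x += NUM_THREADS)
        for (long y = -RADIUS; y <= RADIUS; y++)
            myCount += isInCircle(x, y);

    printf("\tthread %ld count: %ld\n", tid, myCount);
    inCircle += myCount;    /* one atomic addition per thread */
    pthread_exit(NULL);
}

double piMaster(void)
{
    pthread_t threads[NUM_THREADS];
    long t;

    for (t = 0; t < NUM_THREADS; t++) {
        printf("Creating thread %ld...\n", t);
        /* pthread_create() returns the error number; it does not set errno. */
        int rc = pthread_create(&threads[t], NULL, piSlave, (void *)t);
        if (rc) {
            fprintf(stderr, "Error creating pthread: %s\n", strerror(rc));
            exit(EXIT_FAILURE);
        }
    }

    /* Wait only after every thread has been started. */
    for (t = 0; t < NUM_THREADS; t++)
        pthread_join(threads[t], NULL);

    return (double)inCircle / (RADIUS * RADIUS);
}

int main(void)
{
    printf("Result: %f\n", piMaster());
    return 0;
}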
Here is the output:
Creating thread 0...
Creating thread 1...
Creating thread 2...
Creating thread 3...
Creating thread 4...
Creating thread 5...
Creating thread 6...
Creating thread 7...
Creating thread 8...
Creating thread 9...
thread 7 count: 31415974
thread 5 count: 31416052
thread 1 count: 31415808
thread 3 count: 31415974
thread 0 count: 31415549
thread 4 count: 31416048
thread 2 count: 31415896
thread 9 count: 31415808
thread 8 count: 31415896
thread 6 count: 31416048
Result: 3.141591