Question

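I'm new to C and I'm writing a simple program that searches for a word in several files in parallel. However, whenever I pass more than one file the output varies from run to run, which suggests my code has a race condition that I haven't fixed. Can you help me fix it?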

The following snippet is in main and creates the pthreads.

    int i = 0;
    char *word = "Pluto"; //Word to be found

    Message messages[argc-1];
    pthread_t threads[argc-1];
    for(i; i < argc - 1; i++){
        messages[i].file = argv[i + 1];
        messages[i].word = word;
        messages[i].fp   = fopen(argv[i + 1], "r");
        int  iret = pthread_create( &threads[i], NULL, threadFindWord, (void*) &(messages[i]));
    }
    for(i = 0; i < argc - 1; i++){
        pthread_join(threads[i],NULL);
    }

The function that each thread calls:

    void *threadFindWord(void *ptr)
    {
        Message *msg;
        msg = (Message *) ptr;

        int numFound = ffindWord(msg->fp, msg->word);

        printf("File %s has %i occurrences of the word %s\n", msg->file, numFound, msg->word);

        fclose(msg->fp);
        pthread_exit(NULL);
    }

Below is the code that searches for the word in a file:

    int findWord(char * file, char * word){
        char * current = strtok(file, " ,.\n");
        int sum = 0;
        while (current != NULL){
            //printf("%s\n", current);
            if(strcmp(current, word) == 0)
                sum+=1;
            current = strtok(NULL, " ,.\n");
        }
        return sum;
    }

    int ffindWord(FILE *fp, char *word){

        fseek(fp, 0, SEEK_END);
        long pos = ftell(fp);
        fseek(fp, 0, SEEK_SET);
        char *bytes = malloc(pos);
        fread(bytes, pos, 1, fp);
        bytes[pos-1] = '\0';

        int sum = findWord(bytes, word);

        free(bytes);
        return sum;
    }

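To clarify, the problem is that I get different results on consecutive runs of the program. One invocation of $ programname file1 file2 prints different results than the same invocation run right afterwards. Note, however, that the program does work when only one file is passed.

Any help is appreciated.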

Answer 1

This causes undefined behavior, because it runs past the end of the messages and threads arrays:

    Message messages[argc-1];
    pthread_t threads[argc-1];
    for(i; i < argc; i++){

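and is likely the cause of the problem. With only one thread running, it may work purely by chance.

Try changing it to this (or something similar):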

    int i;
    Message messages[argc-1];
    pthread_t threads[argc-1];
    for(i = 1; i < argc; i++)
    {
        messages[i - 1].file = argv[i];
        messages[i - 1].word = word;
        messages[i - 1].fp   = fopen(argv[i], "r");
        int iret = pthread_create(&threads[i - 1],
                                   NULL,
                                   threadFindWord,
                                   (void*)&(messages[i - 1]));
    }

    for(i = 0; i < argc - 1; i++)
    {
        pthread_join(threads[i],NULL);
    }

Answer 2

strtok keeps a global internal pointer, so calls from different threads interfere with each other... use strtok_r instead.
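To illustrate, here is a minimal sketch of the question's findWord rewritten on top of strtok_r. The name findWord_r is hypothetical (it does not appear in the question), and the sketch assumes a POSIX system where strtok_r is available. The tokenizer state is kept in a local variable, so each thread works with its own copy.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strtok_r on strict compilers */
#include <string.h>

/* Hypothetical re-entrant variant of the question's findWord.
 * Like the original it modifies `text` in place, but the tokenizer
 * state lives in the local `saveptr` instead of inside strtok, so
 * concurrent calls on different buffers do not disturb each other. */
int findWord_r(char *text, const char *word)
{
    static const char delims[] = " ,.\n";  /* same delimiters as the question */
    char *saveptr = NULL;
    int sum = 0;

    for (char *current = strtok_r(text, delims, &saveptr);
         current != NULL;
         current = strtok_r(NULL, delims, &saveptr)) {
        if (strcmp(current, word) == 0)
            sum += 1;
    }
    return sum;
}
```

In ffindWord the call would then become findWord_r(bytes, word); nothing else in the thread function needs to change for this particular fix.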

Answer 3

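To keep the output of the threads from being mixed together in random order, you need to buffer each thread's output and then display it all at once.

For your case, the simplest way is to add a char *thread_output field (and a thread_output_size field) to the Message structure, and then do something like this in your main thread: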

    for(i = 0; i < argc - 1; i++)
    {
        pthread_join(threads[i],NULL);
        printf("%s", messages[i].thread_output);
    }
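The question never shows the Message definition, so the following is only a sketch of what the extended structure might look like. The types of file, word and fp are inferred from how the question uses them. The two helper functions are hypothetical additions, not part of the original answer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed layout: the real Message definition is not shown in the question. */
typedef struct {
    char  *file;               /* file name, taken from argv */
    char  *word;               /* word to search for */
    FILE  *fp;                 /* stream opened by main */
    char  *thread_output;      /* text buffered by the worker, or NULL */
    size_t thread_output_size; /* characters stored, excluding the NUL */
} Message;

/* Hypothetical helper: start with an empty buffer so that a later
 * realloc() on thread_output behaves like malloc(). */
void message_init_output(Message *msg)
{
    msg->thread_output = NULL;
    msg->thread_output_size = 0;
}

/* Hypothetical helper: print whatever the worker buffered, then release
 * it. Meant to be called by main after pthread_join() for that worker. */
void message_flush_output(Message *msg)
{
    if (msg->thread_output != NULL)
        fputs(msg->thread_output, stdout);
    free(msg->thread_output);
    message_init_output(msg);
}
```

The NULL check matters because passing a null pointer to printf("%s", ...) is undefined behavior, which would happen for any thread that never produced output.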

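You may also want to implement a function that makes sure the thread_output buffer is large enough and then uses vsnprintf() to append the new text to it, so that you can call it just like you would call printf().

For example: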

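    #include <stdarg.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Assumes messages[] is visible here (e.g. at file scope) and that
       thread_number starts at 1. */
    void add_thread_output(int thread_number, const char *template, ...) {
        va_list ap;
        int old_length;
        int length;
        char *temp;

        /* First pass: measure how many characters the new text needs. */
        va_start(ap, template);
        length = vsnprintf(NULL, 0, template, ap);
        va_end(ap);

        old_length = messages[thread_number - 1].thread_output_size;
        temp = realloc(messages[thread_number - 1].thread_output, old_length + length + 1);
        if(temp == NULL) {
            /* Set a flag or something */
        } else {
            /* Second pass: write the new text after the existing text. */
            va_start(ap, template);
            vsnprintf(&temp[old_length], length + 1, template, ap);
            va_end(ap);
            messages[thread_number - 1].thread_output_size += length;
            messages[thread_number - 1].thread_output = temp;
        }
    }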

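Note: any example code above is for illustration only. It has not been tested, is not guaranteed to compile or work, and is not necessarily the most efficient approach. For example, allocating more space than you need (to avoid calling realloc() every time something is added to a thread's output buffer) may be a good idea.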
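As one possible way to follow the over-allocation suggestion, the sketch below tracks a capacity next to the length and doubles it when the buffer runs out, so that most appends do not call realloc(). The OutputBuffer type, the output_buffer_appendf name and the starting capacity of 64 bytes are all assumptions made for this illustration; none of them come from the answer.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical growable text buffer, one per thread. */
typedef struct {
    char  *data;      /* NUL-terminated text, or NULL while empty */
    size_t length;    /* characters stored, excluding the NUL */
    size_t capacity;  /* bytes currently allocated for data */
} OutputBuffer;

/* Appends printf-style formatted text to buf.
 * Returns 0 on success, -1 on failure (buf is left unchanged). */
int output_buffer_appendf(OutputBuffer *buf, const char *format, ...)
{
    va_list ap;
    int needed;

    /* First pass: only measure the formatted length. */
    va_start(ap, format);
    needed = vsnprintf(NULL, 0, format, ap);
    va_end(ap);
    if (needed < 0)
        return -1;

    size_t required = buf->length + (size_t)needed + 1;
    if (required > buf->capacity) {
        /* Grow geometrically; 64 is an arbitrary starting size. */
        size_t new_capacity = buf->capacity ? buf->capacity : 64;
        while (new_capacity < required)
            new_capacity *= 2;
        char *temp = realloc(buf->data, new_capacity);
        if (temp == NULL)
            return -1;
        buf->data = temp;
        buf->capacity = new_capacity;
    }

    /* Second pass: write the text after what is already stored. */
    va_start(ap, format);
    vsnprintf(buf->data + buf->length, (size_t)needed + 1, format, ap);
    va_end(ap);
    buf->length += (size_t)needed;
    return 0;
}
```

A buffer initialized with OutputBuffer b = {0}; is ready for use. Since each thread would append only to its own buffer and main would read it only after pthread_join(), no locking should be needed around these calls.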

Race condition using pthreads in C