Question

I found that OpenMP doesn't support while loops (or at least doesn't like them much). It also doesn't like the '!=' operator.

I have this piece of code.

int count = 1;
#pragma omp parallel for
    while ( fgets(buff, BUFF_SIZE, f) != NULL )
    {
        len = strlen(buff);
        int sequence_counter = segment_read(buff,len,count);
        if (sequence_counter == 1)
        {
            count_of_reads++;
            printf("\n Total No. of reads: %d \n",count_of_reads);
        }
        count++;
    }

Any clues on how to manage this? I read somewhere (including another post on stackoverflow) that I can use a pipeline. What is that, and how do I implement it?

Answer 1

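Too bad people pick the best answer so quickly. Here is my answer. First, you should read the file into a buffer with something like fread. This is fast. For an example of how to do this, see http://www.cplusplus.com/reference/cstdio/fread/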

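Then you can operate on the buffer in parallel with OpenMP. I have implemented most of this for you. The code is below. You did not provide the segment_read function, so I created a dummy one. I used some C++ features, such as std::vector and std::sort, but you could do the same in pure C with a bit more work.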

Edit: I edited this code and was able to remove the sort and the critical section.

I compiled with g++ foo.cpp -o foo -fopenmp -O3
#include <stdio.h>
#include <stdlib.h>   // exit, malloc, free
#include <omp.h>
#include <vector>

// Dummy stand-in: the question did not provide segment_read.
int segment_read(char *buff, const int len, const int count) {
    return 1;
}

void foo(char* buffer, size_t size) {
    int count_of_reads = 0;
    std::vector<long> *posa = NULL;   // per-thread line-start offsets
    std::vector<long> pos;            // all line-start offsets, in file order

    #pragma omp parallel
    {
        const int ithread = omp_get_thread_num();
        #pragma omp single
        {
            posa = new std::vector<long>[omp_get_num_threads()];
            pos.push_back(0);         // the first line starts at offset 0
        }

        // Find where each line starts (one past every '\n').
        // schedule(static) gives each thread one contiguous block, in thread
        // order in practice, so no sort is needed when merging below.
        #pragma omp for schedule(static)
        for (long i = 0; i < (long)size; i++) {
            if (buffer[i] == '\n') {
                posa[ithread].push_back(i + 1);
            }
        }

        // Merge the per-thread lists into one list indexed by line number.
        #pragma omp single
        {
            for (int t = 0; t < omp_get_num_threads(); t++)
                pos.insert(pos.end(), posa[t].begin(), posa[t].end());
            if (pos.back() != (long)size)
                pos.push_back((long)size);   // last line has no trailing '\n'
        }

        // Process line i, which spans [pos[i-1], pos[i]).
        #pragma omp for
        for (long i = 1; i < (long)pos.size(); i++) {
            const int len = (int)(pos[i] - pos[i-1]);
            char* buff = &buffer[pos[i-1]];   // not null-terminated; use len
            const int sequence_counter = segment_read(buff, len, (int)i);
            if (sequence_counter == 1) {
                int reads;
                #pragma omp atomic capture
                reads = ++count_of_reads;
                printf("\n Total No. of reads: %d \n", reads);
            }
        }
    }
    delete[] posa;
}

int main () {
    FILE * pFile;
    long lSize;
    char * buffer;
    size_t result;

    pFile = fopen ( "myfile.txt" , "rb" );
    if (pFile==NULL) {fputs ("File error",stderr); exit (1);}

    // obtain file size:
    fseek (pFile , 0 , SEEK_END);
    lSize = ftell (pFile);
    rewind (pFile);

    // allocate memory to contain the whole file:
    buffer = (char*) malloc (sizeof(char)*lSize);
    if (buffer == NULL) {fputs ("Memory error",stderr); exit (2);}

    // copy the file into the buffer:
    result = fread (buffer,1,lSize,pFile);
    if (result != (size_t)lSize) {fputs ("Reading error",stderr); exit (3);}

    /* the whole file is now loaded in the memory buffer. */
    foo(buffer, result);

    // terminate
    fclose (pFile);
    free (buffer);
    return 0;
}

Answer 2

One way to implement a "parallel while" in OpenMP is to use a while loop that creates tasks. Here is a general sketch:

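void foo() {
    #pragma omp parallel   // create the team of threads that will run the tasks
    #pragma omp single     // one thread walks the loop and spawns the tasks
    {
        while( Foo* f = get_next_thing() ) {
            #pragma omp task firstprivate(f)
            bar(f);
        }
        #pragma omp taskwait
    }
}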

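For the specific case of looping over fgets, note that fgets has inherently sequential semantics (it gets the "next" line), so it has to be called before the task is launched. It is also important that each task works on its own copy of the data returned by fgets, so that the next call to fgets does not overwrite the buffer a previous task is still working on.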
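As a rough illustration of the point above, here is a minimal sketch of the task approach applied to the fgets loop from the question. The function name count_reads_with_tasks and the body of segment_read are hypothetical stand-ins (the question never showed segment_read), and the 4096-byte buffer size is an assumption. fgets is called by a single thread, and each task receives its own std::string copy of the line.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for the question's segment_read:
// returns 1 for lines that start with '>' and 0 otherwise.
static int segment_read(const char* buff, int len, int count) {
    (void)count;  // unused in this stand-in
    return (len > 0 && buff[0] == '>') ? 1 : 0;
}

// Reads the file sequentially on one thread and hands each line to a task.
// Returns how many lines segment_read accepted, or -1 if the file
// could not be opened.
int count_reads_with_tasks(const char* path) {
    FILE* f = fopen(path, "r");
    if (!f) return -1;

    const int BUFF_SIZE = 4096;  // assumed maximum line length
    char buff[BUFF_SIZE];
    int count = 1;
    int count_of_reads = 0;

    #pragma omp parallel
    #pragma omp single
    {
        // Only this one thread calls fgets, so lines are read in order.
        while (fgets(buff, BUFF_SIZE, f) != NULL) {
            std::string line(buff);  // private copy; buff is reused next iteration
            #pragma omp task firstprivate(line, count) shared(count_of_reads)
            {
                if (segment_read(line.c_str(), (int)line.size(), count) == 1) {
                    #pragma omp atomic
                    count_of_reads++;
                }
            }
            count++;
        }
        #pragma omp taskwait  // wait for all line tasks before leaving
    }

    fclose(f);
    return count_of_reads;
}
```

Whether this pays off depends on how expensive segment_read is: if the per-line work is tiny, the cost of creating a task per line can outweigh the gain, and batching several lines per task may be worth trying.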

Answer 3

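First, even though it comes very close, OpenMP cannot make your code parallel all by itself. It works with for because a for loop has lower and upper bounds that it can understand. OpenMP uses these bounds to divide the work among the threads.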

A while loop offers no such possibility.

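Second, how do you expect your work to be parallelized? You are reading from a file, where sequential access will probably give you better performance than parallel access. You might be able to parallelize segment_read itself (depending on its implementation).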

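Alternatively, you may want to overlap file reading with processing. For that, you need to use lower-level functions, such as Unix's open and read. Then perform asynchronous reads, meaning that you issue a read request, process the last block that was read, and then wait for the read request to complete. Search for "linux asynchronous io", for example, to learn more.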
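The issue/process/wait pattern described above can be sketched without kernel-level asynchronous I/O. The following is only an approximation under stated assumptions: it swaps in an OpenMP task that reads the next chunk with plain fread while the current chunk is processed, rather than using open/read with a real asynchronous I/O interface. The names read_chunk, process_chunk, and count_lines_overlapped are hypothetical, and counting newlines stands in for the real per-block work.

```cpp
#include <cstdio>
#include <utility>
#include <vector>

// Hypothetical per-chunk work: counts the newline characters in a chunk.
static long process_chunk(const std::vector<char>& chunk) {
    long lines = 0;
    for (char c : chunk)
        if (c == '\n') ++lines;
    return lines;
}

// Reads up to chunk_size bytes into buf and shrinks buf to the bytes read.
static void read_chunk(FILE* f, std::vector<char>& buf, size_t chunk_size) {
    buf.resize(chunk_size);
    buf.resize(fread(buf.data(), 1, chunk_size, f));
}

// Double buffering: while `current` is processed, a task fills `next`.
// Returns the number of newlines in the file, or -1 if it cannot be opened.
long count_lines_overlapped(const char* path, size_t chunk_size) {
    FILE* f = fopen(path, "rb");
    if (!f) return -1;

    std::vector<char> current, next;
    long total = 0;
    read_chunk(f, current, chunk_size);  // prime the first buffer

    #pragma omp parallel
    #pragma omp single
    {
        while (!current.empty()) {
            // "Issue the read request": another thread may pick this up.
            #pragma omp task shared(next, f)
            read_chunk(f, next, chunk_size);

            // "Process the last block read" while the read is in flight.
            total += process_chunk(current);

            // "Wait for the read request to complete", then swap buffers.
            #pragma omp taskwait
            std::swap(current, next);
        }
    }

    fclose(f);
    return total;
}
```

Without -fopenmp (or with a single thread) the read simply runs before or after the processing step, so the result is the same but nothing overlaps.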

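Using a pipeline may actually not help you much. That will depend on a lot of pipeline internals that I am not familiar with. However, if you have enough memory, you may also want to consider loading all of the data first and processing it afterwards. That way, the data is loaded as fast as possible (sequentially), and then you can process it in parallel.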
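A minimal sketch of the load-first, process-afterwards idea, assuming the whole file fits in memory and that no line is longer than the 4096-byte buffer (longer lines would be split by fgets). The names count_reads_load_then_process and dummy_segment_read are hypothetical; the latter stands in for the question's segment_read. Once the lines are in a vector, the loop has the known lower and upper bounds that `omp parallel for` needs.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's segment_read:
// returns 1 for lines that start with '>' and 0 otherwise.
static int dummy_segment_read(const char* buff, int len, int count) {
    (void)count;  // unused in this stand-in
    return (len > 0 && buff[0] == '>') ? 1 : 0;
}

// Phase 1 reads every line sequentially; phase 2 processes them in parallel.
// Returns how many lines were accepted, or -1 if the file cannot be opened.
int count_reads_load_then_process(const char* path) {
    FILE* f = fopen(path, "r");
    if (!f) return -1;

    // Phase 1: sequential load, which is what the disk handles best.
    std::vector<std::string> lines;
    char buff[4096];  // assumed maximum line length
    while (fgets(buff, sizeof buff, f) != NULL)
        lines.push_back(buff);
    fclose(f);

    // Phase 2: a for loop with fixed bounds, which OpenMP can split up.
    const int n = (int)lines.size();
    int count_of_reads = 0;
    #pragma omp parallel for reduction(+: count_of_reads)
    for (int i = 0; i < n; i++) {
        if (dummy_segment_read(lines[i].c_str(), (int)lines[i].size(), i + 1) == 1)
            count_of_reads++;
    }
    return count_of_reads;
}
```

The reduction clause gives each thread a private counter and sums them at the end, so no atomic or critical section is needed inside the loop.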

openmp - while loop for text file reading and using a pipeline
