Question

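So, after my earlier attempts at using OpenMP, I realized that I do not have a single example of code that actually runs faster on my system when parallelized. Below is a short example of an (unsuccessful) attempt. It first shows that there really are two cores and that OpenMP is using them, and then times two brain-dead tasks, one using OpenMP and one without. It is quite possible that something is wrong with the task I am testing, so I would appreciate it if someone could suggest another sanity test, so that I can see with my own eyes that multithreading CAN work :)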

#include <iostream>
#include <vector>
#include <ctime>
#include <cmath>

using namespace std;

#include <omp.h>

int main(int argc, char *argv[])
{


    //Below code will be run once for each processor (there are two)
    #pragma omp parallel 
    {
        cout << omp_get_thread_num() << endl; //this should output 1 and 0, in random order
    }


    //The parallel example:
    vector <double> a(50000,0);

    clock_t start = clock();
#pragma omp parallel for  shared(a) 
    for (int i=0; i < 50000; i++)    
    {
        double StartVal=i;

        for (int j=0; j<2000; ++j)
            a[i]=(StartVal + log(exp(exp((double) i)))); 
    } 

    cout<< "Time: " << ( (double) ( clock() - start ) / (double)CLOCKS_PER_SEC ) <<endl;

    //The serial example:
    start = clock();

    for (int i=0; i < 50000; i++)    
    {
        double StartVal=i;

        for (int j=0; j<2000; ++j)
            a[i]=(StartVal + log(exp(exp((double) i)))); 
    } 

    cout<< "Time: " << ( (double) ( clock() - start ) / (double)CLOCKS_PER_SEC ) <<endl;

    return 0;
}

The output is:

    1
    0
    Time: 4.07
    Time: 3.84

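Could this be related to for-loop optimizations that are missing under OpenMP? Or is there something wrong with how I measure the time? If so, do you have any ideas for a different test?

Thanks in advance :)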

Edit: It turns out that I was indeed measuring time in a bad way. Using omp_get_wtime(), the output becomes:

    1
    0
    Time: 4.40776
    Time: 7.77676

I guess I had better go back and take another look at my old questions...

Answer 1

I can think of two possibilities:

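You are running on Linux. On Linux, clock() does not measure wall-clock time. It measures CPU time, which is summed over all threads, so a parallel loop that finishes sooner can still report about the same value as the serial one. I suggest you use omp_get_wtime() instead.
Your test is not big enough. Try increasing 2000 to 200000.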
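
The gap between CPU time and wall-clock time described in the first point can be made visible with a small sketch. The `Timing` struct and the `time_busy_loop` function are hypothetical names made up for this illustration, the workload is arbitrary, and `std::chrono::steady_clock` stands in for `omp_get_wtime()` so that the snippet still compiles when OpenMP is switched off (in that case the pragma is ignored and the loop runs serially).

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <ctime>
#include <vector>

// Hypothetical result type for this illustration.
struct Timing {
    double cpu_seconds;   // from std::clock(): CPU time on Linux
    double wall_seconds;  // from steady_clock: elapsed real time
    double checksum;      // keeps the compiler from discarding the work
};

// Runs an arbitrary compute-heavy loop and measures it with both clocks.
Timing time_busy_loop(int n, int inner)
{
    assert(n > 0 && inner > 0);
    std::vector<double> a(n, 0.0);

    const std::clock_t cpu_start = std::clock();
    const auto wall_start = std::chrono::steady_clock::now();

    // Only takes effect when compiled with an OpenMP flag such as -fopenmp.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        double x = 0.0;
        for (int j = 0; j < inner; ++j)
            x += std::sin(i + j * 1e-3);
        a[i] = x;
    }

    const auto wall_end = std::chrono::steady_clock::now();
    const std::clock_t cpu_end = std::clock();

    Timing t;
    t.cpu_seconds = double(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    t.wall_seconds = std::chrono::duration<double>(wall_end - wall_start).count();
    t.checksum = 0.0;
    for (int i = 0; i < n; ++i)
        t.checksum += a[i];
    return t;
}
```

Called from `main` in a program built with something like `g++ -O2 -fopenmp` on Linux, `cpu_seconds` would be expected to stay roughly constant or grow as threads are added, while `wall_seconds` shrinks. Exact figures depend on the machine.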

Here is what I get on Windows with 200000 iterations in the inner loop:

4
5
2
3
1
6
7
0
Time: 1.834
Time: 6.792

My answer to this question has a very simple OpenMP example that achieves a speedup.
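
The linked example is not reproduced here, so the following is only a rough sketch of what a minimal speedup test could look like, not the code from that answer. The names `wall_seconds`, `sum_of_roots`, `Comparison` and `compare_runs` are hypothetical, and the workload (a sum of square roots) is an arbitrary choice. The `_OPENMP` guard lets the file compile even without OpenMP, in which case both runs are serial.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#ifdef _OPENMP
#include <omp.h>  // only needed when OpenMP is enabled
#endif

// Wall-clock seconds measured from an arbitrary starting point.
double wall_seconds()
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    using namespace std::chrono;
    return duration<double>(steady_clock::now().time_since_epoch()).count();
#endif
}

// Sums sqrt(i) for i = 1..n. With OpenMP enabled, the if() clause makes the
// loop use several threads only when use_threads is true.
double sum_of_roots(long n, bool use_threads)
{
    assert(n > 0);
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum) if(use_threads)
    for (long i = 1; i <= n; ++i)
        sum += std::sqrt(static_cast<double>(i));
    (void)use_threads;  // unused when OpenMP is disabled
    return sum;
}

// Hypothetical result type: timings and results of both runs.
struct Comparison {
    double serial_seconds;
    double parallel_seconds;
    double serial_sum;
    double parallel_sum;
};

// Times the same computation once without and once with threads.
Comparison compare_runs(long n)
{
    Comparison c;
    const double t0 = wall_seconds();
    c.serial_sum = sum_of_roots(n, false);
    const double t1 = wall_seconds();
    c.parallel_sum = sum_of_roots(n, true);
    const double t2 = wall_seconds();
    c.serial_seconds = t1 - t0;
    c.parallel_seconds = t2 - t1;
    return c;
}
```

Printing `serial_seconds / parallel_seconds` for a large `n` (for instance a few hundred million) in a build with `-O2 -fopenmp` should give a ratio above 1 on an otherwise idle multi-core machine. The two sums can differ in the last few digits because the reduction adds the terms in a different order.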

Answer 2

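As Mystical said, the problem may be the way you measure. On Linux you can use clock_gettime. Using a Linux virtual machine with 4 cores (I have 6 physical cores), my code ran 3x faster, even with the inner loop as small as j < 20.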

#include <time.h>   // declares clock_gettime and timespec

int main( ... ) ... same as your code ...    
   timespec ts1;
   timespec ts2;

   //start measurement:
   clock_gettime(CLOCK_REALTIME, &ts1);

   ... code to time here ...

   //stop measurement:

   clock_gettime(CLOCK_REALTIME, &ts2);

   cout<< "clock Time s: " << (ts2.tv_sec-ts1.tv_sec) + 1e-9*( ts2.tv_nsec-ts1.tv_nsec ) <<endl;

  ... }
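
Since the snippet above leaves out the surrounding code, here is a small self-contained sketch of just the timing part. The helper names `now_monotonic` and `elapsed_seconds` are hypothetical. Note that it uses `CLOCK_MONOTONIC` rather than the `CLOCK_REALTIME` shown above, on the assumption that a clock which is not affected by system time adjustments is preferable for measuring intervals. It relies on POSIX `clock_gettime`, so it is meant for Linux and similar systems.

```cpp
#include <cassert>
#include <time.h>  // POSIX: clock_gettime, timespec, CLOCK_MONOTONIC

// Reads the monotonic clock (hypothetical helper for this illustration).
timespec now_monotonic()
{
    timespec ts;
    const int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);  // clock_gettime returns 0 on success
    (void)rc;         // avoids an unused-variable warning when NDEBUG is set
    return ts;
}

// Elapsed seconds between two readings, using the same arithmetic as above.
double elapsed_seconds(const timespec& start, const timespec& stop)
{
    return double(stop.tv_sec - start.tv_sec)
         + 1e-9 * double(stop.tv_nsec - start.tv_nsec);
}
```

To use it, take one reading with `now_monotonic()` before the code to be timed and one after it, then pass both to `elapsed_seconds`.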

I need help creating a minimal example that shows an OpenMP speedup on my system
