Better run time on the CPU, and easy to modify for GPU support

Time: 2018-06-23 08:49:30

Tags: c++ gpu

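I am still fairly new to C++ and am currently working on some mathematical applications, purely to improve my C++ skills.

There are two paradigms I could adopt, and I don't know which one is more efficient on a CPU and which one is easier to extend to support a GPU.

Please see the code below: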

/*
Background information:

MyDataArray<double> is a class that has a vector of data type double as a member. 
The vector length will be at least **20,000** (usually more).
I want to support operations (+,/,*,-, etc)
All operations listed above would be done element wise (over entire length)
e.g 

Question: Which of the two paradigms do you think would be
(1) more efficient on a CPU.
(2) easier to extend to have GPU support (e.g. CUDA C++)

*/
MyDataArray<double> T; 
MyDataArray<double> V;
MyDataArray<double> Q;
MyDataArray<double> D;

Paradigm 1: Execute nothing immediately; build an implicit graph instead.

/* 
OpsExpression<double> is just a class that keeps information about the
operations involving objects of MyDataArray<double>.
It stores them as an implicit graph of the operations.
The implicit graph is then executed when the '.evaluate()' method is called.
 */

OpsExpression<double> exp_obj = T*V*(Q + D)*(V + T); // Nothing executes here; this only builds an implicit graph of the operation.

// ... when the '.evaluate()' method is called, exp_obj executes something
// similar to the following (N is the common length of the arrays):

double value;
for (int i = 0; i < N; i++)
{
    value = V.getValue(i) + T.getValue(i);
    value = (Q.getValue(i) + D.getValue(i)) * value;
    value = V.getValue(i) * value;
    value = T.getValue(i) * value;

    exp_obj.setValue(i,value);
}

// iterative scheme
for (int iteration = 0; iteration < 3000; iteration++)
{
    // ... do other similar operations 

    exp_obj.evaluate();

    // ... do other similar operations 
}
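
One common way to realize an implicit graph like this in C++ is the expression-template technique, where the graph is encoded in types at compile time. This may differ from how OpsExpression<double> is actually built (it could just as well store the graph at run time). The sketch below is only an illustration under assumptions: `ExprTag`, `Stored`, `BinExpr`, `Add`, `Mul`, `EnableIfExpr`, and `assignFrom` are hypothetical names, and `MyDataArray` here is a minimal stand-in for the class described in the question. It needs C++14 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

struct ExprTag {};  // marker base: restricts the operators below to our types

// Hypothetical stand-in for the question's MyDataArray<double>.
template <typename Scalar>
class MyDataArray : public ExprTag {
public:
    explicit MyDataArray(std::size_t n, Scalar init = Scalar{}) : data_(n, init) {}
    std::size_t size() const { return data_.size(); }
    Scalar operator[](std::size_t i) const { return data_[i]; }
    Scalar& operator[](std::size_t i) { return data_[i]; }

    // Evaluate a lazy expression into this array with one fused loop.
    template <typename Expr>
    void assignFrom(const Expr& e) {
        assert(e.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data_[i] = e[i];
    }

private:
    std::vector<Scalar> data_;
};

// Graph nodes are stored by value, array leaves by reference (no copies).
template <typename E> struct Stored { using type = E; };
template <typename S> struct Stored<MyDataArray<S>> { using type = const MyDataArray<S>&; };

struct Add { template <typename A> static A apply(A a, A b) { return a + b; } };
struct Mul { template <typename A> static A apply(A a, A b) { return a * b; } };

// One node of the implicit graph; nothing is computed until operator[] runs.
template <typename Op, typename L, typename R>
struct BinExpr : ExprTag {
    typename Stored<L>::type lhs;
    typename Stored<R>::type rhs;
    BinExpr(const L& l, const R& r) : lhs(l), rhs(r) {}
    std::size_t size() const { return lhs.size(); }
    auto operator[](std::size_t i) const { return Op::apply(lhs[i], rhs[i]); }
};

template <typename L, typename R>
using EnableIfExpr = std::enable_if_t<std::is_base_of<ExprTag, L>::value &&
                                      std::is_base_of<ExprTag, R>::value>;

// Building the graph: these operators only construct nodes.
template <typename L, typename R, typename = EnableIfExpr<L, R>>
BinExpr<Add, L, R> operator+(const L& l, const R& r) { return {l, r}; }

template <typename L, typename R, typename = EnableIfExpr<L, R>>
BinExpr<Mul, L, R> operator*(const L& l, const R& r) { return {l, r}; }
```

With this sketch, `auto expr = T * V * (Q + D) * (V + T);` builds the graph once, and `out.assignFrom(expr);` can be called on every iteration. Because the leaves are held by reference, each call picks up the current contents of T, V, Q, and D. The per-element body is also the kind of code a GPU kernel would run with one index per thread, though that mapping is not implemented here.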

Paradigm 2: Execute every operation immediately.

/*
Each operation is executed immediately and the result is stored in a temporary MyDataArray<double> object.
*/

// iterative scheme 
for (int iteration = 0; iteration < 3000; iteration++ )
{
    //... do some other similar operations here 

    MyDataArray<double> tmp_v1 = V + T;
    MyDataArray<double> tmp_v2 = Q + D; 
    MyDataArray<double> tmp_v3 = tmp_v1 * tmp_v2; 
    MyDataArray<double> tmp_v4 = V * tmp_v3;
    MyDataArray<double> final_answer = T * tmp_v4;

    //...do some other similar operations here. 
}
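
To make the cost model of Paradigm 2 concrete, here is a minimal sketch of eager element-wise operators, plus a variant that reuses caller-owned buffers so the loop body does not allocate on every iteration. Everything here is an assumption for illustration: `EagerArray` is a hypothetical stand-in for MyDataArray<double>, and `addInto`, `mulInto`, and `paradigm2Step` are made-up helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's MyDataArray<double>.
struct EagerArray {
    std::vector<double> data;
    explicit EagerArray(std::size_t n, double init = 0.0) : data(n, init) {}
    std::size_t size() const { return data.size(); }
};

// Eager operators: each call runs a full loop and allocates a fresh result.
inline EagerArray operator+(const EagerArray& a, const EagerArray& b) {
    assert(a.size() == b.size());
    EagerArray r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) r.data[i] = a.data[i] + b.data[i];
    return r;
}

inline EagerArray operator*(const EagerArray& a, const EagerArray& b) {
    assert(a.size() == b.size());
    EagerArray r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) r.data[i] = a.data[i] * b.data[i];
    return r;
}

// Allocation-free variants: write into a buffer the caller already owns.
// 'out' may alias 'a' or 'b' because element i depends only on element i.
inline void addInto(EagerArray& out, const EagerArray& a, const EagerArray& b) {
    assert(out.size() == a.size() && a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); ++i) out.data[i] = a.data[i] + b.data[i];
}

inline void mulInto(EagerArray& out, const EagerArray& a, const EagerArray& b) {
    assert(out.size() == a.size() && a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); ++i) out.data[i] = a.data[i] * b.data[i];
}

// One iteration of the Paradigm 2 sequence using two scratch buffers
// and one result buffer instead of five temporaries.
inline void paradigm2Step(const EagerArray& T, const EagerArray& V,
                          const EagerArray& Q, const EagerArray& D,
                          EagerArray& s1, EagerArray& s2, EagerArray& result) {
    addInto(s1, V, T);       // tmp_v1 = V + T
    addInto(s2, Q, D);       // tmp_v2 = Q + D
    mulInto(s1, s1, s2);     // tmp_v3 = tmp_v1 * tmp_v2
    mulInto(s1, V, s1);      // tmp_v4 = V * tmp_v3
    mulInto(result, T, s1);  // final_answer = T * tmp_v4
}
```

Either way, each operation is a separate pass over the arrays, so this version performs five loops per iteration where a fused evaluation performs one. Whether that difference matters at a length of 20,000 is something to measure rather than assume.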

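In the numerical method, these operations will be part of an iterative scheme. This means they will be called again and again, with the variables T, Q, and V changing on every iteration.

Thank you very much for your help, and for any other suggestions on how to approach the problem.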

0 answers:

No answers yet