I am trying to use the culaSgels function to solve Ax = B.
I modified the systemSolve example from the CULA package.
void culaFloatExample()
{
    int N = 2;
    int NRHS = 2;
    int i, j;
    double cula_time, start_time, end_time;
    culaStatus status;
    culaFloat* A = NULL;
    culaFloat* B = NULL;
    culaFloat* X = NULL;
    culaFloat one = 1.0f;
    culaFloat thresh = 1e-6f;
    culaFloat diff;

    printf("Allocating Matrices\n");
    A = (culaFloat*)malloc(N*N*sizeof(culaFloat));
    B = (culaFloat*)malloc(N*N*sizeof(culaFloat));
    X = (culaFloat*)malloc(N*N*sizeof(culaFloat));
    if(!A || !B )
        exit(EXIT_FAILURE);

    printf("Initializing CULA\n");
    status = culaInitialize();
    checkStatus(status);

    // Set A
    A[0]=1;
    A[1]=2;
    A[2]=3;
    A[3]=4;

    // Set B
    B[0]=5;
    B[1]=6;
    B[2]=2;
    B[3]=3;

    printf("Calling culaSgels\n");
    // Run CULA's version
    start_time = getHighResolutionTime();
    status = culaSgels('N',N,N, NRHS, A, N, A, N);
    end_time = getHighResolutionTime();
    cula_time = end_time - start_time;
    checkStatus(status);

    printf("Verifying Result\n");
    for(i = 0; i < N; ++i){
        for (j=0;j<N;j++)
        {
            diff = X[i+j*N] - B[i+j*N];
            if(diff < 0.0f)
                diff = -diff;
            if(diff > thresh)
                printf("\nResult check failed: X[%d]=%f B[%d]=%f\n", i, X[i+j*N],i, B[i+j*N]);
            printf("\nResults:X= %f \t B= %f:\n",X[i+j*N],B[i+j*N]);
        }
    }

    printRuntime(cula_time);
    printf("Shutting down CULA\n\n");
    culaShutdown();
    free(A);
    free(B);
}
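I am calling culaSgels('N',N,N, NRHS, A, N, A, N);
to solve the system, but:
1) The output shows every element of X as 0, although B is correct. It also prints the
"Result check failed" message.
2) According to the reference manual, the second-to-last array argument (where I pass A) should be the matrix B, stored column-wise. However, if I pass "B" instead of "A" there, I do not get the correct B matrix.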
Answer 0 (score: 0)
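OK, the code needed three fixes.
1) Change A to B in the call, so it becomes culaSgels('N',N,N, NRHS, A, N, B, N);
(I had missed that B holds the solution on exit.)
2) Because CULA uses column-major storage, fill the A and B matrices in column-major order.
3) Change the allocations to:
B = (culaFloat*)malloc(N*NRHS*sizeof(culaFloat));
X = (culaFloat*)malloc(N*NRHS*sizeof(culaFloat));
(that is, use NRHS rather than N; the two happen to be equal in this example)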
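As a rough illustration of the three points above, here is a minimal sketch in plain C that does not call CULA at all. It assumes the intended matrices are A = [1 2; 3 4] and B = [5 6; 2 3] read row by row, which is a guess based on the assignments in the question. solve2x2_in_place, max_residual, solve_example and the IDX macro are hypothetical helpers made up for this sketch; solve2x2_in_place only mimics the LAPACK-style convention of overwriting B with the solution, using Cramer's rule instead of the least-squares algorithm behind culaSgels.

```c
#include <assert.h>
#include <string.h>

/* Column-major offset of element (row, col) with leading dimension ld. */
#define IDX(row, col, ld) ((row) + (col) * (ld))

static float absf(float v) { return v < 0.0f ? -v : v; }

/* Hypothetical stand-in for the solver call (NOT a CULA routine).
 * Solves the 2x2 system A*X = B by Cramer's rule and overwrites B with X,
 * mimicking the "solution is returned in B" convention.
 * Returns 0 on success, -1 if A is singular. */
static int solve2x2_in_place(const float *A, float *B, int nrhs)
{
    const int n = 2;
    const float a00 = A[IDX(0, 0, n)], a01 = A[IDX(0, 1, n)];
    const float a10 = A[IDX(1, 0, n)], a11 = A[IDX(1, 1, n)];
    const float det = a00 * a11 - a01 * a10;
    int k;

    if (det == 0.0f)
        return -1;
    for (k = 0; k < nrhs; ++k) {
        const float b0 = B[IDX(0, k, n)];
        const float b1 = B[IDX(1, k, n)];
        B[IDX(0, k, n)] = (b0 * a11 - a01 * b1) / det;
        B[IDX(1, k, n)] = (a00 * b1 - a10 * b0) / det;
    }
    return 0;
}

/* Largest absolute entry of A*X - B0, all matrices column-major. */
static float max_residual(const float *A, const float *X, const float *B0,
                          int n, int nrhs)
{
    float worst = 0.0f;
    int i, j, k;

    for (k = 0; k < nrhs; ++k) {
        for (i = 0; i < n; ++i) {
            float acc = 0.0f;
            float d;
            for (j = 0; j < n; ++j)
                acc += A[IDX(i, j, n)] * X[IDX(j, k, n)];
            d = absf(acc - B0[IDX(i, k, n)]);
            if (d > worst)
                worst = d;
        }
    }
    return worst;
}

/* Builds the example system, solves it, copies the solution into X
 * (N*NRHS floats, column-major) and returns the largest residual. */
static float solve_example(float *X)
{
    enum { N = 2, NRHS = 2 };
    float A[N * N];
    float B[N * NRHS];   /* sized N*NRHS, not N*N */
    float B0[N * NRHS];  /* untouched copy of the right-hand sides */
    int status;

    /* Assumed A = [1 2; 3 4]; in memory the columns come first: 1, 3, 2, 4. */
    A[IDX(0, 0, N)] = 1.0f;  A[IDX(0, 1, N)] = 2.0f;
    A[IDX(1, 0, N)] = 3.0f;  A[IDX(1, 1, N)] = 4.0f;

    /* Assumed B = [5 6; 2 3]; each column is one right-hand side. */
    B[IDX(0, 0, N)] = 5.0f;  B[IDX(0, 1, N)] = 6.0f;
    B[IDX(1, 0, N)] = 2.0f;  B[IDX(1, 1, N)] = 3.0f;

    /* B is overwritten by the solve, so keep the original for checking.
     * LAPACK's sgels also overwrites A; if the CULA routine does the same,
     * a copy of A would be needed as well. */
    memcpy(B0, B, sizeof B);

    status = solve2x2_in_place(A, B, NRHS);
    assert(status == 0);

    memcpy(X, B, sizeof B);  /* the solution now lives where B was */
    return max_residual(A, X, B0, N, NRHS);
}
```

Calling solve_example with a four-element float array fills it with the solution in column-major order. The check compares A*X against the saved copy of B rather than comparing X with B directly, because after the solve B no longer holds the right-hand sides.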
Thanks!