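I am trying to do some double-precision math in an OpenCL kernel. I have enabled cl_khr_fp64, and the simple +, -, *, / operations work on double, but when I try to use a built-in math function (for example, exp), the code fails to compile. If I switch to float, it works.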
I read on the Khronos site that if you enable cl_khr_fp64, the math functions are overloaded to support double (http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/mathFunctions.html).
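I am developing on a new Mac Pro with D700 GPUs, and I read that only some of the math functions are overloaded for double (http://developer.amd.com/knowledge-base/), but exp is on the list of supported functions. The code also fails when I send it to the CPU instead of the GPU.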
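For anyone who wants to rule out missing device support first, here is a minimal sketch of checking an extensions string for cl_khr_fp64. It assumes the string was already fetched on the host with clGetDeviceInfo and CL_DEVICE_EXTENSIONS, which returns a space-separated list. The helper name has_extension is hypothetical and is not part of OpenCL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an OpenCL API): returns true if `name` appears
 * as a whole space-separated token in `extensions`. The extensions string
 * is assumed to come from clGetDeviceInfo(..., CL_DEVICE_EXTENSIONS, ...). */
static bool has_extension(const char *extensions, const char *name)
{
    size_t len = strlen(name);
    if (len == 0) {
        return false;
    }

    const char *p = extensions;
    while ((p = strstr(p, name)) != NULL) {
        /* A match only counts if it is bounded by spaces or string ends,
         * so "cl_khr_fp64_extra" does not count as "cl_khr_fp64". */
        bool starts = (p == extensions) || (p[-1] == ' ');
        bool ends = (p[len] == '\0') || (p[len] == ' ');
        if (starts && ends) {
            return true;
        }
        p += len;
    }
    return false;
}
```

Calling has_extension(ext_string, "cl_khr_fp64") separately for the CPU and the GPU device would show whether either one fails to report double support. In my case plain double arithmetic already works, so this is a sanity check rather than an explanation of the error.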
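Here is the code for my kernel. It is just an extension of the "Hello World!" example from the Apple developer site. If I switch u to float, or if I replace exp(u) with u (as in the first branch of the conditional), it works. This is only a toy problem that I am trying to get working before I implement my real code, but I need it to work before I can move on. I have also tried double_exp, expd, and native_exp. With each of them the code compiles, but then I get a runtime error.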
    #pragma OPENCL EXTENSION cl_khr_fp64 : enable
    kernel void square5(global double* input, global double* output, double mul)
    {
        size_t i = get_global_id(0);
        double u = 5.0;
        if(i==0) {
            output[i] = mul*u*input[i]*input[i];
        } else if (i==1023) {
            output[i] = mul*u*input[i]*input[i];
        } else {
            output[i] = 0.25*mul*exp(u)*(input[i-1] + input[i+1])*(input[i-1] + input[i+1]);
        }
    }
Here is the error log:
/Users/me/Downloads/OpenCL_Hello_World_Example/mykernel.cl:12:30: error: call to '__fast_relax_exp' is ambiguous
output[i] = 0.25*mul*exp(u)*(input[i-1] + input[i+1])*(input[i-1] + input[i+1]);
^~~~~~
/System/Library/Frameworks/OpenCL.framework/Versions/A/Libraries/../lib/clang/3.2/include/cl_kernel.h:4496:22: note: expanded from macro 'exp'
#define exp(__x) __fast_relax_exp(__x)
^~~~~~~~~~~~~~~~
/System/Library/Frameworks/OpenCL.framework/Versions/A/Libraries/../lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
__CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
^
/System/Library/Frameworks/OpenCL.framework/Versions/A/Libraries/../lib/clang/3.2/include/cl_kernel.h:416:27: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float __OVERLOAD__ _name(float x) { return _default_name(x); } \
^
/System/Library/Frameworks/OpenCL.framework/Versions/A/Libraries/../lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
__CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
^
/System/Library/Frameworks/OpenCL.framework/Versions/A/Libraries/../lib/clang/3.2/include/cl_kernel.h:417:28: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float2 __OVERLOAD__ _name(float2 x) { return _default_name(x); } \
^
/System/Library/Frameworks/OpenCL.framework/Versions/A/Libraries/../lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
__CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
^
/System/Library/Frameworks/OpenCL.framework/Versions/A/Libraries/../lib/clang/3.2/include/cl_kernel.h:418:28: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float3 __OVERLOAD__ _name(float3 x) { return _default_name(x); } \
^
/System/Library/Frameworks/OpenCL.framework/Versions/A/Libraries/../lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
__CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
^
/System/Library/Frameworks/OpenCL.framework/Versions/A/Libraries/../lib/clang/3.2/include/cl_kernel.h:419:28: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float4 __OVERLOAD__ _name(float4 x) { return _default_name(x); } \
^
/System/Library/Frameworks/OpenCL.framework/Versions/A/Libraries/../lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
__CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
^
/System/Library/Frameworks/OpenCL.framework/Versions/A/Libraries/../lib/clang/3.2/include/cl_kernel.h:420:28: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float8 __OVERLOAD__ _name(float8 x) { return _default_name(x); } \
^
/System/Library/Frameworks/OpenCL.framework/Versions/A/Libraries/../lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
__CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
^
/System/Library/Frameworks/OpenCL.framework/Versions/A/Libraries/../lib/clang/3.2/include/cl_kernel.h:421:29: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float16 __OVERLOAD__ _name(float16 x){ return _default_name(x); }
^
1 error generated.
Command /System/Library/Frameworks/OpenCL.framework/Libraries/openclc failed with exit code 1