Strange timing problem with "clEnqueueNDRangeKernel" in OpenCL

Asked: 2012-07-29 07:02:27

Tags: profiling opencl

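I am new to OpenCL and have run into a strange problem. I have a reduction kernel that I launch several times in a row. When I profile the kernel with OpenCL events, the elapsed time (queued -> end) is almost the same on every launch and grows only slightly. But when I measure the elapsed time in C++ around the line that calls "clEnqueueNDRangeKernel", the time increases quickly. I have attached the code and the profiling output.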

    // execute the kernel
    globalWorkSize[0] = this->reduction_NumBlocks * this->reduction_NumThreads;
    localWorkSize[0] = this->reduction_NumThreads;

    // Start time
    ttt.start();

    clErrNum = clEnqueueNDRangeKernel(clCommandQueue, kernelReduction, 1, 0,
                                      globalWorkSize, localWorkSize, 0, NULL, &timing_event);
    // check if kernel execution generated an error
    oclCheckError(clErrNum, CL_SUCCESS);

    clFinish(clCommandQueue);
    ttt.stop();

    // Check elapsed time
    clGetEventProfilingInfo(timing_event, CL_PROFILING_COMMAND_QUEUED,
                            sizeof(time_start), &time_start, NULL);
    clGetEventProfilingInfo(timing_event, CL_PROFILING_COMMAND_END,
                            sizeof(time_end), &time_end, NULL);
    cout << "ElapseTime(Execute):" << (time_end - time_start) / 1000 << "us\tTTT:" << ttt.getElapsedTimeInMicroSec() << endl;

Here is the output:

    GeForce GTX 550 Ti
    Device Timer Resolution:1000ns
    GpuExecutionTime:160us   C++ElapsedTime:177
    GpuExecutionTime:156us   C++ElapsedTime:167
    GpuExecutionTime:156us   C++ElapsedTime:166
    GpuExecutionTime:189us   C++ElapsedTime:242
    GpuExecutionTime:158us   C++ElapsedTime:215
    ...
    GpuExecutionTime:156us   C++ElapsedTime:253
    GpuExecutionTime:162us   C++ElapsedTime:261
    GpuExecutionTime:157us   C++ElapsedTime:262
    GpuExecutionTime:156us   C++ElapsedTime:254
    GpuExecutionTime:157us   C++ElapsedTime:254
    GpuExecutionTime:160us   C++ElapsedTime:261
    GpuExecutionTime:167us   C++ElapsedTime:279
    GpuExecutionTime:157us   C++ElapsedTime:264
    ...
    GpuExecutionTime:159us   C++ElapsedTime:263
    GpuExecutionTime:157us   C++ElapsedTime:261
    GpuExecutionTime:157us   C++ElapsedTime:260
    GpuExecutionTime:157us   C++ElapsedTime:263
    GpuExecutionTime:264us   C++ElapsedTime:384
    ...
    GpuExecutionTime:156us   C++ElapsedTime:304
    GpuExecutionTime:161us   C++ElapsedTime:314
    GpuExecutionTime:157us   C++ElapsedTime:308
    GpuExecutionTime:160us   C++ElapsedTime:305
    GpuExecutionTime:158us   C++ElapsedTime:311
    GpuExecutionTime:156us   C++ElapsedTime:308
    GpuExecutionTime:157us   C++ElapsedTime:312
    ...
    GpuExecutionTime:157us   C++ElapsedTime:326
    GpuExecutionTime:158us   C++ElapsedTime:326
    GpuExecutionTime:159us   C++ElapsedTime:330
    GpuExecutionTime:158us   C++ElapsedTime:328
    GpuExecutionTime:158us   C++ElapsedTime:335

Any kind of help is appreciated.

P.S. The sizes of the input and the other related variables are fixed!
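To make the comparison between the two measurements concrete, here is a minimal sketch of how the numbers could be broken down further. It assumes the four event timestamps (CL_PROFILING_COMMAND_QUEUED, CL_PROFILING_COMMAND_SUBMIT, CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END, all in nanoseconds) have already been read with clGetEventProfilingInfo. The names `ProfilingBreakdown`, `breakDownNs` and `hostOverheadUs` are hypothetical helpers for illustration and are not part of the code above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: the intervals between the four OpenCL event
// profiling timestamps, converted from nanoseconds to microseconds.
struct ProfilingBreakdown {
    std::uint64_t queuedToSubmitUs; // command waiting in the host-side queue
    std::uint64_t submitToStartUs;  // submitted to the device, not yet running
    std::uint64_t startToEndUs;     // kernel actually executing
    std::uint64_t queuedToEndUs;    // the interval the code above prints
};

// Splits the queued -> end interval into its three stages.
// All arguments are nanosecond timestamps as returned for one event.
inline ProfilingBreakdown breakDownNs(std::uint64_t queuedNs,
                                      std::uint64_t submitNs,
                                      std::uint64_t startNs,
                                      std::uint64_t endNs) {
    // Assumption: the timestamps of a completed command are non-decreasing.
    assert(queuedNs <= submitNs && submitNs <= startNs && startNs <= endNs);
    return ProfilingBreakdown{
        (submitNs - queuedNs) / 1000,
        (startNs - submitNs) / 1000,
        (endNs - startNs) / 1000,
        (endNs - queuedNs) / 1000};
}

// Time seen by the host timer that is not covered by the event profile,
// i.e. spent inside clEnqueueNDRangeKernel / clFinish outside queued -> end.
inline std::int64_t hostOverheadUs(std::int64_t wallClockUs,
                                   std::int64_t queuedToEndUs) {
    return wallClockUs - queuedToEndUs;
}
```

Applied to the output above, `hostOverheadUs(177, 160)` gives 17 us for the first line and `hostOverheadUs(335, 158)` gives 177 us for the last one, so the growth is entirely in the part that the queued -> end profile does not cover.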

0 answers:

No answers