Simple enough:
start=cuda.Event()
func(args,block=blockdims)
cuda.memcpy_dtoh(d,h)
end=cuda.Event()
dur=start.time_till(end)
print dur
But I get this error:
File "gpu.py", line 161, in gpu_test
dur=start.time_till(end)
pycuda._driver.LogicError: cuEventElapsedTime failed: invalid handle
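As far as I can tell from the docs, this is the correct usage. Does anyone know what I'm doing wrong?
Answer 0 (score: 1)
Neither event is ever recorded, so cuEventElapsedTime has no timestamps to compare and reports an invalid handle. Call record() on both events and synchronize on the end event before calling time_till():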
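# Assumes: import pycuda.driver as cuda
start = cuda.Event()
end = cuda.Event()
start.record()  # start timing
func(args, block=blockdims)
cuda.memcpy_dtoh(d, h)
end.record()  # end timing
# Wait for the end event to complete, then calculate the run length
end.synchronize()
millis = start.time_till(end)  # elapsed time in milliseconds
print(millis)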
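
If you time things often, the record / work / record / synchronize / time_till sequence above can be wrapped in a small helper. The following is only a rough sketch, not anything PyCUDA ships: `time_with_events` and `FakeEvent` are hypothetical names, and `FakeEvent` is a CPU-only stand-in that imitates the three event methods used above so the snippet runs without a GPU.

```python
import time


class FakeEvent:
    """Hypothetical CPU-only stand-in for a GPU event (not part of PyCUDA)."""

    def __init__(self):
        self._stamp = None  # None means the event was never recorded

    def record(self):
        self._stamp = time.perf_counter()

    def synchronize(self):
        if self._stamp is None:
            raise RuntimeError("event was never recorded")

    def time_till(self, other):
        # Mimics the failure in the question: unrecorded events cannot be compared.
        if self._stamp is None or other._stamp is None:
            raise RuntimeError("invalid handle: event was never recorded")
        return (other._stamp - self._stamp) * 1000.0  # milliseconds


def time_with_events(work, event_factory=FakeEvent):
    """Run work() between two recorded events and return elapsed milliseconds."""
    start = event_factory()
    end = event_factory()
    start.record()     # start timing
    work()
    end.record()       # end timing
    end.synchronize()  # wait until the end event has completed
    return start.time_till(end)


if __name__ == "__main__":
    print(time_with_events(lambda: time.sleep(0.01)))
```

With PyCUDA, and assuming `import pycuda.driver as cuda` plus an active context, passing `event_factory=cuda.Event` and a `work` callable that launches the kernel and does the copy should give the same flow as the answer.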