I use TensorRT in my Python code, so I use PyCUDA. In the following inference code, "an illegal memory access was encountered" happens at stream.synchronize():
def infer(engine, x, batch_size, context):
    inputs = []
    outputs = []
    bindings = []
    stream = cuda.Stream()
    for binding in engine:
        size = trt.volume(engine.get_binding_shape(binding)) * batch_size
        dtype = trt.nptype(engine.get_binding_dtype(binding))
        # Allocate host and device buffers
        host_mem = cuda.pagelocked_empty(size, dtype)
        device_mem = cuda.mem_alloc(host_mem.nbytes)
        # Append the device buffer to device bindings.
        bindings.append(int(device_mem))
        # Append to the appropriate list.
        if engine.binding_is_input(binding):
            inputs.append(HostDeviceMem(host_mem, device_mem))
        else:
            outputs.append(HostDeviceMem(host_mem, device_mem))
    # Normalize and invert the pixel values, then fill the pinned input buffer.
    img = np.array(x).ravel()
    np.copyto(inputs[0].host, 1.0 - img / 255.0)
    # Transfer input data to the GPU.
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference.
    context.execute_async(batch_size=batch_size, bindings=bindings, stream_handle=stream.handle)
    # Transfer predictions back from the GPU.
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    # Synchronize the stream
    stream.synchronize()
    # Return only the host outputs.
    return [out.host for out in outputs]
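For reference, HostDeviceMem is the small helper class from NVIDIA's TensorRT Python samples that pairs a pinned host buffer with its device allocation; a minimal sketch:

class HostDeviceMem:
    # Pairs a pagelocked host buffer with its matching device allocation.
    def __init__(self, host_mem, device_mem):
        self.host = host_mem
        self.device = device_mem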
What is going wrong here?
Edit: my program is a combination of TensorFlow and TensorRT code. The error only occurs when I run

    self.graph = tf.get_default_graph()
    self.persistent_sess = tf.Session(graph=self.graph, config=tf_config)

before running infer(). If I do not execute those two lines, there is no problem.
Answer 0 (score: 0)
The problem here is that I have two Python files, say tensorrtcode.py and tensorflowcode.py.

tensorrtcode.py has only tensorrt code:
def infer(engine, x, batch_size, context):
    # ... identical to the infer() shown in the question above ...
def main():
    .....
    infer(......)
    .....
Then tensorflowcode.py has only tensorflow API calls and executes them with a session:
    self.graph = tf.get_default_graph()
    self.persistent_sess = tf.Session(graph=self.graph, config=tf_config)
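For context, those two lines live inside a class in tensorflowcode.py, roughly like this (a sketch; the real class does more):

class tensorflowclass:
    def __init__(self, tf_config=None):
        # Creating the persistent session here is what triggers the error in infer().
        self.graph = tf.get_default_graph()
        self.persistent_sess = tf.Session(graph=self.graph, config=tf_config)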
The problem arises when I need to interface the tensorflow class with the tensorrt code: in tensorrt's main I declare an instance of the tensorflow class, like this:
def main():
    .....
    t_flow_code = tensorflowclass()
    infer(......)
    .....
Then I hit the same error: an illegal memory access was encountered at stream.synchronize(). It is solved by adding another session in the tensorrt code just before t_flow_code = tensorflowclass().
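That is, the working main looks roughly like this (a sketch; extra_sess is a name used here only for illustration):

def main():
    .....
    extra_sess = tf.Session()   # the extra session; creating it here avoids the error
    t_flow_code = tensorflowclass()
    infer(......)
    .....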
I do not understand why this is necessary, since I already have my own session executing inside the tensorflow class. Why do I need another session before interfacing with the class in the tensorrt code?
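One plausible explanation (not confirmed here): TensorFlow and PyCUDA each set up their own CUDA context, and device memory allocated under one context is not valid under another, which is exactly the kind of mismatch that surfaces as an illegal memory access at a synchronize call. The extra session presumably just forces TensorFlow to initialize its context early enough that the TensorRT buffers end up allocated under a context that is still current when infer() runs. A more explicit alternative is to manage the PyCUDA context by hand instead of relying on import-time pycuda.autoinit; a minimal sketch, assuming a single GPU:

import pycuda.driver as cuda

cuda.init()
dev = cuda.Device(0)
ctx = dev.make_context()   # push a dedicated CUDA context for the TensorRT work
try:
    # Build/load the engine and call infer() here, so every PyCUDA
    # allocation and stream lives in this one context.
    pass
finally:
    ctx.pop()              # restore whichever context was current before

With an explicit push/pop like this, the TensorFlow session can own its own context without invalidating the buffers that infer() allocates.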