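I want to use CUDA streams in PyTorch to run some computations in parallel, but I don't know how. For example, given two tasks A and B that should run in parallel, I would like to do something like this: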
stream0 = torch.get_stream()
stream1 = torch.get_stream()
with torch.now_stream(stream0):
    # task A
with torch.now_stream(stream1):
    # task B
torch.synchronize()
# get A and B's answer
How can I achieve this in real Python code?
Answer 0 (score: 2)
import torch

s1 = torch.cuda.Stream()
s2 = torch.cuda.Stream()
# Initialise CUDA tensors here, e.g.:
A = torch.rand(1000, 1000, device='cuda')
B = torch.rand(1000, 1000, device='cuda')
# Wait for the above tensors to initialise.
torch.cuda.synchronize()
with torch.cuda.stream(s1):
    C = torch.mm(A, A)
with torch.cuda.stream(s2):
    D = torch.mm(B, B)
# Wait for C and D to be computed.
torch.cuda.synchronize()
# Do stuff with C and D.
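If a device-wide torch.cuda.synchronize() is heavier than needed, the same pattern can be sketched with stream-to-stream dependencies instead. This is only a minimal sketch, not a definitive implementation: the helper name parallel_matmuls and the 256x256 sizes are made up for illustration, and the CPU branch is an assumption added so the snippet still runs on a machine without a GPU. Whether the two kernels really overlap depends on the GPU having spare resources, so small workloads may show no speedup.

```python
import torch


def parallel_matmuls(a, b):
    """Compute a @ a and b @ b, using two CUDA streams when the inputs are on a GPU.

    Hypothetical helper for illustration only.
    """
    if not a.is_cuda:
        # Assumed CPU fallback: no streams available, so run sequentially.
        return torch.mm(a, a), torch.mm(b, b)

    current = torch.cuda.current_stream(a.device)
    s1 = torch.cuda.Stream(device=a.device)
    s2 = torch.cuda.Stream(device=a.device)

    # The side streams must not start before the work that produced a and b
    # (queued on the current stream) has finished.
    s1.wait_stream(current)
    s2.wait_stream(current)

    with torch.cuda.stream(s1):
        c = torch.mm(a, a)
    with torch.cuda.stream(s2):
        d = torch.mm(b, b)

    # Later work on the current stream waits for both side streams,
    # without blocking the host or unrelated streams.
    current.wait_stream(s1)
    current.wait_stream(s2)

    # c and d were allocated on side streams but will be used on the current
    # stream, so tell the caching allocator about that use.
    c.record_stream(current)
    d.record_stream(current)
    return c, d


device = "cuda" if torch.cuda.is_available() else "cpu"
A = torch.rand(256, 256, device=device)
B = torch.rand(256, 256, device=device)
C, D = parallel_matmuls(A, B)
```

Compared with the answer above, only the streams involved wait on each other here; a final torch.cuda.synchronize() is still a reasonable choice when the host needs the results immediately, for example before timing the code.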