Has anyone managed to send CUDA arrays over MPI with a recent mpi4py (and pyCUDA 2015.1.3)? To send an array, the corresponding data type has to be converted into a contiguous buffer. This conversion is done with the following lambda:
to_buffer = lambda arr: None if arr is None else lambda arr: arr.gpudata.as_buffer(arr.nbytes)

The complete script looks like this, but unfortunately all this beauty crashes with errors:
import numpy
from mpi4py import MPI
import pycuda.gpuarray as gpuarray
import pycuda.driver as cuda
import pycuda.autoinit
to_buffer = lambda arr: None if arr is None else lambda arr: arr.gpudata.as_buffer(arr.nbytes)
print "pyCUDA version " + str(pycuda.VERSION)
a_gpu = gpuarray.to_gpu(numpy.random.randn(4,4).astype(numpy.float32))
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
comm.Bcast([to_buffer(a_gpu), MPI.FLOAT], root=0)
Any idea what is going on here? Or does someone have an alternative buffer-conversion incantation?
Thanks in advance!
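One thing worth noting, independent of MPI: as written, the lambda contains a second `lambda arr:`, so `to_buffer` returns another function rather than a buffer, and `Bcast` ends up receiving a callable. A minimal pure-Python reproduction (using a hypothetical stand-in class instead of a real pyCUDA `GPUArray`, so no GPU is needed):

```python
# Hypothetical stand-in for a pycuda GPUArray (not part of pycuda),
# just enough to exercise the lambda's control flow
class FakeGPUArray:
    nbytes = 64

# Same structure as the question's lambda: note the second "lambda arr:"
to_buffer = lambda arr: None if arr is None else lambda arr: arr.gpudata.as_buffer(arr.nbytes)

result = to_buffer(FakeGPUArray())
print(callable(result))  # True: to_buffer returns another lambda, not a buffer
print(to_buffer(None))   # None, as intended
```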
Answer 0 (score: 1)
All that is required is to call the MPI broadcast with a valid host-memory buffer object or a numpy array, for example:
comm.Bcast( a_gpu.get(), root=0)
instead of using a lambda to turn the DeviceAllocation object into a buffer object. `a_gpu.get()` copies the device array back into an ordinary host numpy array, which mpi4py can broadcast directly.
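The reason `a_gpu.get()` works is that mpi4py's uppercase `Bcast` expects an object exposing the buffer protocol over host memory, and `get()` returns a plain contiguous numpy array. That property can be checked with numpy alone (a sketch without MPI or a GPU; the array mirrors the one in the question):

```python
import numpy

# Same shape/dtype as the array in the question
a_host = numpy.random.randn(4, 4).astype(numpy.float32)

# numpy arrays expose the buffer protocol over host memory,
# which is what mpi4py's Bcast needs
mv = memoryview(a_host)
print(mv.nbytes == a_host.nbytes)    # True
print(a_host.flags["C_CONTIGUOUS"])  # True
```

A buffer obtained from `DeviceAllocation.as_buffer`, by contrast, wraps a GPU pointer, which a plain (non-CUDA-aware) MPI build cannot dereference from the host side; that is presumably what triggers the crash.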