Question

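I have a neural network written in PyTorch that outputs a Tensor a on the GPU. I would like to continue processing a with a highly efficient TensorFlow layer.

As far as I know, the only way to do this is to move a from GPU memory to CPU memory, convert it to numpy, and then feed that into TensorFlow. A simplified example: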

import torch
import tensorflow as tf

# output of some neural network written in PyTorch
a = torch.ones((10, 10), dtype=torch.float32).cuda()

# move to CPU / pinned memory
c = a.to('cpu', non_blocking=True)

# setup TensorFlow stuff (only needs to happen once)
sess = tf.Session()
c_ph = tf.placeholder(tf.float32, shape=c.shape)
c_mean = tf.reduce_mean(c_ph)

# run TensorFlow
print(sess.run(c_mean, feed_dict={c_ph: c.numpy()}))

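This may be a bit far-fetched, but is there a way to make it so that either

1. a never leaves GPU memory, or
2. a goes from GPU memory to pinned memory and back to GPU memory?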

I tried option 2 in the snippet above by using non_blocking=True, but I am not sure whether it does what I expect (i.e., moves a into pinned memory).
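One way to remove the uncertainty is to allocate the pinned destination buffer explicitly and copy into it, then check the result with is_pinned(). The following is a minimal sketch, not part of the original question: the helper name to_pinned_cpu is made up for illustration, and the code falls back to an ordinary CPU copy when no CUDA device is present. Whether a.to('cpu', non_blocking=True) by itself yields pinned memory may depend on the PyTorch version, so calling c.is_pinned() on its result is the safest check.

```python
import torch


def to_pinned_cpu(t):
    """Hypothetical helper: copy t into page-locked host memory if CUDA is available."""
    if not torch.cuda.is_available():
        # Pinned memory needs a CUDA runtime; fall back to a normal CPU tensor.
        return t.cpu()
    # Allocate the destination in pinned (page-locked) host memory.
    buf = torch.empty(t.shape, dtype=t.dtype, device="cpu", pin_memory=True)
    # The copy may be asynchronous with respect to the host ...
    buf.copy_(t, non_blocking=True)
    # ... so wait for it to finish before the host reads the buffer.
    torch.cuda.synchronize()
    return buf


device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.ones((10, 10), dtype=torch.float32, device=device)
c = to_pinned_cpu(a)
print(c.device, c.is_pinned() if torch.cuda.is_available() else "pinning not available")
```

The explicit synchronize matters: reading c (for example through c.numpy()) before an asynchronous device-to-host copy has completed can observe incomplete data.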

Ideally, my TensorFlow graph would operate directly on the memory occupied by the PyTorch tensor, but I suppose that is not possible?

Answer 1

I am not familiar with TensorFlow, but you can use PyTorch to expose the "internals" of a tensor.
You can access the tensor's underlying storage:

a.storage()

Once you have the storage, you can get a pointer to the memory (on either the CPU or the GPU):

a.storage().data_ptr()

You can check whether it is pinned:

a.storage().is_pinned()

And you can pin it:

a.storage().pin_memory()
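Put together, the calls above can be sketched as follows. This version uses the tensor-level methods data_ptr(), is_pinned() and pin_memory() rather than going through storage(), since recent PyTorch releases emit a deprecation warning for storage(). Note that tensor.data_ptr() points at the tensor's first element, which differs from the start of the storage when the tensor is an offset view. The pinning part only runs when a CUDA device is available, which is an assumption of this sketch.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.ones((10, 10), dtype=torch.float32, device=device)

# Integer address of the tensor's first element (a device address for CUDA tensors).
ptr = a.data_ptr()
print(hex(ptr), a.device)

if torch.cuda.is_available():
    host = a.cpu()              # ordinary host copy
    pinned = host.pin_memory()  # returns a pinned copy; host itself is not modified
    print(host.is_pinned(), pinned.is_pinned())
```

Because pin_memory() returns a new tensor (or storage) instead of pinning in place, the result has to be kept and used; calling it and discarding the return value has no lasting effect.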

I am not familiar with interfaces between PyTorch and TensorFlow, but I came across an example in the FAISS package that accesses PyTorch tensors on the GPU directly.
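A different technique from the raw-pointer approach is DLPack, a tensor exchange format that both frameworks support. The sketch below is an illustration only and is not taken from the FAISS example: it exports a tensor with torch.utils.dlpack.to_dlpack and, to stay runnable without TensorFlow installed, re-imports it with PyTorch's own from_dlpack as a stand-in consumer. On the TensorFlow 2.x side the corresponding entry point is tf.experimental.dlpack.from_dlpack; whether it fits a given setup (TensorFlow version, same GPU, stream synchronization) is an assumption that would need to be verified.

```python
import torch
from torch.utils.dlpack import from_dlpack, to_dlpack

device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.ones((10, 10), dtype=torch.float32, device=device)

# Export: the capsule describes a's existing buffer; no data is copied.
capsule = to_dlpack(a)

# Import: stand-in for the consuming framework. A capsule can be consumed only once.
b = from_dlpack(capsule)

# Writing through the imported view is visible in the original tensor.
b[0, 0] = 5.0
shares_memory = a.data_ptr() == b.data_ptr()
print(shares_memory, a[0, 0].item())
```

Since both tensors alias the same memory, the producer must keep the data valid and finish any pending GPU work before the consumer reads it.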

Accessing a PyTorch GPU matrix directly from TensorFlow
