Question

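I am curious about model parallelism, and I have read the code from Yaroslav Bulatov. In that example, we have to manually split the model (or Graph, as TensorFlow calls it) into different partitions (left_network & right_network). So I wonder: if I have to create the partitions manually, what do simple_placer.cc and graph_partition.cc do to the whole graph? This is still unclear to me.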
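In my understanding (let me know if anything is wrong): if the graph has 8 partitions (subgraphs), which can be seen as 8 jobs, and there are 4 workers, how are the partitions assigned to the workers? Is it done through:
- tf.device(), or
- distributed training with tf.train.replica_device_setter(), which shares variables across parameter servers and otherwise puts all ops on the worker device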
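
To make the second option concrete, here is a minimal pure-Python sketch of the policy described above: variable ops go to parameter-server tasks and everything else goes to the worker device. It is a simplified model, not TensorFlow's implementation. The round-robin order over ps tasks, the op-type tuple, and the names `make_device_chooser` and `PS_OP_TYPES` are assumptions made for illustration.

```python
from itertools import cycle

# Op types treated as "variables" (an illustrative assumption, not an exhaustive list).
PS_OP_TYPES = ("Variable", "VariableV2", "VarHandleOp")


def make_device_chooser(num_ps_tasks, worker_device="/job:worker/task:0"):
    """Return a function that maps an op type to a device string."""
    if num_ps_tasks == 0:
        # No parameter servers: every op stays on the worker.
        return lambda op_type: worker_device

    # Assumed strategy: hand out ps tasks in round-robin order.
    ps_devices = cycle(["/job:ps/task:%d" % i for i in range(num_ps_tasks)])

    def choose(op_type):
        if op_type in PS_OP_TYPES:
            return next(ps_devices)
        return worker_device

    return choose


# The ops from a small y = matmul(x, W) + b graph, as (name, op type) pairs.
ops = [
    ("W", "VariableV2"),
    ("b", "VariableV2"),
    ("x", "Placeholder"),
    ("MatMul", "MatMul"),
    ("add", "Add"),
]

choose = make_device_chooser(num_ps_tasks=2)
placement = [(name, choose(op_type)) for name, op_type in ops]

for name, device in placement:
    print(name, "->", device)
```

Under this model the matmul itself is not split up: it is a single node assigned to one device, and only its inputs W and b live elsewhere.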

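But how does the graph get partitioned? I would like to trace what the subgraphs (sets of op nodes) look like. Can I dump the result, or which source file do I need to trace or modify?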

Please let me know if any concept here is wrong or vague. I am new to this, and any comments are appreciated.

In the code below, matmul is an op node. Will it be partitioned into different jobs?

y_ = tf.placeholder(tf.float32, [None, 10])
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W)  + b

Answer 1

You can obtain the result of the placement algorithm by passing additional options when you call tf.Session.run():

# ...
y = tf.matmul(x, W) + b

sess = tf.Session()
options = tf.RunOptions(output_partition_graphs=True)
metadata = tf.RunMetadata()

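# W and b must be initialized, and the placeholder `x` must be fed, before `y` can run.
sess.run(tf.global_variables_initializer())
sess.run(y, feed_dict={x: [[0.0] * 784]}, options=options, run_metadata=metadata)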

# `metadata` now contains information about what happened during the `run()` call.
for partition in metadata.partition_graphs:

  # `partition` is a `tf.GraphDef` representing all the nodes that ran on a single
  # device. All nodes in `partition` have the same `device` value.
  device = partition.node[0].device

  for node in partition.node:
    # e.g. print each node or store it in a dictionary for further analysis.
    print(device, node.name, node.op)
    # ...
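
As one possible way to do the "further analysis" step, the sketch below groups node names by device and separates out the transfer nodes (`_Send`, `_Recv`, `_HostSend`, `_HostRecv`) that mark where a tensor crosses a partition boundary. The helper name `summarize_partitions` and the stand-in data are hypothetical. The stand-ins only imitate the `node`, `name`, `op` and `device` fields of the real `GraphDef` objects so that the example runs without a session.

```python
from collections import defaultdict
from types import SimpleNamespace

# Op types that appear at partition boundaries to move tensors between devices.
TRANSFER_OPS = {"_Send", "_Recv", "_HostSend", "_HostRecv"}


def summarize_partitions(partition_graphs):
    """Map each device to its regular node names and its transfer node names."""
    summary = defaultdict(lambda: {"nodes": [], "transfers": []})
    for partition in partition_graphs:
        for node in partition.node:
            key = "transfers" if node.op in TRANSFER_OPS else "nodes"
            summary[node.device][key].append(node.name)
    return dict(summary)


def _node(name, op, device):
    # Hypothetical stand-in for a NodeDef; only the fields used above are present.
    return SimpleNamespace(name=name, op=op, device=device)


# Hypothetical stand-in for `metadata.partition_graphs`: one entry per device.
fake_partitions = [
    SimpleNamespace(node=[
        _node("W", "VariableV2", "/job:ps/task:0"),
        _node("W/_0", "_Send", "/job:ps/task:0"),
    ]),
    SimpleNamespace(node=[
        _node("W/_1", "_Recv", "/job:worker/task:0"),
        _node("MatMul", "MatMul", "/job:worker/task:0"),
    ]),
]

summary = summarize_partitions(fake_partitions)
for device, groups in summary.items():
    print(device, groups)
```

With a real session, the same helper could be called as `summarize_partitions(metadata.partition_graphs)` after the `sess.run()` call above.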

TensorFlow: how to dump the result of the placement algorithm
