
I am trying to build an Inception V3 model using synchronous distributed training. I am using the code from here: https://github.com/tensorflow/models/tree/master/inception

I am finding that the training speed does not improve proportionally as I use more GPUs for training.

# of GPUs    Images processed/sec
1            18.88
2            33.92
4            66.48
8            85.44
16           114.8
32           116.224
64           167.104
128          319.616
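
To make the scaling easier to judge, here is a minimal sketch that turns the numbers above into speedup and parallel efficiency relative to the single-GPU run. It is not part of the linked Inception code; `measurements` and `scaling_report` are names made up for this illustration, and the only data used is the table above.

```python
# Throughput figures copied from the table above: (number of GPUs, images/sec).
measurements = [
    (1, 18.88), (2, 33.92), (4, 66.48), (8, 85.44),
    (16, 114.8), (32, 116.224), (64, 167.104), (128, 319.616),
]


def scaling_report(rows):
    """Return (gpus, speedup, efficiency) tuples relative to the first row.

    speedup    = throughput / per-GPU throughput of the baseline row
    efficiency = speedup / number of GPUs (1.0 would be perfect linear scaling)
    """
    base_gpus, base_rate = rows[0]
    per_gpu_base = base_rate / base_gpus
    report = []
    for gpus, rate in rows:
        speedup = rate / per_gpu_base
        efficiency = speedup / gpus
        report.append((gpus, speedup, efficiency))
    return report


if __name__ == "__main__":
    for gpus, speedup, eff in scaling_report(measurements):
        print(f"{gpus:>4} GPUs  speedup {speedup:6.2f}x  efficiency {eff:6.1%}")
```

By this measure, 2 GPUs reach roughly 90% efficiency, while 128 GPUs give only about a 17x speedup, i.e. roughly 13% efficiency.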

I am using EC2 P2 instances (p2.16xlarge). More information about P2 instances is available here: https://aws.amazon.com/ec2/instance-types/p2/

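Is this the expected speedup, or is there anything I can do beyond the example code to improve the speed of this synchronous distributed training?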

Synchronous distributed training is slow

0 answers: