I followed the instructions at Cloud TPU Tools. Everything seems to work as expected, except for step 4, where I had to change --tpu_name to --tpu.
What fails is the "Profile" tab. I ran
capture_tpu_profile --tpu_name=$TPU_NAME --logdir=${model_dir}
which produced
Welcome to the Cloud TPU Profiler v1.6.0
Starting to profile TPU traces for 2000 ms. Remaining attempt(s): 3
Limiting the number of trace events to 1000000
Profile session succeed for host(s):10.240.1.2
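I refreshed and restarted TensorBoard several times, but there is no "Profile" tab, and selecting "Profile" from the drop-down menu reports that no data was generated.
Is this a known issue with the Cloud TPU Profiler?
- Edit 1 -
Profiler v1.5.2 fails while collecting trace events.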
Welcome to the Cloud TPU Profiler v1.5.2
Starting to profile TPU traces for 2000 ms. Remaining attempt(s): 3
Limiting the number of trace events to 1000000
No trace event is collected. Automatically retrying.
Starting to profile TPU traces for 2000 ms. Remaining attempt(s): 2
Limiting the number of trace events to 1000000
No trace event is collected. Automatically retrying.
Starting to profile TPU traces for 2000 ms. Remaining attempt(s): 1
Limiting the number of trace events to 1000000
No trace event is collected after 3 attempt(s). Perhaps, you want to try again (with more attempts?).
Tip: increase number of attempts with --num_tracing_attempts.
Answer 0 (score: 1)
Could you try again with Cloud TPU Profiler 1.5.2?
pip install cloud-tpu-profiler==1.5.2
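Cloud TPU Profiler 1.6.0 and its worker-list feature are supported only on the current master branch of TensorFlow. It is still backward compatible with tf-1.8 when invoked with the following command:
capture_tpu_profile --service_addr=10.240.1.2 --logdir=${model_dir}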
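The 1.5.2 run in the question's edit ended with the profiler's own tip to raise --num_tracing_attempts. A minimal sketch of combining that flag with the command above follows. The helper name profile_cmd, the default of 10 attempts, and the placeholder log directory are illustrative assumptions, not part of the tool. The flag names are taken from this thread, and it is an assumption that 1.5.2 accepts them together. The helper only prints the command so it can be reviewed before running.

```shell
# Hypothetical helper: assembles a capture_tpu_profile command line.
# --service_addr and --logdir come from the answer above;
# --num_tracing_attempts comes from the tip in the profiler's own log.
profile_cmd() {
  local service_addr="$1"   # TPU host address, e.g. 10.240.1.2
  local logdir="$2"         # same model_dir that TensorBoard reads
  local attempts="${3:-10}" # assumed default; the log shows the tool starting at 3
  printf 'capture_tpu_profile --service_addr=%s --logdir=%s --num_tracing_attempts=%s\n' \
    "$service_addr" "$logdir" "$attempts"
}

# Print the command for review. /path/to/model_dir is a placeholder
# used only when model_dir is not set in the environment.
profile_cmd 10.240.1.2 "${model_dir:-/path/to/model_dir}"
```

To run the printed command, paste it into the shell. Trace events are most likely to be collected while the training job is actively using the TPU, so it may help to start the capture mid-training.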