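My model runs very slowly, so I tried the TF profiler. Given the output below, I'm having trouble working out where the Sub and AssignSub ops are created through the Python API, and why they take up so much CPU time. Any help is much appreciated. Thanks in advance.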
profiler.advise() outputs the following:
ExpensiveOperationChecker:
top 1 operation type: Sub, cpu: 949.78ms, accelerator: 41.98ms, total: 994.15ms (17.51%)
top 2 operation type: AssignSub, cpu: 578.64ms, accelerator: 26.61ms, total: 606.63ms (10.69%)
top 3 operation type: Conv2D, cpu: 173.67ms, accelerator: 214.06ms, total: 388.10ms (6.84%)
top 1 graph node: gradients, cpu: 0us, accelerator: 0us, total: 0us
top 2 graph node: densenet121_6, cpu: 0us, accelerator: 0us, total: 0us
top 3 graph node: densenet121_4, cpu: 0us, accelerator: 0us, total: 0us
model_local_global.py:939:<module>, cpu: 3.93sec, accelerator: 1.06sec, total: 5.00sec
model_local_global.py:939:<module> (gradient), cpu: 149.26ms, accelerator: 298.11ms, total: 447.78ms
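
For context on what "locating" these ops could look like: the sketch below groups graph ops by type so that the names of the Sub and AssignSub nodes can be inspected. Op names normally carry the name scope they were created under, which may hint at the layer or optimizer responsible. It assumes a TF 1.x graph-mode setup in which tf.get_default_graph().get_operations() returns objects with .type and .name attributes. The helper group_ops_by_type and the stand-in op names are hypothetical and only there so the snippet runs without TensorFlow.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_ops_by_type(operations, wanted_types=("Sub", "AssignSub")):
    """Map each wanted op type to the names of the matching graph ops."""
    found = defaultdict(list)
    for op in operations:
        if op.type in wanted_types:
            found[op.type].append(op.name)
    return dict(found)


# Stand-in objects so the sketch runs without TensorFlow; the names are made up.
# With a real TF 1.x graph, pass tf.get_default_graph().get_operations() instead.
fake_ops = [
    SimpleNamespace(type="Sub", name="scope_a/sub"),
    SimpleNamespace(type="Conv2D", name="scope_a/conv"),
    SimpleNamespace(type="AssignSub", name="scope_b/AssignSub"),
]

for op_type, names in group_ops_by_type(fake_ops).items():
    print(op_type, names)
```

On a real graph, the scope prefixes in the printed names would be the thing to look at, since they show which part of the model-building code produced each node.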