Implementing DDPG in TensorFlow 2.0

Time: 2019-11-14 07:56:04

Tags: tensorflow keras reinforcement-learning tensorflow2.0 q-learning

https://stackoverflow.com/a/52340133/11204016

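I am following the answer linked above and am stuck on step 7.

How do I multiply dQ / dA by dA / dTheta?

dA / dTheta has the dimensions of the network weights (the Thetas of the Q network).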
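
For reference, one way this product is commonly expressed in TensorFlow 2.x is a rough sketch like the one below. In the usual DDPG formulation the Thetas in dA / dTheta are the actor (policy) weights, and the "multiplication" is a chain-rule product: each dA / dTheta term is weighted by the matching dQ / dA entry and summed, so the result has the same shape as each weight tensor. `tf.GradientTape.gradient` accepts an `output_gradients` argument that does this weighting. The network sizes, the `actor` and `critic` models, and the random `states` batch are all hypothetical placeholders, not taken from the linked answer.

```python
import tensorflow as tf

tf.random.set_seed(0)

# Hypothetical sizes, chosen only for illustration.
state_dim, action_dim, batch = 3, 2, 4

# Hypothetical actor: state -> action.
actor = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(action_dim, activation="tanh"),
])

# Hypothetical critic: concatenated (state, action) -> Q value.
critic = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])

states = tf.random.normal((batch, state_dim))

# Option 1: explicit chain rule, dQ/dA first, then dA/dTheta weighted by it.
with tf.GradientTape(persistent=True) as tape:
    actions = actor(states)
    q_values = critic(tf.concat([states, actions], axis=1))

# dQ/dA, shape (batch, action_dim).
dq_da = tape.gradient(q_values, actions)

# output_gradients=dq_da weights dA/dTheta by dQ/dA and sums over the batch
# and action dimensions, giving one gradient per actor weight tensor.
chain_grads = tape.gradient(
    actions, actor.trainable_variables, output_gradients=dq_da
)
del tape  # release the persistent tape

# Option 2: let autodiff apply the chain rule end to end.
with tf.GradientTape() as tape2:
    q_sum = tf.reduce_sum(critic(tf.concat([states, actor(states)], axis=1)))
direct_grads = tape2.gradient(q_sum, actor.trainable_variables)

for g, d, v in zip(chain_grads, direct_grads, actor.trainable_variables):
    print(tuple(v.shape), tuple(g.shape), float(tf.reduce_max(tf.abs(g - d))))
```

Both options should give the same gradients up to floating-point error. Since optimizers minimize, the gradients would typically be negated (and divided by the batch size) before `optimizer.apply_gradients(zip(grads, actor.trainable_variables))`, because DDPG ascends on Q.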

0 answers:

No answers yet