Tags: tensorflow keras reinforcement-learning tensorflow2.0 q-learning
https://stackoverflow.com/a/52340133/11204016
In the answer linked above, I am stuck on step 7.
How do I multiply dQ/dA by dA/dTheta?
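
To make the question concrete: as I understand it, the product is a chain-rule contraction over the action dimension. Each dQ/da_i is multiplied by the matching slice of the Jacobian dA/dTheta, and the results are summed over i. Below is a minimal sketch in plain Python, assuming a toy linear actor and a quadratic critic. All names and values are hypothetical and not taken from the linked answer.

```python
# Toy setup (illustrative assumptions only):
#   actor:  a = W s, so theta is the 2x3 matrix W
#   critic: Q(a) = -sum_i (a_i - g_i)^2 for a fixed target g

def actor(W, s):
    # Linear policy: one action component per row of W.
    return [sum(w * x for w, x in zip(row, s)) for row in W]

def critic(a, g):
    return -sum((ai - gi) ** 2 for ai, gi in zip(a, g))

def dq_da(a, g):
    # Gradient of Q with respect to the action, one entry per action dimension.
    return [-2.0 * (ai - gi) for ai, gi in zip(a, g)]

def da_dtheta(W, s):
    # Jacobian J[i][j][k] = d a_i / d W[j][k].
    # For a linear actor this is s[k] when i == j, and 0 otherwise.
    n_a, n_s = len(W), len(s)
    return [[[s[k] if i == j else 0.0 for k in range(n_s)]
             for j in range(n_a)] for i in range(n_a)]

def chain_rule(dq, J):
    # dQ/dW[j][k] = sum_i (dQ/da_i) * (d a_i / d W[j][k])
    n_a, n_rows, n_cols = len(dq), len(J[0]), len(J[0][0])
    return [[sum(dq[i] * J[i][j][k] for i in range(n_a))
             for k in range(n_cols)] for j in range(n_rows)]

W = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]]
s = [1.0, 2.0, -1.0]
g = [0.5, 0.5]

a = actor(W, s)
grad_W = chain_rule(dq_da(a, g), da_dtheta(W, s))

# One gradient-ascent step on Q with respect to the actor parameters.
lr = 0.01
W_new = [[W[j][k] + lr * grad_W[j][k] for k in range(len(s))]
         for j in range(len(W))]
```

In TensorFlow 2 the Jacobian would not normally be built explicitly. `tf.GradientTape.gradient` takes an `output_gradients` argument, so passing dQ/dA there while differentiating the actor output with respect to its trainable variables should perform the same contraction. Whether that matches what step 7 intends is my assumption.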