I am new to Python programming. Recently I tried to write a small "tool" program that implements a "dynamic programming" algorithm. However, the last part of my program, a while loop, fails to loop. The code looks like this:

    import numpy as np
    beta, rho, B, M = 0.5, 0.9, 10, 5
    S = range(B + M + 1)  # State space = 0,...,B + M
    Z = range(B + 1)      # Shock space = 0,...,B

    def U(c):
        "Utility function."
        return c**beta

    def phi(z):
        "Probability mass function, uniform distribution."
        return 1.0 / len(Z) if 0 <= z <= B else 0

    def Gamma(x):
        "The correspondence of feasible actions."
        return range(min(x, M) + 1)

    def T(v):
        """An implementation of the Bellman operator.
        Parameters: v is a sequence representing a function on S.
        Returns: Tv, a list."""
        Tv = []
        for x in S:
            # Compute the value of the objective function for each
            # a in Gamma(x), and store the result in vals (n*m matrix)
            vals = []
            for a in Gamma(x):
                y = U(x - a) + rho * sum(v[a + z]*phi(z) for z in Z)
                # the place v comes into play, v is array for each state
                vals.append(y)
            # Store the maximum reward for this x in the list Tv
            Tv.append(max(vals))
        return Tv

    # create initial value
    def v_init():
        v = []
        for i in S:
            val = []
            for j in Gamma(i):
                # deterministic
                y = U(i-j)
                val.append(y)
            v.append(max(val))
        return v

    # Create an instance of value function
    v = v_init()

    # parameters
    max_iter = 10000
    tol = 0.0001
    num_iter = 0
    diff = 1.0
    N = len(S)

    # value iteration
    value = np.empty([max_iter,N])
    while (diff>=tol and num_iter<max_iter ):
        v = T(v)
        value[num_iter] = v
        diff = np.abs(value[-1] - value[-2]).max()
        num_iter = num_iter + 1
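
As you can see, the while loop at the bottom is used to iterate the "value function" and find the right answer. However, the while loop fails to loop and just returns num_iter=1. As far as I know, a while loop "repeats a sequence of statements until some condition becomes false". Clearly, the loop should keep running until diff converges to near 0, so it should not stop after a single pass.

The main part of the code works fine as long as I use a for loop instead.
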
Answer 0 (score: 0)

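You define value as np.empty(...). np.empty does not initialize the array: its contents are whatever happens to be in that memory, which in practice is often all zeros. Note also that value[-1] and value[-2] are the last two rows of the whole array, not the two rows written most recently. On the first pass those rows are still untouched, so the difference between them will typically be zero. 0 is not >= 0.0001, so the expression diff>=tol is False. Therefore, your loop exits after one iteration.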
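
One possible way to repair the loop is to compare each new iterate with the previous one directly, instead of indexing into a preallocated array with negative indices. The sketch below is only an illustration under some assumptions: the Bellman operator is a condensed rewrite of T from the question, the helper name iterate and its return values are made up here, and the starting guess U(x) is assumed rather than taken from v_init().

```python
import numpy as np

beta, rho, B, M = 0.5, 0.9, 10, 5
S = range(B + M + 1)  # state space, as in the question
Z = range(B + 1)      # shock space, as in the question

def T(v):
    """Bellman operator from the question, condensed into one expression."""
    p = 1.0 / len(Z)  # uniform shock probability
    return [max((x - a) ** beta + rho * sum(v[a + z] * p for z in Z)
                for a in range(min(x, M) + 1))
            for x in S]

def iterate(v0, tol=1e-4, max_iter=10000):
    """Hypothetical helper: apply T until successive iterates differ by < tol."""
    v = np.asarray(v0, dtype=float)
    history = [v]  # grows as needed, so no uninitialized rows exist
    for num_iter in range(1, max_iter + 1):
        v_new = np.asarray(T(v))
        # Compare the new iterate with the previous one, not with value[-1]
        diff = np.abs(v_new - v).max()
        history.append(v_new)
        v = v_new
        if diff < tol:
            break
    return v, num_iter, diff, np.array(history)

# Assumed starting guess: U(x) for every state x
v0 = [x ** beta for x in S]
v_star, num_iter, diff, history = iterate(v0)
print(num_iter, diff)
```

If you would rather keep the preallocated value array, the same idea applies: compare value[num_iter] with value[num_iter - 1], and skip the comparison on the first pass, when there is no previous row yet.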