I am trying to build a very simple NN model for a classification task, and I am building it in a customized way. I want to use tf.data.Dataset to load my dataset and then train the model with mini-batches. At the same time, I want to print the model's results on a validation dataset, so I try to reuse variables. My model is as follows:
def get_loss(prediction, label):
    return tf.losses.softmax_cross_entropy(tf.expand_dims(label, -1), prediction)

def make_train_op(optimizer, loss):
    apply_gradient_op = optimizer.minimize(loss)
    return apply_gradient_op

class Model:
    def __init__(self):
        self.model = tf.keras.Sequential([
            tf.keras.layers.Dense(32, input_shape=(3,), activation=tf.keras.activations.relu),
            tf.keras.layers.Dense(128, input_shape=(64,), activation=tf.keras.activations.relu),
            tf.keras.layers.Dense(1, input_shape=(128,), activation=tf.keras.activations.softmax)
        ])

    def __call__(self, inp, is_train=True):
        return self.model(inp.feature), inp.label
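The training code below also relies on an InputPipe class that is not shown in the question. As a rough idea only, here is a minimal sketch of what such a class might look like on top of tf.data.Dataset; the attribute names feature/label, the init_sess method, and the is_train flag are taken from how it is used below, while everything else (including features_array and labels_array) is an assumption:

# Hypothetical sketch of InputPipe, not the asker's actual code.
class InputPipe:
    def __init__(self, is_train=True):
        # features_array / labels_array stand in for whatever the real data source is.
        dataset = tf.data.Dataset.from_tensor_slices((features_array, labels_array))
        if is_train:
            dataset = dataset.shuffle(buffer_size=1000).repeat(n_repeats)
        dataset = dataset.batch(batch_size)
        self.iterator = dataset.make_initializable_iterator()
        self.feature, self.label = self.iterator.get_next()

    def init_sess(self, sess):
        # The iterator must be initialized inside the session before the first sess.run.
        sess.run(self.iterator.initializer)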
Then I try to train the model as follows:
model = Model()
optimizer = tf.train.AdamOptimizer()
init = tf.global_variables_initializer()
global_step = tf.train.get_or_create_global_step()

with tf.variable_scope('input', reuse=True):
    training_inp = InputPipe()
    validate_inp = InputPipe(is_train=False)
    scope = tf.get_variable_scope()
    training_prediction, true_train_y = model(training_inp)
    scope.reuse_variables()

total_instances = data_size * n_repeats
steps_per_epoch = data_size // batch_size if data_size % batch_size == 0 else data_size // batch_size + 1

with tf.Session() as sess:
    sess.run(init)
    training_inp.init_sess(sess)
    list_grads = []
    for epoch in range(n_repeats):
        tqr = range(steps_per_epoch)
        for _ in tqr:
            loss = get_loss(training_prediction, true_train_y)
            sess.run(make_train_op(optimizer, loss))
However, optimizer.minimize(loss) raises an exception:

ValueError: Variable dense/kernel/Adam/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=tf.AUTO_REUSE in VarScope?
Update:
When I call get_loss and make_train_op outside the loop, it raises another error, a FailedPreconditionError, even though I have already initialized all variables:

FailedPreconditionError (see above for traceback): Error while reading resource variable beta2_power from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/beta2_power/class tensorflow::Var does not exist. [[node Adam/update_dense_2/kernel/ResourceApplyAdam/ReadVariableOp_1 (defined at D:/00程序/python_ai/model/traffic_prediction_1/trainer_test_1.py:16)]]

Line 16 is:

apply_gradient_op = optimizer.minimize(loss)
Answer 0 (score: 1)
I think the problem is that you are calling get_loss and make_train_op inside the loop, which creates multiple loss and optimization ops. Do this instead:
model = Model()
optimizer = tf.train.AdamOptimizer()
init = tf.global_variables_initializer()
global_step = tf.train.get_or_create_global_step()

with tf.variable_scope('input', reuse=True):
    training_inp = InputPipe()
    validate_inp = InputPipe(is_train=False)
    training_prediction, true_train_y = model(training_inp)

loss = get_loss(training_prediction, true_train_y)
train_op = make_train_op(optimizer, loss)

total_instances = data_size * n_repeats
steps_per_epoch = data_size // batch_size if data_size % batch_size == 0 else data_size // batch_size + 1

with tf.Session() as sess:
    sess.run(init)
    training_inp.init_sess(sess)
    list_grads = []
    for epoch in range(n_repeats):
        tqr = range(steps_per_epoch)
        for _ in tqr:
            sess.run(train_op)
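One more thing worth checking, given the FailedPreconditionError about beta2_power in the update: in TF 1.x, tf.global_variables_initializer() only covers variables that already exist at the moment it is built, and AdamOptimizer creates its slot variables (beta1_power, beta2_power, and the per-weight m/v slots) only when minimize() is called. This is one likely cause of that error, though it is not confirmed in the thread. A minimal sketch of the ordering that would avoid it, assuming the rest of the code stays as in the answer above:

loss = get_loss(training_prediction, true_train_y)
train_op = make_train_op(optimizer, loss)
# Build the initializer only after minimize(), so Adam's slot variables
# (beta1_power, beta2_power, m, v) exist in the graph and get initialized.
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    training_inp.init_sess(sess)
    # ... training loop as in the answer ...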