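I'm having trouble configuring the cell's initial_state so that I can use different batch sizes for training and prediction. Essentially, during training I feed in fixed-size mini-batches, while at prediction time I predict one input at a time and then feed it back into the model to get the next output.
However, I can't build a graph in which the first dimension of the cell's initial_state is configurable. Here is a simple model_fn that models character input: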
def model_fn(features, labels, mode, params):
    inputs = tf.one_hot(features, params["VOCAB_SIZE"], 1.0, 0.0)
    cell = tf.nn.rnn_cell.MultiRNNCell([
        tf.nn.rnn_cell.GRUCell(params["INTERNAL_SIZE"]) for _ in range(params["NUM_LAYERS"])
    ], state_is_tuple=False)
    pkeep = params["DROPOUT_PKEEP"] if mode == tf.estimator.ModeKeys.TRAIN else 1.0
    cell = tf.nn.rnn_cell.DropoutWrapper(cell, input_keep_prob=pkeep)
    initial_state = tf.get_variable(
        "initial_state",
        dtype=tf.float32,
        initializer=cell.zero_state(params["BATCH_SIZE"], dtype=tf.float32),
    )
    if mode == tf.estimator.ModeKeys.EVAL:
        initial_state = cell.zero_state(1, dtype=tf.float32)
    outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, initial_state=initial_state)
    if mode != tf.estimator.ModeKeys.EVAL:
        tf.assign(initial_state, final_state)
    logits = ...
    if mode == tf.estimator.ModeKeys.PREDICT:
        logits = tf.reshape(logits, [-1, 1, 98])
    else:
        logits = tf.reshape(logits, [-1, features.shape[1], 98])
    probabilities = tf.nn.softmax(logits)
    predictions = tf.argmax(probabilities, 2)
    if mode == tf.estimator.ModeKeys.PREDICT:
        predictions = { "predictions": predictions, "probabilities": probabilities }
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)
    loss = ...
    if mode == tf.estimator.ModeKeys.EVAL:
        accuracy = tf.metrics.accuracy(labels=labels, predictions=predictions)
        return tf.estimator.EstimatorSpec(mode, loss=loss, eval_metric_ops={
            "accuracy": accuracy,
        })
    optimizer = tf.train.AdamOptimizer(learning_rate=params["LEARNING_RATE"])
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)
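The problem is that I'm required to pass BATCH_SIZE in the params used to define initial_state, and during training that value is something like 200. At test time, with a batch of one, I get the error "[1, 384] dimensional tensor cannot be assigned to a [200, 384] dimensional variable". How can I make the dimensions of initial_state configurable depending on the mode?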
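
To make the mismatch concrete, here is a minimal sketch in plain Python (no TensorFlow) of the shapes involved. It relies on the fact that with state_is_tuple=False the per-layer states are concatenated along the last axis, so the state width is the sum of the layer sizes. The helper names state_shape and can_assign are hypothetical, and the values NUM_LAYERS=3 and INTERNAL_SIZE=128 are assumptions chosen only because they multiply to the 384 seen in the error message; the post does not state them.

```python
def state_shape(batch_size, num_layers, internal_size):
    """Shape of the concatenated multi-layer GRU state (state_is_tuple=False).

    Each GRU layer contributes `internal_size` values per example, and the
    layers are concatenated along the last axis.
    """
    return (batch_size, num_layers * internal_size)


def can_assign(variable_shape, value_shape):
    """Mimic the shape check of an assignment to a fixed-shape variable:
    the value must have exactly the variable's shape."""
    return tuple(variable_shape) == tuple(value_shape)


# Assumed hyperparameters (not given in the post): 3 layers x 128 units = 384.
NUM_LAYERS = 3
INTERNAL_SIZE = 128

# The variable is created once with the training batch size.
variable_shape = state_shape(200, NUM_LAYERS, INTERNAL_SIZE)

# The final state has the batch size of whatever input was actually fed.
train_final_state = state_shape(200, NUM_LAYERS, INTERNAL_SIZE)
predict_final_state = state_shape(1, NUM_LAYERS, INTERNAL_SIZE)

print(variable_shape)                                   # (200, 384)
print(can_assign(variable_shape, train_final_state))    # True
print(can_assign(variable_shape, predict_final_state))  # False
```

The last line corresponds to the reported error: the variable's first dimension was fixed at graph-construction time from params["BATCH_SIZE"], while the final state follows the batch size of the fed input.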