Setting the initial state to the final state in a TensorFlow GRU, with different batch sizes for the training and dev datasets

Asked: 2018-07-12 17:32:41

Tags: tensorflow rnn

I am trying to feed the last state of a GRU back in as its initial state. My model is defined as follows:

def model(input_features, batch_size):

    with tf.variable_scope("GRU_Layer1"):
        cell1 = tf.nn.rnn_cell.GRUCell(gru1_cell_size)
        initial_state1 = tf.Variable(cell1.zero_state(batch_size, dtype=tf.float32), validate_shape=False, trainable=False, name="initial_state1")

        output1, new_state1 = tf.nn.dynamic_rnn(cell1, input_features, dtype=tf.float32, initial_state=initial_state1)
        with tf.control_dependencies([initial_state1.assign(new_state1)]):
            output1 = tf.identity(output1)

    with tf.variable_scope("GRU_Layer2"):
        cell2 = tf.nn.rnn_cell.GRUCell(gru2_cell_size)
        initial_state2 = tf.Variable(cell2.zero_state(batch_size, dtype=tf.float32), validate_shape=False, trainable=False, name="initial_state2")
        output2, new_state2 = tf.nn.dynamic_rnn(cell2, output1, dtype=tf.float32, initial_state=initial_state2)
        with tf.control_dependencies([initial_state2.assign(new_state2)]):
            output2 = tf.identity(output2)

    with tf.variable_scope("output2_reshaped"):
        # before, shape: (34, 1768, 32), after, shape: (34 * 1768, 32)
        output2 = tf.reshape(output2, shape=[-1, gru2_cell_size])

    with tf.variable_scope("output_layer"):
        # shape: (34 * 1768, 3)
        predictions = output_layer(output2, num_labels)
        predictions = tf.reshape(predictions, shape=[-1, max_seq_length, 3])
        return predictions
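The stateful idea behind the model above can be sketched framework-agnostically: carry each batch's final state over as the next batch's initial state, and fall back to zeros whenever the batch size changes (e.g. switching from training to dev). The helper name and the reset-to-zeros policy here are my assumptions for illustration, not something from the question:

```python
import numpy as np

def next_initial_state(prev_state, batch_size, cell_size):
    """Reuse the previous final state as the next initial state;
    reset to zeros when the batch size changes (e.g. train -> dev)."""
    if prev_state is None or prev_state.shape[0] != batch_size:
        return np.zeros((batch_size, cell_size), dtype=np.float32)
    return prev_state

# Training batches of 34, then a dev batch of 14 (cell size 64 as in the question).
state = None
state = next_initial_state(state, 34, 64)  # fresh zeros, shape (34, 64)
state = next_initial_state(state, 34, 64)  # reused,      shape (34, 64)
state = next_initial_state(state, 14, 64)  # reset,       shape (14, 64)
```

In graph-mode TensorFlow this per-batch reset is exactly what the `tf.Variable` approach in the question cannot express, because the variable keeps whatever concrete shape was last assigned to it.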

My problem: the batch sizes of the training and validation datasets differ, 34 and 14 respectively. As a result, I get the following error:

InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [14,200] vs. shape[1] = [34,64]
     [[Node: GRU_Layer1/rnn/while/gru_cell/concat = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](GRU_Layer1/rnn/while/TensorArrayReadV3, GRU_Layer1/rnn/while/Switch_3:1, GRU_Layer1/rnn/while/gru_cell/split/split_dim)]]
     [[Node: scores_arousal/Mean_6/_227 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_403_scores_arousal/Mean_6", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

where gru1_cell_size = 64 and gru2_cell_size = 32.
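The error itself is just a shape mismatch inside the GRU cell: at every time step the cell concatenates the current input slice with the state along the feature axis, so the batch dimensions of the two must agree. A minimal reproduction of the same failure in plain NumPy, with the shapes taken from the traceback:

```python
import numpy as np

x = np.zeros((14, 200), dtype=np.float32)  # dev-batch input step: shape[0] = [14, 200]
h = np.zeros((34, 64), dtype=np.float32)   # state variable still sized for training: shape[1] = [34, 64]

try:
    np.concatenate([x, h], axis=1)         # what the GRU cell does internally
except ValueError as err:
    print("concat failed:", err)
```

With matching batch dimensions, e.g. a (14, 200) input and a (14, 64) state, the same concatenation succeeds and yields a (14, 264) matrix.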

The batch_size that is passed in is produced by the following code:

def return_34():
    return 34
def return_14():
    return 14
with tf.name_scope("batch_size"):
    batch_size_ = tf.cond(phase_train, lambda: return_34(), lambda: return_14())
predictions = model(model_input, batch_size_)
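As an aside, a common alternative to switching on a train/dev flag like this is to derive the batch size from the input tensor itself at run time (in TF 1.x, `tf.shape(input_features)[0]`), so no hard-coded 34/14 branch is needed. The same idea in NumPy terms, with the (batch, max_seq_length, features) layout implied by the question (the feature size 200 is taken from the error message):

```python
import numpy as np

def batch_size_of(input_features):
    # Read the batch size off the data instead of a train/dev flag.
    return input_features.shape[0]

train_batch = np.zeros((34, 1768, 200), dtype=np.float32)
dev_batch = np.zeros((14, 1768, 200), dtype=np.float32)
print(batch_size_of(train_batch), batch_size_of(dev_batch))  # 34 14
```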

Since the batch size varies, I decided to use validate_shape=False when creating the initial_state of each GRU.

Any help would be much appreciated!

0 Answers:

There are no answers yet.