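The code for a single LSTM unit is shown below (copied from Caffe's src/layers). My question is which top output gets connected to the next layer (usually an embedding or SoftMax layer).
Is it the C top or the H top?
In the code, C and H are defined as follows:

    Dtype* C = top[0]->mutable_cpu_data();
    Dtype* H = top[1]->mutable_cpu_data();

LSTM unit code: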

    template <typename Dtype>
    void LSTMUnitLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
        const vector<Blob<Dtype>*>& top) {
      const int num = bottom[0]->shape(1);
      // X packs the four gate pre-activations per sample: [i | f | o | g].
      const int x_dim = hidden_dim_ * 4;
      const Dtype* C_prev = bottom[0]->cpu_data();
      const Dtype* X = bottom[1]->cpu_data();
      const Dtype* cont = bottom[2]->cpu_data();
      Dtype* C = top[0]->mutable_cpu_data();  // new cell state
      Dtype* H = top[1]->mutable_cpu_data();  // new hidden state
      for (int n = 0; n < num; ++n) {
        for (int d = 0; d < hidden_dim_; ++d) {
          const Dtype i = sigmoid(X[d]);
          // cont == 0 zeroes the forget gate, so c_prev is dropped.
          const Dtype f = (*cont == 0) ? 0 :
              (*cont * sigmoid(X[1 * hidden_dim_ + d]));
          const Dtype o = sigmoid(X[2 * hidden_dim_ + d]);
          const Dtype g = tanh(X[3 * hidden_dim_ + d]);
          const Dtype c_prev = C_prev[d];
          const Dtype c = f * c_prev + i * g;
          C[d] = c;
          const Dtype tanh_c = tanh(c);
          H[d] = o * tanh_c;
        }
        // Advance all pointers to the next sample.
        C_prev += hidden_dim_;
        X += x_dim;
        C += hidden_dim_;
        H += hidden_dim_;
        ++cont;
      }
    }

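To make it easier to see what each of the two tops holds, here is a minimal standalone sketch of the same per-sample computation outside of Caffe. The names `LstmStepResult` and `lstm_unit_step` are hypothetical and made up for illustration (they are not Caffe APIs), and the sketch assumes the same `[i | f | o | g]` gate layout as the code above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper types for illustration only; not part of Caffe.
struct LstmStepResult {
  std::vector<float> c;  // new cell state (what the layer writes to top[0])
  std::vector<float> h;  // new hidden state (what the layer writes to top[1])
};

inline float sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

// One LSTM unit step for a single sample.
// x holds 4 * hidden_dim gate pre-activations laid out as [i | f | o | g];
// cont is the sequence-continuation indicator (0 at the start of a sequence).
LstmStepResult lstm_unit_step(const std::vector<float>& c_prev,
                              const std::vector<float>& x,
                              float cont) {
  const std::size_t hidden_dim = c_prev.size();
  assert(x.size() == 4 * hidden_dim);
  LstmStepResult out{std::vector<float>(hidden_dim),
                     std::vector<float>(hidden_dim)};
  for (std::size_t d = 0; d < hidden_dim; ++d) {
    const float i = sigmoid(x[d]);
    // cont == 0 zeroes the forget gate, discarding the previous cell state.
    const float f = (cont == 0.0f) ? 0.0f
                                   : cont * sigmoid(x[hidden_dim + d]);
    const float o = sigmoid(x[2 * hidden_dim + d]);
    const float g = std::tanh(x[3 * hidden_dim + d]);
    out.c[d] = f * c_prev[d] + i * g;
    out.h[d] = o * std::tanh(out.c[d]);
  }
  return out;
}
```

Calling `lstm_unit_step(c_prev, x, cont)` returns both vectors side by side: `c` is the value carried to the next time step as `C_prev`, and `h` is the gated output `o * tanh(c)`.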
Question:
So, if we define a layer like the one below, which output does the top (named "lstm1") refer to: C or H?

    layer {
      name: "lstm1"
      type: "LSTM"
      bottom: "fc6-reshape"
      bottom: "reshape-cm"
      top: "lstm1"
      recurrent_param {
        num_output: 8
        weight_filler {
          type: "uniform"
          min: -0.01
          max: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }