Getting the cell, input gate, output gate and forget gate activation values of an LSTM network with Keras

Date: 2019-01-17 13:07:26

Tags: python tensorflow keras deep-learning keras-layer

I want to get the activation values of a trained LSTM network for a given input, in particular the values of the cell and of the input, output and forget gates. According to this Keras issue and this Stackoverflow question I can get some activation values with the following code:

(Basically I am trying to classify one-dimensional time series with one label per series, but that doesn't matter for this general question.)

import random
from pprint import pprint

import keras.backend as K
import numpy as np
from keras.layers import Dense
from keras.layers.recurrent import LSTM
from keras.models import Sequential
from keras.utils import to_categorical

def getOutputLayer(layerNumber, model, X):
    return K.function([model.layers[0].input],
                      [model.layers[layerNumber].output])([X])

model = Sequential()
model.add(LSTM(10, batch_input_shape=(1, 1, 1), stateful=True))
model.add(Dense(2, activation='softmax'))
model.compile(
    loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')

# generate some test data
for i in range(10):
    # generate a random timeseries of 10 numbers
    X = np.random.rand(10)
    X = X.reshape(10, 1, 1)

    # generate a random label (0 or 1) for the whole timeseries
    y = to_categorical([random.randint(0, 1)] * 10, num_classes=2)

    # train the lstm for this one timeseries
    model.fit(X, y, epochs=1, batch_size=1, verbose=0)
    model.reset_states()

# to keep the output simple, use only 5 steps for the input timeseries
X_test = np.random.rand(5)
X_test = X_test.reshape(5, 1, 1)

# get the activations for the output lstm layer
pprint(getOutputLayer(0, model, X_test))

This gives me the following activation values for the LSTM layer:

[array([[-0.04106992, -0.00327154, -0.01524276,  0.0055838 ,  0.00969929,
        -0.01438944,  0.00211149, -0.04286387, -0.01102304,  0.0113989 ],
       [-0.05771339, -0.00425535, -0.02032563,  0.00751972,  0.01377549,
        -0.02027745,  0.00268653, -0.06011265, -0.01602218,  0.01571197],
       [-0.03069103, -0.00267129, -0.01183739,  0.00434298,  0.00710012,
        -0.01082268,  0.00175544, -0.0318702 , -0.00820942,  0.00871707],
       [-0.02062054, -0.00209525, -0.00834482,  0.00310852,  0.0045242 ,
        -0.00741894,  0.00141046, -0.02104726, -0.0056723 ,  0.00611038],
       [-0.05246543, -0.0039417 , -0.01877101,  0.00691551,  0.01250046,
        -0.01839472,  0.00250443, -0.05472757, -0.01437504,  0.01434854]],
      dtype=float32)]

So for each input value I get 10 values, because I specified an LSTM with 10 units in the Keras model. But which one is the cell, which the input gate, which the output gate, and which the forget gate?

1 Answer:

Answer 0: (score: 0)

Well, those are the output values. To get and inspect the values of each gate, have a look at this issue.

I've pasted the essential part here:

# `cos`, `expected_output`, `batch_size` and `epochs` are defined
# earlier in the original snippet
for i in range(epochs):
    print('Epoch', i, '/', epochs)
    model.fit(cos,
              expected_output,
              batch_size=batch_size,
              verbose=1,
              epochs=1,  # `nb_epoch` is the deprecated Keras 1 spelling
              shuffle=False)

    for layer in model.layers:
        if 'LSTM' in str(layer):
            # states[0] is the hidden state h, states[1] is the cell state c
            print('states[0] = {}'.format(K.get_value(layer.states[0])))
            print('states[1] = {}'.format(K.get_value(layer.states[1])))

            print('Input')
            print('b_i = {}'.format(K.get_value(layer.b_i)))
            print('W_i = {}'.format(K.get_value(layer.W_i)))
            print('U_i = {}'.format(K.get_value(layer.U_i)))

            print('Forget')
            print('b_f = {}'.format(K.get_value(layer.b_f)))
            print('W_f = {}'.format(K.get_value(layer.W_f)))
            print('U_f = {}'.format(K.get_value(layer.U_f)))

            print('Cell')
            print('b_c = {}'.format(K.get_value(layer.b_c)))
            print('W_c = {}'.format(K.get_value(layer.W_c)))
            print('U_c = {}'.format(K.get_value(layer.U_c)))

            print('Output')
            print('b_o = {}'.format(K.get_value(layer.b_o)))
            print('W_o = {}'.format(K.get_value(layer.W_o)))
            print('U_o = {}'.format(K.get_value(layer.U_o)))

    # output for the first element of the batch after this fit();
    # `get_LSTM_output` is a K.function defined earlier in the original snippet
    first_batch_element = np.expand_dims(cos[0], axis=1)  # (1, 1) to (1, 1, 1)
    print('output = {}'.format(get_LSTM_output([first_batch_element])[0].flatten()))

    model.reset_states()

print('Predicting')
predicted_output = model.predict(cos, batch_size=batch_size)

print('Plotting Results')
plt.subplot(2, 1, 1)
plt.plot(expected_output)
plt.title('Expected')
plt.subplot(2, 1, 2)
plt.plot(predicted_output)
plt.title('Predicted')
plt.show()
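
Note that the snippet above uses the Keras 1.x API, where each gate still has its own `b_i`/`W_i`/`U_i` attribute; in Keras 2 those attributes are gone and the four gates are concatenated, in the order input, forget, cell, output, into a single `kernel`, `recurrent_kernel` and `bias`. As a minimal sketch (not part of the original answer), you can inspect the gate activations for the model from the question by pulling the trained weights out with `get_weights()` and replaying the LSTM recurrence in NumPy, assuming the layer's default activations (`tanh` and `hard_sigmoid` in this Keras version) and reusing `model` and `X_test` from the question:

import numpy as np

def hard_sigmoid(x):
    # default recurrent_activation of keras.layers.LSTM in Keras 2;
    # swap in a plain sigmoid if the layer was built with 'sigmoid'
    return np.clip(0.2 * x + 0.5, 0.0, 1.0)

# Keras 2 stores the trained LSTM weights as three concatenated arrays
kernel, recurrent_kernel, bias = model.layers[0].get_weights()

# columns are ordered: input gate, forget gate, cell candidate, output gate
W_i, W_f, W_c, W_o = np.split(kernel, 4, axis=1)
U_i, U_f, U_c, U_o = np.split(recurrent_kernel, 4, axis=1)
b_i, b_f, b_c, b_o = np.split(bias, 4)

# replay the recurrence step by step and record every gate
units = kernel.shape[1] // 4  # 10 for the model above
h = np.zeros((1, units))      # hidden state
c = np.zeros((1, units))      # cell state
for t in range(X_test.shape[0]):
    x = X_test[t]                                 # shape (1, 1): one timestep
    i = hard_sigmoid(x @ W_i + h @ U_i + b_i)     # input gate
    f = hard_sigmoid(x @ W_f + h @ U_f + b_f)     # forget gate
    c_hat = np.tanh(x @ W_c + h @ U_c + b_c)      # candidate cell state
    c = f * c + i * c_hat                         # new cell state
    o = hard_sigmoid(x @ W_o + h @ U_o + b_o)     # output gate
    h = o * np.tanh(c)                            # hidden state = layer output
    print('step', t, 'input gate:', i, 'forget gate:', f,
          'output gate:', o, 'cell state:', c)

The `h` computed in the last step is exactly the 10-value vector per input that the question prints, which is why the layer output alone cannot be split into gates: the gate activations only exist as intermediate values inside the recurrence.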