Understanding a Keras LSTM regression problem: predictions off by orders of magnitude

Asked: 2019-01-10 18:44:25

Tags: python keras neural-network regression lstm

I have been experimenting with Keras, but I am stuck on a seemingly simple regression task: I want to use an LSTM to predict the values of a noisy signal. Here is my code. I start by creating a noisy signal built from sine waves, sampled at 2048 Hz:

import numpy as np

frequency = 2048     # sampling rate in Hz
time_interval = 10   # signal duration in seconds
x_mock = np.arange(0, time_interval, 1/frequency)
rand_noise = np.random.randn(len(x_mock))  # white Gaussian noise
sine_wit = np.sin(10*x_mock)               # witness sine wave
target_mock = np.sin(x_mock) + sine_wit*rand_noise

Then I define an LSTM network with 3 LSTM layers followed by 4 fully connected layers:

import keras
from keras.models import Sequential
from keras.layers import LSTM, Dense, LeakyReLU

def keras_LSTM(fs, n_witness, lr=0.000039):
    nRec1 = 64  # originally 32
    nRec2 = 16
    nRec3 = 8
    nFC1 = 128
    nFC2 = 16
    nFC3 = 8
    nFC4 = 1
    model_LSTM = Sequential()
    # stateful layers require a fixed batch_input_shape; here batch size is 1.
    # All other LSTM arguments are left at their Keras defaults
    # (tanh activation, hard_sigmoid recurrent activation, glorot_uniform /
    # orthogonal initializers, no dropout, no regularizers).
    model_LSTM.add(LSTM(nRec1, return_sequences=True, stateful=True,
                        batch_input_shape=(1, fs, n_witness)))
    model_LSTM.add(LSTM(nRec2, return_sequences=True, stateful=True))
    model_LSTM.add(LSTM(nRec3, return_sequences=True, stateful=True))
    model_LSTM.add(Dense(nFC1, kernel_initializer='glorot_normal'))
    model_LSTM.add(LeakyReLU(alpha=0.01))
    model_LSTM.add(Dense(nFC2, kernel_initializer='glorot_normal'))
    model_LSTM.add(LeakyReLU(alpha=0.01))
    model_LSTM.add(Dense(nFC3, kernel_initializer='glorot_normal'))
    model_LSTM.add(LeakyReLU(alpha=0.01))
    model_LSTM.add(Dense(nFC4, kernel_initializer='glorot_normal'))
    model_LSTM.add(LeakyReLU(alpha=0.01))

    model_LSTM.compile(optimizer=keras.optimizers.Adam(lr=lr, beta_1=0.9,
                           beta_2=0.999, epsilon=1e-8, decay=0.05,
                           amsgrad=False),
                       loss='mean_squared_error')
    return model_LSTM
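As a sanity check on the model size (my addition, not part of the original question), the recurrent layers' parameter counts can be computed by hand with the standard Keras LSTM formula, 4 * units * (units + input_dim + 1) — four gates, each with an input kernel, a recurrent kernel, and a bias:

```python
def lstm_params(units, input_dim):
    # gates: input, forget, cell, output -> factor 4;
    # each gate has a kernel (input_dim weights), a recurrent
    # kernel (units weights) and one bias term
    return 4 * units * (units + input_dim + 1)

n_witness = 2  # two input features: sine_wit and rand_noise
print(lstm_params(64, n_witness))  # first LSTM layer: 17152
print(lstm_params(16, 64))         # second LSTM layer: 5184
print(lstm_params(8, 16))          # third LSTM layer: 800
```

These numbers should match the per-layer counts reported by `model.summary()`.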

Note: I use the Adam optimizer and leaky ReLUs in the FC layers, and I want sequences to be returned and state to be carried over at every successive time step. I chose batch_input_shape=(1,fs,n_witness) so that I pass 2048 consecutive samples to the network at a time (basically, as I understand it, the model trains on 1 second of data before the gradients are computed and the weights updated). Then I normalize the input data and reshape it into 3-d input and target arrays. At each time sample, the input features are the normalized values of sine_wit and rand_noise; from these I want to predict target_mock at the same time step:

# min-max normalize the two input features
rand_noise = (rand_noise-min(rand_noise))/(max(rand_noise)-min(rand_noise))
sine_wit = (sine_wit-min(sine_wit))/(max(sine_wit)-min(sine_wit))
X_mock_mat = np.vstack((sine_wit, rand_noise)).T

# reshape to (n_batches, samples_per_batch, n_features)
X_mock = X_mock_mat.reshape(-1, frequency, 2)
Y_mock = target_mock.reshape(-1, frequency, 1)

# note: chained [:] indexing as in X_mock[:][:][:] is a no-op on an
# ndarray, so the arrays can be used directly
X_train = X_mock
Y_train = Y_mock
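A minimal Keras-free sketch (my addition, with random stand-in arrays) of the reshaping above, to confirm the 3-d shapes a stateful LSTM with batch_input_shape=(1, fs, n_witness) expects:

```python
import numpy as np

frequency = 2048
time_interval = 10
n_samples = frequency * time_interval  # 20480 samples in total

# stand-ins for the normalized features and the target
sine_wit = np.random.rand(n_samples)
rand_noise = np.random.rand(n_samples)
target_mock = np.random.rand(n_samples)

# stack features column-wise, then split into one-second chunks
X_mock = np.vstack((sine_wit, rand_noise)).T.reshape(-1, frequency, 2)
Y_mock = target_mock.reshape(-1, frequency, 1)

print(X_mock.shape)  # (10, 2048, 2): 10 one-second batches, 2 features
print(Y_mock.shape)  # (10, 2048, 1): matching per-sample targets
```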

Finally, I train the model and predict on the exact same target data:

model_mock_history = model_mock_LSTM.fit(X_train, Y_train, epochs=30, batch_size=1, verbose=2, shuffle=False) 
trainPredict = model_mock_LSTM.predict(X_train, batch_size=1)
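To quantify how far off the prediction scale is, a small helper like this can be used (my addition; `scale_ratio` is a hypothetical name, and the commented usage line assumes the `trainPredict` and `Y_train` arrays above):

```python
import numpy as np

def scale_ratio(pred, target):
    """Ratio of RMS amplitudes: ~1 means matching scale;
    values orders of magnitude from 1 indicate a scale problem."""
    rms = lambda a: np.sqrt(np.mean(np.asarray(a, dtype=float) ** 2))
    return rms(pred) / rms(target)

# usage with the arrays above:
# print(scale_ratio(trainPredict, Y_train))
```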

However, the results are very strange in both the time domain and the frequency domain, see the picture:

[Image: Prediction in time and frequency domain]

What am I doing wrong? I have the feeling that I am fundamentally misunderstanding something.

0 Answers:

There are no answers yet.