I'm an ML beginner trying to write an LSTM model that processes a batch of sequences and detects the following simple pattern: if the sequence starts with an odd number, the target is 0, otherwise 1:
Data:
[[[ 1 2 3]
[ 2 3 4]
[ 3 4 5]
[ 4 5 6]
[ 5 6 7]] #starts with 1 -> 0
[[ 6 7 8]
[ 7 8 9]
[ 8 9 10]
[ 9 10 11]
[10 11 12]] #starts with 6 -> 1
[[11 12 13]
[12 13 14]
[13 14 15]
[14 15 16]
[15 16 17]]] #starts with 11 -> 0
Target:
[0 1 0]
Code:
import numpy as np
import pandas as pd
from keras import callbacks
from keras import optimizers
from keras.layers import LSTM, Dense, Flatten, Dropout
from keras.layers.advanced_activations import LeakyReLU
from keras.models import Sequential
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.utils import shuffle


def demo():
    scaler = StandardScaler()
    # 1000 rows; every block of 5 consecutive rows shares the same target
    dummy_data = pd.DataFrame(data=[[x, x + 1, x + 2, int((x - 1) / 5 % 2)] for x in range(1, 1001)],
                              columns=['a', 'b', 'c', 'target'])
    dummy_data[['a', 'b', 'c']] = scaler.fit_transform(dummy_data[['a', 'b', 'c']])
    data = dummy_data.loc[:, dummy_data.columns != 'target']
    target = dummy_data['target']
    # split the 1000 rows into 200 sequences of 5 timesteps x 3 features
    data = np.array(np.split(data.values, 200))
    target = np.array(np.split(target.values, 200))
    data, target = shuffle(data, target)
    # keep one label per sequence (all timesteps in a sequence share it)
    target = np.array(list(map(lambda x: x[0], target)))
    print(data[:3, :], target[:3])
    x_train, x_test, y_train, y_test = train_test_split(data, target, test_size=0.25, random_state=4)
    opt = optimizers.Adam(lr=0.0005, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0001)

    # build the model
    model = Sequential()
    num_features = data.shape[2]
    num_samples = data.shape[1]
    first_lstm = LSTM(32, batch_input_shape=(None, num_samples, num_features),
                      return_sequences=True, activation='tanh')
    model.add(first_lstm)
    model.add(LeakyReLU())
    model.add(Dropout(0.2))
    model.add(LSTM(16, return_sequences=True, activation='tanh'))
    model.add(Dropout(0.2))
    model.add(LeakyReLU())
    model.add(LSTM(8, return_sequences=True, activation='tanh'))
    model.add(LeakyReLU())
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    model.summary()

    tb = callbacks.TensorBoard(log_dir='./logs/', histogram_freq=10, batch_size=128,
                               write_graph=True, write_grads=True, write_images=False,
                               embeddings_freq=0, embeddings_layer_names=None,
                               embeddings_metadata=None)
    loss_checkpoint = callbacks.ModelCheckpoint('./best_loss.hdf5', monitor='val_loss', verbose=1,
                                                save_best_only=True, mode='min')
    model.fit(x_train, y_train, batch_size=128, epochs=5000, validation_data=(x_test, y_test),
              callbacks=[tb, loss_checkpoint])


demo()
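As a quick sanity check (my own numpy-only sketch, not part of the original post), the reshaping and labelling logic from demo() can be reproduced without Keras or scaling to confirm the shapes and targets:

```python
import numpy as np

# Rebuild the dummy data exactly as in demo(), minus scaling and Keras
rows = np.array([[x, x + 1, x + 2] for x in range(1, 1001)])
labels = np.array([int((x - 1) / 5 % 2) for x in range(1, 1001)])

# 1000 rows of 3 features -> 200 sequences of 5 timesteps each
data = rows.reshape(200, 5, 3)
target = labels.reshape(200, 5)[:, 0]  # one label per sequence

print(data.shape, target.shape)  # (200, 5, 3) (200,)
print(target[:4])                # sequences start at 1, 6, 11, 16 -> [0 1 0 1]
```

This confirms the input is shaped (samples, timesteps, features) as the LSTM's batch_input_shape expects.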
I expected the network to learn this simple pattern, but it fails; see the loss below:
What can be improved to make the network perform better?
Answer 0 (score: 0)
Based on your comment, I'd suggest changing the dataset. Try something like:
Data:
[
[1, 3, 5],
[2, 4, 6],
[3, 5, 7]
]
Target: [1, 0, 1]
You should try a dataset whose pattern is visible throughout the sequence. In theory, an LSTM should perform better on such samples.
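A minimal numpy sketch of that suggestion (my own construction; the variable names and the step size of 2 are assumptions, not from the answer): each sequence steps by 2, so every timestep carries the parity of the start value, and the label is simply the start's parity.

```python
import numpy as np

# Each sequence steps by 2, so every element shares the start's parity.
# Target: 1 if the sequence is odd, 0 if even (matching the answer's example).
starts = np.arange(1, 1001)
data = np.stack([starts, starts + 2, starts + 4], axis=1)  # shape (1000, 3)
target = starts % 2                                        # [1, 0, 1, ...]

# Reshape to (samples, timesteps, features) for an LSTM: 3 timesteps, 1 feature
data = data.reshape(-1, 3, 1).astype('float32')
print(data[:3, :, 0])  # [[1. 3. 5.] [2. 4. 6.] [3. 5. 7.]]
print(target[:3])      # [1 0 1]
```

The first three sequences match the answer's example exactly, and this array can be fed straight into a model with input_shape=(3, 1).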