Predicting integer sequences with Keras

Time: 2018-07-25 16:48:42

Tags: python machine-learning keras

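I am trying to write an RNN model that predicts the next number in an integer sequence. The model's loss decreases with every epoch, but the predictions never become very accurate. I have tried many training set sizes and epoch counts, but my predictions are always off from the expected value by a few units. Could you give me some hints on what to improve, or tell me what I am doing wrong? Here is the code: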

from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM
from keras.callbacks import ModelCheckpoint
import numpy as np

training_length = 10000
rnn_size = 512
hm_epochs = 30

def generate_sequence(length=10):
    step = np.random.randint(0,50)
    # The random first element is immediately overridden, so every sequence starts at 0.
    first_element = np.random.randint(0,10)
    first_element = 0
    l_ist = [(first_element + (step*i)) for i in range(length)]
    return l_ist

training_set = []

for _ in range(training_length):
    training_set.append(generate_sequence(10))

feature_set = [i[:-1] for i in training_set]

label_set = [i[-1:] for i in training_set]

X = np.reshape(feature_set,(training_length, 9, 1))
y = np.array(label_set)


model = Sequential()
model.add(LSTM(rnn_size, input_shape = (X.shape[1], X.shape[2]), return_sequences = True))
model.add(Dropout(0.2))
model.add(LSTM(rnn_size))
model.add(Dropout(0.2))
model.add(Dense(y.shape[1], activation='linear'))
model.compile(loss='mse', optimizer='rmsprop', metrics=['accuracy'])

filepath="checkpoint_folder/weights-improvement.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]

model.fit(X,y,epochs=hm_epochs, callbacks=callbacks_list)

Results:

After 30 epochs (loss: 66.39):

1 Sequence: [0,20,40,60,80,100,120,140,160] Expected: [180] || Got: [181.86118]

2 Sequence: [0,11,22,33,44,55,66,77,88] Expected: [99] || Got: [102.17369]

3 Sequence: [0,47,94,141,188,235,282,329,376] Expected: [423] || Got: [419.1763]

4 Sequence: [0,47,94,141,188,235,282,329,376] Expected: [423] || Got: [419.1763]

5 Sequence: [0,4,8,12,16,20,24,28,32] Expected: [36] || Got: [37.506496]

6 Sequence: [0,48,96,144,192,240,288,336,384] Expected: [432] || Got: [425.0569]

7 Sequence: [0,28,56,84,112,140,168,196,224] Expected: [252] || Got: [253.60233]

8 Sequence: [0,18,36,54,72,90,108,126,144] Expected: [162] || Got: [163.538]

9 Sequence: [0,19,38,57,76,95,114,133,152] Expected: [171] || Got: [173.77933]

10 Sequence: [0,1,2,3,4,5,6,7,8] Expected: [9] || Got: [9.577981]

...

After 100 epochs (loss: 54.81):

1 Sequence: [0,20,40,60,80,100,120,140,160] Expected: [180] || Got: [181.03535]

2 Sequence: [0,11,22,33,44,55,66,77,88] Expected: [99] || Got: [99.15022]

3 Sequence: [0,47,94,141,188,235,282,329,376] Expected: [423] || Got: [423.7969]

4 Sequence: [0,47,94,141,188,235,282,329,376] Expected: [423] || Got: [423.7969]

5 Sequence: [0,4,8,12,16,20,24,28,32] Expected: [36] || Got: [34.47298]

6 Sequence: [0,48,96,144,192,240,288,336,384] Expected: [432] || Got: [432.73163]

7 Sequence: [0,28,56,84,112,140,168,196,224] Expected: [252] || Got: [251.55792]

8 Sequence: [0,18,36,54,72,90,108,126,144] Expected: [162] || Got: [164.81227]

9 Sequence: [0,19,38,57,76,95,114,133,152] Expected: [171] || Got: [172.6425]

10 Sequence: [0,1,2,3,4,5,6,7,8] Expected: [9] || Got: [8.837313]
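To put these numbers in perspective, the reported MSE losses and the ten listed predictions can be turned into average errors. The following is only a rough sketch over the values printed above; the variable and function names are made up for illustration.

```python
import math

# (expected, predicted) pairs copied from the two result listings above.
after_30 = [(180, 181.86118), (99, 102.17369), (423, 419.1763), (423, 419.1763),
            (36, 37.506496), (432, 425.0569), (252, 253.60233), (162, 163.538),
            (171, 173.77933), (9, 9.577981)]
after_100 = [(180, 181.03535), (99, 99.15022), (423, 423.7969), (423, 423.7969),
             (36, 34.47298), (432, 432.73163), (252, 251.55792), (162, 164.81227),
             (171, 172.6425), (9, 8.837313)]

def mean_abs_error(pairs):
    """Average absolute difference between expected and predicted values."""
    return sum(abs(expected - got) for expected, got in pairs) / len(pairs)

mae_30 = mean_abs_error(after_30)
mae_100 = mean_abs_error(after_100)

# The loss is a mean squared error, so its square root is a typical error size.
rmse_30 = math.sqrt(66.39)
rmse_100 = math.sqrt(54.81)

print(f"30 epochs:  MAE on listed samples {mae_30:.2f}, training RMSE {rmse_30:.2f}")
print(f"100 epochs: MAE on listed samples {mae_100:.2f}, training RMSE {rmse_100:.2f}")
```

On the ten listed samples the mean absolute error drops from roughly 2.8 to roughly 1.0, while the square root of the training loss stays near 8 and 7.4. One possible reason for that gap is that Keras computes the training loss with dropout still active, but that is a guess rather than something the listing proves.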

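2 Answers:

Answer 0 (score: 0)

Have you tried longer sequences? An LSTM is not needed here, because the dependencies are not very long. You could try another RNN variant.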
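For what it is worth, trying longer sequences mostly means parameterizing the data preparation from the question. Below is a minimal sketch, assuming the same arithmetic sequences starting at 0; `build_dataset` and its parameters are hypothetical names, not part of the original code.

```python
import numpy as np

def generate_sequence(length, rng):
    """Arithmetic sequence starting at 0 with a random step in [0, 50)."""
    step = int(rng.integers(0, 50))
    return [step * i for i in range(length)]

def build_dataset(n_samples, seq_length, seed=0):
    """Return X with shape (n_samples, seq_length - 1, 1) and y with shape (n_samples, 1)."""
    rng = np.random.default_rng(seed)
    data = np.array([generate_sequence(seq_length, rng) for _ in range(n_samples)])
    # All terms but the last are features; the last term is the label.
    X = data[:, :-1].reshape(n_samples, seq_length - 1, 1)
    y = data[:, -1:]
    return X, y

X, y = build_dataset(n_samples=1000, seq_length=20)
print(X.shape, y.shape)
```

The resulting `X` and `y` can be passed to `model.fit` as in the question, since the model reads its input shape from `X.shape`. For the other suggestion, Keras also provides `SimpleRNN` and `GRU` layers that could stand in for `LSTM`.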

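Answer 1 (score: 0)

Regarding your example, your input sequences are just (x, 2x, 3x) and so on. This is not a problem suited to a recurrent neural network. You want to learn a computation strategy, not a long-term dependency between features. RNNs are very powerful and can find very complex patterns, but they are not the right tool for this.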
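To illustrate what "a computation strategy" means here: the label is an exact linear function of the inputs, so even ordinary least squares recovers it. This is a sketch of that observation, not something the answer itself proposes; the data generation mirrors the question and the variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same kind of data as in the question: step * [0, 1, ..., 9] with a random step.
steps = rng.integers(0, 50, size=1000)
sequences = steps[:, None] * np.arange(10)[None, :]
X = sequences[:, :-1].astype(float)   # first nine terms
y = sequences[:, -1].astype(float)    # tenth term

# Fit a plain linear model y ~ X @ w (minimum-norm least-squares solution).
w, _, _, _ = np.linalg.lstsq(X, y, rcond=None)

test_sequence = np.array([0, 20, 40, 60, 80, 100, 120, 140, 160], dtype=float)
prediction = float(test_sequence @ w)
print(prediction)

# The underlying rule can also be written down directly:
# the next term is the last term plus the last difference.
closed_form = test_sequence[-1] + (test_sequence[-1] - test_sequence[-2])
print(closed_form)
```

Both values come out at 180, up to floating-point rounding for the fitted one. The RNN in the question has to approximate this simple rule through many nonlinear units on unscaled inputs, which may be why it stays a few units off.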

To solve this problem, you could look at evolutionary algorithms.
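The answer does not say which evolutionary algorithm it has in mind. As one possible reading, here is a minimal (1+1) evolution strategy that evolves two weights so that `a * last + b * second_last` predicts the next term. Everything in it (the two-weight model, the step-size factors, the function names) is an assumption made for illustration.

```python
import random

def fitness(weights, steps):
    """Mean squared error when predicting the 10th term from the 8th and 9th."""
    a, b = weights
    total = 0.0
    for step in steps:
        seq = [step * i for i in range(10)]
        prediction = a * seq[8] + b * seq[7]
        total += (prediction - seq[9]) ** 2
    return total / len(steps)

def evolve(generations=500, seed=0):
    """(1+1) evolution strategy: mutate one parent, keep the child if it is better."""
    rng = random.Random(seed)
    steps = range(50)            # same step range as in the question
    parent = [0.0, 0.0]
    parent_err = fitness(parent, steps)
    sigma = 1.0                  # mutation strength, adapted on success or failure
    for _ in range(generations):
        child = [w + rng.gauss(0.0, sigma) for w in parent]
        child_err = fitness(child, steps)
        if child_err < parent_err:
            parent, parent_err = child, child_err
            sigma *= 1.5         # successful mutation: search more boldly
        else:
            sigma *= 0.9         # failed mutation: search more carefully
    return parent

a, b = evolve()
seq = [0, 20, 40, 60, 80, 100, 120, 140, 160]
prediction = a * seq[-1] + b * seq[-2]
print(a, b, prediction)
```

Because every sequence starts at 0, many weight pairs fit the data equally well, so the evolved `a` and `b` need not equal 2 and -1 even when the prediction is essentially exact.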