Building a language model with a recurrent neural network

Time: 2016-08-27 22:42:09

Tags: python-2.7 recurrent-neural-network

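I get this error when I run my code:

Exception: Input arrays should have the same number of samples as target arrays. Found 12196 input samples and 1 target samples.

Here is the model I am training.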

import sys
import time

import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense
from keras.utils import np_utils
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM
from keras.regularizers import l2
from keras.layers.wrappers import TimeDistributed

n_in = train_x.shape[1]  # sequence length; same array that is passed to model.fit below
n_hidden = 100
n_out = word_vecs.shape[0]
number_of_epochs = 10
batch_size = 35

model = Sequential()

model.add(Embedding(output_dim=word_vecs.shape[1],
                    input_dim=word_vecs.shape[0],
                    input_length=n_in,
                    weights=[word_vecs],
                    mask_zero=True))

model.add(LSTM(n_hidden, W_regularizer=l2(0.0001), U_regularizer=l2(0.0001), return_sequences=True))

model.add(TimeDistributed(Dense(n_out, activation='softmax', W_regularizer=l2(0.0001))))


model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

I also one-hot encoded my training targets.

Here is the code:

new_instance = []

for instance in train_y :
    new_vector = np.zeros(shape=(instance.shape[0],  word_vecs.shape[0]))

    print(instance.shape[0],  word_vecs.shape[0])

    new_vector[np.arange(new_vector.shape[0]), instance ] =1

new_instance.append(new_vector)
new_instance = np.array(new_instance)

Here is the output of my one-hot encoding:
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)

[[[ 1.  0.  0. ...,  0.  0.  0.]
  [ 1.  0.  0. ...,  0.  0.  0.]
  [ 1.  0.  0. ...,  0.  0.  0.]
  ..., 
  [ 0.  0.  0. ...,  0.  0.  0.]
  [ 0.  0.  0. ...,  0.  0.  0.]
  [ 0.  0.  1. ...,  0.  0.  0.]]]

Finally, the training loop:

for epoch in range(number_of_epochs):    
        start_time = time.time()

        #Train for 1 epoch
        model.fit(train_x, new_instance, nb_epoch=1,  batch_size=batch_size, verbose=False, shuffle=True)   

        print("%.2f sec for training" % (time.time() - start_time))
        sys.stdout.flush()

I am new to this, so please bear with me. Thanks.

1 Answer:

Answer 0 (score: 0)

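After a while, I found that the problem was wrong indentation in the one-hot encoding code. The call to new_instance.append(new_vector) sat outside the for loop, so only the vector for the last instance was appended. The target array therefore held 1 sample, while the input held 12196, which is exactly what the exception reports. I also reduced the size of the dataset so that it would run faster.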
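The effect of the indentation can be reproduced on a small scale. The sketch below is only an illustration: toy_train_y, vocab_size, and the helper one_hot_sequences are hypothetical names that do not appear in the original code, and the append_inside_loop flag exists only to switch between the two indentation variants.

```python
import numpy as np

# Hypothetical toy data: 3 target sequences of 4 word indices, vocabulary of 5 words.
toy_train_y = np.array([[0, 1, 2, 3],
                        [4, 0, 1, 2],
                        [3, 4, 0, 1]])
vocab_size = 5


def one_hot_sequences(sequences, vocab_size, append_inside_loop):
    """One-hot encode each sequence; the flag mimics the two indentation variants."""
    encoded = []
    for instance in sequences:
        new_vector = np.zeros((instance.shape[0], vocab_size))
        new_vector[np.arange(instance.shape[0]), instance] = 1
        if append_inside_loop:
            # Corrected version: every sequence is appended.
            encoded.append(new_vector)
    if not append_inside_loop:
        # Mis-indented version: append runs once, after the loop has finished,
        # so only the last sequence is kept.
        encoded.append(new_vector)
    return np.array(encoded)


buggy = one_hot_sequences(toy_train_y, vocab_size, append_inside_loop=False)
fixed = one_hot_sequences(toy_train_y, vocab_size, append_inside_loop=True)

print(buggy.shape)  # (1, 4, 5): one target sample
print(fixed.shape)  # (3, 4, 5): one target sample per input sequence
```

With the append outside the loop, the first dimension of the target array is 1 regardless of how many sequences there are, which matches the "1 target samples" in the exception.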

Here is the corrected code:

new_instance = []

for instance in train_y :
    new_vector = np.zeros(shape=(instance.shape[0],  word_vecs.shape[0]))

    print(instance.shape[0],  word_vecs.shape[0])

    new_vector[np.arange(new_vector.shape[0]), instance ] =1

    new_instance.append(new_vector)
new_instance = np.array(new_instance)
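As an alternative to the explicit loop, the same encoding can be written without any indentation-sensitive append. This is a minimal sketch, not the approach used above: it assumes all target sequences have the same length and are stored in one 2-D integer array, and the helper names one_hot_targets and check_same_num_samples are hypothetical. The check simply compares the first dimensions before training, which is the same condition the exception message describes.

```python
import numpy as np


def one_hot_targets(target_indices, vocab_size, dtype=np.float32):
    """Map an integer array of shape (samples, timesteps) to a one-hot array
    of shape (samples, timesteps, vocab_size) by indexing an identity matrix."""
    target_indices = np.asarray(target_indices)
    return np.eye(vocab_size, dtype=dtype)[target_indices]


def check_same_num_samples(inputs, targets):
    """Raise early if inputs and targets disagree on the number of samples."""
    if inputs.shape[0] != targets.shape[0]:
        raise ValueError(
            "Found %d input samples and %d target samples."
            % (inputs.shape[0], targets.shape[0])
        )


# Hypothetical toy data: 2 sequences of 3 word indices, vocabulary of 5 words.
toy_x = np.array([[1, 2, 3], [4, 0, 0]])
toy_y = np.array([[2, 3, 4], [0, 0, 0]])

targets = one_hot_targets(toy_y, vocab_size=5)
check_same_num_samples(toy_x, targets)
print(targets.shape)  # (2, 3, 5)
```

Note that a dense one-hot array grows quickly. If all 12196 sequences have length 260, as the printed shapes suggest, a 12196 x 260 x 4075 float32 array would need roughly 52 GB, so working with a smaller subset, as done here, may be necessary.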