I don't know why my word2vec model isn't training

Time: 2019-08-13 09:23:25

Tags: python keras word2vec

I am trying to build a skip-gram word2vec model with Keras. I don't think there is a logical error... but the loss does not decrease, as the log below shows.

=====================================================================
58693/58693 [==============================] - 342s 6ms/step - loss: 0.6937

Epoch 2/100

58693/58693 [==============================] - 337s 6ms/step - loss: 0.6935

Epoch 3/100

58693/58693 [==============================] - 336s 6ms/step - loss: 0.6934

=====================================================================
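For reference, 0.693 is almost exactly -ln(0.5), i.e. the binary cross-entropy of a model that always predicts 0.5, so the network seems to be stuck at chance level. A quick check of that number (my own sanity check, not part of the training code):

# Binary cross-entropy of a constant 0.5 prediction is -ln(0.5) ≈ 0.6931,
# which matches the stuck loss above.
import numpy as np
print(-np.log(0.5))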

My code is based on this page: https://adventuresinmachinelearning.com/word2vec-keras-tutorial/

That page provides helper functions for downloading the data, reading it, and collecting it into integer word IDs.
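The variables data, vocab_size, window_size and vector_dim used below come from those helpers. Roughly, a minimal sketch of that preprocessing looks like this (the corpus file name and the hyperparameter values here are placeholders, not my real settings):

# Minimal preprocessing sketch; "corpus.txt" and the numbers below are placeholders.
import collections

vocab_size = 10000
window_size = 3
vector_dim = 300

def build_dataset(words, vocab_size):
    # Keep the most frequent words; everything else maps to id 0 ("UNK").
    counts = [("UNK", -1)] + collections.Counter(words).most_common(vocab_size - 1)
    dictionary = {word: i for i, (word, _) in enumerate(counts)}
    return [dictionary.get(word, 0) for word in words], dictionary

words = open("corpus.txt").read().split()
data, dictionary = build_dataset(words, vocab_size)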

import numpy as np
from keras.preprocessing import sequence
from keras.preprocessing.sequence import skipgrams

# Build (target, context) pairs; label 1 marks a real pair, 0 a negative sample.
sampling_table = sequence.make_sampling_table(vocab_size)
# print(sampling_table)
couples, labels = skipgrams(data, vocab_size, window_size=window_size,
                            sampling_table=sampling_table)
word_target, word_context = zip(*couples)
word_target = np.array(word_target, dtype="int32")
word_context = np.array(word_context, dtype="int32")
labels = np.array(labels, dtype='int32')
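
As a quick sanity check of the generated pairs, a few raw (target, context, label) triples can be printed; label 1 should mark a real skip-gram pair and 0 a negative sample (debugging snippet only):

# Print a handful of generated training examples.
for (t, c), lab in list(zip(couples, labels))[:5]:
    print(t, c, lab)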



def trainingDataGen(batch_size):
    # Infinite generator: each step draws a random mini-batch of indices
    # and yields ([target_ids, context_ids], labels).
    while True:
        idx = np.random.randint(0, len(labels) - 1, batch_size)

        x_target = word_target[idx]
        x_context = word_context[idx]
        y_label = labels[idx]

        yield [x_target, x_context], y_label
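
To confirm the generator yields what fit_generator expects, one batch can be pulled by hand and its shapes inspected (debugging snippet only):

# Each batch should be two (batch_size,) input arrays plus a (batch_size,) label array.
(batch_target, batch_context), batch_labels = next(trainingDataGen(512))
print(batch_target.shape, batch_context.shape, batch_labels.shape)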


import tensorflow as tf
import keras
from keras.layers import Embedding

# One word ID per sample for the target word and for the context word.
input_target = keras.layers.Input(shape=(1,), name='targetWord')
input_context = keras.layers.Input(shape=(1,), name='contextWord')

# A single embedding layer shared by both inputs.
embedding = Embedding(vocab_size, vector_dim, input_length=1, trainable=True, name='embedding')
embedded_target = embedding(input_target)
embedded_context = embedding(input_context)

# Dot product of the two word vectors, squashed to a probability by a sigmoid.
dot_product = keras.layers.Dot(axes=-1)([embedded_target, embedded_context])
flattened = keras.layers.Flatten()(dot_product)
activated = keras.layers.Dense(1, activation=tf.nn.sigmoid)(flattened)

model = keras.Model([input_target, input_context], activated)
model.summary()
model.compile(optimizer='sgd', loss='binary_crossentropy')

steps_per_epoch = divmod(len(labels), 512)[0] + 1
print(steps_per_epoch)
model.fit_generator(trainingDataGen(512), steps_per_epoch=steps_per_epoch, epochs=100)
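
After (or during) training, the embedding weights can be pulled out to see whether related words end up close together; the linked tutorial does this with a similarity callback, and the following is a simplified post-hoc version of the same idea (the query word id 42 is an arbitrary example):

# Cosine similarity of one word vector against the whole vocabulary.
weights = model.get_layer('embedding').get_weights()[0]  # shape (vocab_size, vector_dim)
query_id = 42                                            # arbitrary example word id
query = weights[query_id]
sims = weights @ query / (np.linalg.norm(weights, axis=1) * np.linalg.norm(query) + 1e-8)
print(np.argsort(-sims)[:8])                             # ids of the most similar words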

0 Answers:

No answers yet.