Question

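I want to build a word-embedding pretraining network that adds something on top of word2vec CBOW, so I am first trying to implement word2vec CBOW itself. Since I am very new to keras, I can't figure out how to implement CBOW in it.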

Initialization:

I have built the vocabulary and mapped the words to integers.
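The initialization step above can be sketched as follows. This is only an illustration under assumptions: the toy corpus, the whitespace tokenization, and the names `corpus`, `word_to_id`, and `encoded` are hypothetical and not from the original post.

```python
from collections import Counter

# Toy corpus (an assumption for illustration; any tokenized text works).
corpus = [
    "the quick brown fox jumps over the lazy dog".split(),
    "the dog barks at the quick fox".split(),
]

# Count word frequencies across all sentences.
counts = Counter(word for sentence in corpus for word in sentence)

# Assign integer ids by descending frequency, breaking ties alphabetically
# so that the mapping is deterministic.
vocab = sorted(counts, key=lambda w: (-counts[w], w))
word_to_id = {word: idx for idx, word in enumerate(vocab)}

# Replace every word with its integer id.
encoded = [[word_to_id[w] for w in sentence] for sentence in corpus]
vocab_size = len(word_to_id)
```

`vocab_size` is the value that would be passed as `input_dim` to an Embedding layer.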

Input to the network (not yet implemented):

A list of 2*k + 1 integers (the center word and the 2*k words in its context)
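One possible way to produce such inputs from an integer-encoded sentence is sketched below. The helper name `cbow_windows` is hypothetical, and the choice to skip positions that lack a full window (rather than pad them) is an assumption.

```python
import numpy as np

def cbow_windows(token_ids, k):
    """Return (centers, contexts) for every position that has a full window.

    centers has shape (n,), contexts has shape (n, 2*k).
    """
    token_ids = list(token_ids)
    centers, contexts = [], []
    for i in range(k, len(token_ids) - k):
        left = token_ids[i - k:i]           # k words before the center
        right = token_ids[i + 1:i + k + 1]  # k words after the center
        centers.append(token_ids[i])
        contexts.append(left + right)
    return np.array(centers), np.array(contexts).reshape(-1, 2 * k)

# A sentence of 8 word ids with k = 3 yields two full windows.
centers, contexts = cbow_windows([0, 1, 2, 3, 4, 5, 6, 7], k=3)
```

Keeping the center word and its context in separate arrays matches a model that takes them as separate inputs.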

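Network specification

A shared Embedding layer should take this list of integers and output the corresponding vectors. The 2*k context vectors are then averaged (I believe this can be done with add_node(layer, name, inputs=[2*k vectors], merge_mode='ave')).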
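To make the intended computation concrete, here is a NumPy sketch of what the embedding lookup followed by averaging produces. It is not keras code; the sizes and the random weight matrix are assumptions chosen for illustration.

```python
import numpy as np

vocab_size, dim, k = 10, 4, 3
rng = np.random.RandomState(0)

# One row per word id, playing the role of the Embedding layer's weights.
embedding = rng.normal(size=(vocab_size, dim))

context = np.array([[1, 2, 3, 5, 6, 7]])  # shape (batch, 2*k)
context_vectors = embedding[context]      # shape (batch, 2*k, dim)
cbow = context_vectors.mean(axis=1)       # shape (batch, dim)
```

The average is taken over axis 1, the axis that indexes the 2*k context positions, so each example ends up with a single `dim`-sized vector.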

If anyone could share a small code snippet, that would be very helpful.

P.S.: I have looked at word2veckeras, but I could not follow its code because it also uses gensim.

Update 1:

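I want to share the embedding layer across the network. The embedding layer should accept both the context words (2*k) and the current word. I can do this by passing all 2*k + 1 word indices as a single input and writing a custom lambda function that does what is needed. After that, however, I also want to add negative sampling, for which I will have to embed more words and take their dot products with the context vector. Can someone provide an example in which the Embedding layer is a shared node in a Graph() network?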
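The dot products mentioned above, together with the usual negative-sampling objective, can be sketched in NumPy as follows. This is an illustration under assumptions, not keras code: the function name `negative_sampling_loss` is hypothetical, and a single shared embedding matrix is used as the question requests, whereas the original word2vec keeps separate input and output matrices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def negative_sampling_loss(embedding, word, context, negatives):
    """word: (batch,), context: (batch, 2*k), negatives: (batch, neg)."""
    cbow = embedding[context].mean(axis=1)              # (batch, dim)
    # Dot product of the true center word with the averaged context.
    pos_score = np.sum(embedding[word] * cbow, axis=1)  # (batch,)
    # Dot product of every negative word with the averaged context.
    neg_score = np.einsum('bnd,bd->bn', embedding[negatives], cbow)  # (batch, neg)
    # Push the positive score up and the negative scores down.
    loss = -np.log(sigmoid(pos_score)) - np.log(sigmoid(-neg_score)).sum(axis=1)
    return pos_score, neg_score, loss

rng = np.random.RandomState(0)
embedding = rng.normal(scale=0.1, size=(10, 8))  # assumed toy sizes
word = np.array([4])
context = np.array([[1, 2, 3, 5, 6, 7]])
negatives = np.array([[0, 8, 9, 0, 8]])
pos_score, neg_score, loss = negative_sampling_loss(embedding, word, context, negatives)
```

All three lookups index the same `embedding` array, which is the NumPy counterpart of calling one shared Embedding layer on three different inputs.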

Answer 1

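Graph() has been deprecated in keras.

Arbitrary networks can be built with the keras functional API. The demo code below creates a word2vec CBOW model with negative sampling and tests it on random inputs: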

from keras import backend as K
import numpy as np
from keras.utils.np_utils import accuracy
from keras.models import Sequential, Model
from keras.layers import Input, Lambda, Dense, merge
from keras.layers.embeddings import Embedding

k = 3 # context window size (words on each side of the center word)
context_size = 2*k
neg = 5 # number of negative samples
# generate weight matrix for embeddings
embedding = []
for i in range(10):
    embedding.append(np.full(100, i))
embedding = np.array(embedding)
print(embedding)

# Creating CBOW model
word_index = Input(shape=(1,))
context = Input(shape=(context_size,))
negative_samples = Input(shape=(neg,))
shared_embedding_layer = Embedding(input_dim=10, output_dim=100, weights=[embedding])

word_embedding = shared_embedding_layer(word_index)
context_embeddings = shared_embedding_layer(context)
negative_words_embedding = shared_embedding_layer(negative_samples)
cbow = Lambda(lambda x: K.mean(x, axis=1), output_shape=(100,))(context_embeddings)

word_context_product = merge([word_embedding, cbow], mode='dot')
negative_context_product = merge([negative_words_embedding, cbow], mode='dot', concat_axis=-1)

model = Model(input=[word_index, context, negative_samples], output=[word_context_product, negative_context_product])

model.compile(optimizer='rmsprop', loss='mse', metrics=['accuracy'])

input_context = np.random.randint(10, size=(1, context_size))
input_word = np.random.randint(10, size=(1,))
input_negative = np.random.randint(10, size=(1, neg))

print("word, context, negative samples")
print(input_word.shape, input_word)
print(input_context.shape, input_context)
print(input_negative.shape, input_negative)

output_dot_product, output_negative_product = model.predict([input_word, input_context, input_negative])
print("word cbow dot product")
print(output_dot_product.shape, output_dot_product)
print("cbow negative dot product")
print(output_negative_product.shape, output_negative_product)
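Because row i of the demo weight matrix is filled with the value i, the outputs of the model can be checked by hand: every dot product should equal 100 * (word id) * (mean of the context ids). The NumPy sketch below reproduces that arithmetic. It is not part of the original answer, and the fixed ids are chosen only for illustration.

```python
import numpy as np

dim = 100
# Same weight matrix as in the demo: row i is filled with the value i.
embedding = np.array([np.full(dim, i, dtype=float) for i in range(10)])

word = np.array([2])
context = np.array([[1, 3, 5, 7, 9, 5]])  # mean of the ids is 5
negatives = np.array([[0, 4, 6, 8, 1]])

cbow = embedding[context].mean(axis=1)                 # every entry is 5.0
word_context = np.sum(embedding[word] * cbow, axis=1)  # 100 * 2 * 5
negative_context = np.einsum('bnd,bd->bn', embedding[negatives], cbow)
```

Here `word_context` is 1000 and `negative_context` is [0, 2000, 3000, 4000, 500], which is what the two model outputs should contain for the same ids.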

Hope it helps!

Update 1:

I have completed the code and uploaded it here

Answer 2

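You could try something like this. Here I initialize the embedding matrix with fixed values. For an input array of shape (1, 6) you get an output of shape (1, 100), where the 100 values are the average of the 6 input embeddings.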

import numpy as np
from keras import backend as K
from keras.models import Sequential
from keras.layers import Lambda
from keras.layers.embeddings import Embedding

model = Sequential()
k = 3 # context window size (words on each side of the center word)
context_size = 2*k
# generate weight matrix for embeddings: row i is filled with the value i
embedding = []
for i in range(10):
    embedding.append(np.full(100, i))
embedding = np.array(embedding)
print(embedding)

model.add(Embedding(input_dim=10, output_dim=100, input_length=context_size, weights=[embedding]))
# average the context embeddings over the context axis
model.add(Lambda(lambda x: K.mean(x, axis=1), output_shape=(100,)))

model.compile('rmsprop', 'mse')

input_array = np.random.randint(10, size=(1, context_size))
print(input_array.shape)

output_array = model.predict(input_array)
print(output_array.shape)
print(output_array[0])

How do I implement word2vec CBOW in keras with a shared Embedding layer and negative sampling?
