I want to do classification, and my input is a 4-dimensional tensor. Instead of feeding word vectors into the LSTM cell for classification, I am trying to feed in whole sentences, so the model can learn the relationships between sentences. Example input:
input = ['this is first sentence',
'this is second sentence']
As is well known, text is usually converted into word vectors. An example of word vectors is below (assuming an embedding size of 2):
word_embedding = [
[ [0.21, 0.43], [0.55, 0.87], [0.73, 0.51], [0.64, 0.98] ],
[ [0.21, 0.43], [0.55, 0.87], [0.12, 0.29], [0.64, 0.98] ]
]
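For concreteness, the lookup from token ids to word vectors can be sketched in NumPy. The vocabulary and embedding table here are made-up illustrations chosen so the result reproduces the example above; they are not part of the original question:

```python
import numpy as np

# Hypothetical vocabulary and a tiny embedding table (embedding_size = 2).
vocab = {'this': 0, 'is': 1, 'first': 2, 'second': 3, 'sentence': 4}
embedding_matrix = np.array([
    [0.21, 0.43],  # this
    [0.55, 0.87],  # is
    [0.73, 0.51],  # first
    [0.12, 0.29],  # second
    [0.64, 0.98],  # sentence
])

sentences = ['this is first sentence', 'this is second sentence']
ids = np.array([[vocab[w] for w in s.split()] for s in sentences])

# Fancy indexing turns the 2-D id array into a 3-D tensor:
# (num_sentences, words_per_sentence, embedding_size)
word_embedding = embedding_matrix[ids]
print(word_embedding.shape)  # (2, 4, 2)
```

Stacking several such sentence sets along a new leading axis is exactly what produces the 4-D tensor discussed below.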
Now, a set of sentences represented by word vectors as above is a 3-dimensional tensor, but if I have multiple sets of sentences, it becomes 4-dimensional.
To be able to feed in whole sentences, I have to reduce the dimensionality to 3, which means each sentence should be represented as a single vector rather than a collection of vectors. At first I did this by averaging the word vectors in each sentence. Example (using the word_embedding above):
mean_word_embedding = [
[0.5325, 0.6975],
[0.38 , 0.6425]
]
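The averaging step is just a mean over the word axis; a quick NumPy check of the numbers above:

```python
import numpy as np

word_embedding = np.array([
    [[0.21, 0.43], [0.55, 0.87], [0.73, 0.51], [0.64, 0.98]],
    [[0.21, 0.43], [0.55, 0.87], [0.12, 0.29], [0.64, 0.98]],
])

# Average over the word axis (axis=1 here, because this array is
# only 3-D: sentences x words x embedding).
mean_word_embedding = word_embedding.mean(axis=1)
print(mean_word_embedding)
# [[0.5325 0.6975]
#  [0.38   0.6425]]
```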
But the accuracy of this approach is poor. The other approach I want to try is to feed each word vector of a sentence through an LSTM and use the last output as the sentence's representation vector. I just can't figure out how. Here is a summary of my code (the code of some functions is irrelevant, so it is not shown):
# Create a 3-dimension tensor, with
# the 1st dimension being the number of samples,
# the 2nd being the number of sentences in a sample, and
# the 3rd being the number of words in a sentence
input = tf.placeholder(tf.int32, shape=[None, sentences_length, words_length])
# Convert each word into a vector with size embedding_size.
# A this point, the input tensor has been turned into a 4-dimension tensor,
# with the 4th dimension being the embedding size
# (based on the examples above, embedding size is 2)
word_embedding = convert_word_embedding(input, embedding_size)
# Reduce the 4-dimension tensor into a 3-dimension tensor
# so each sentence is only represented by a single vector.
# With shape [samples, sentences, words, embedding], the word axis is 2.
mean_word_embedding = tf.reduce_mean(word_embedding, axis=2) # This is the average approach
sentence_embedding = recurrent_words(word_embedding, sentences_length, words_length, embedding_size) # This is the alternate approach
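I'm not sure of the exact TF graph code, but the idea behind recurrent_words() can be sketched framework-free: flatten the 4-D tensor into a batch of word sequences, run an LSTM over each sequence, keep only the final hidden state, and reshape back to 3-D. A minimal NumPy sketch of that data flow (the weights are random placeholders, not a trained model, and the gate layout is one common convention):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_state(seq, W, U, b, hidden_size):
    """Run a single-layer LSTM over seq (words x embedding_size)
    and return the final hidden state h of shape (hidden_size,)."""
    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    for x in seq:
        z = W @ x + U @ h + b              # all four gates at once
        i, f, o, g = np.split(z, 4)        # input, forget, output, candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

rng = np.random.default_rng(0)
samples, sentences_length, words_length, embedding_size = 3, 2, 4, 2
hidden_size = 5

# 4-D input: (samples, sentences, words, embedding)
word_embedding = rng.normal(size=(samples, sentences_length,
                                  words_length, embedding_size))

# Random placeholder weights for the 4 stacked gates.
W = rng.normal(size=(4 * hidden_size, embedding_size))
U = rng.normal(size=(4 * hidden_size, hidden_size))
b = np.zeros(4 * hidden_size)

# Flatten to a batch of word sequences, encode each, reshape back to 3-D.
flat = word_embedding.reshape(-1, words_length, embedding_size)
encoded = np.stack([lstm_last_state(s, W, U, b, hidden_size) for s in flat])
sentence_embedding = encoded.reshape(samples, sentences_length, hidden_size)
print(sentence_embedding.shape)  # (3, 2, 5)
```

In TensorFlow 1.x the same reshape-run-reshape pattern would use tf.reshape plus tf.nn.dynamic_rnn, taking the h part of the final LSTMStateTuple; the NumPy version above is only meant to show the shapes and data flow, not to replace the graph code.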
What should I do inside recurrent_words()?
Any help would be greatly appreciated. Thanks.