I am trying to compute Avg-Word2Vec for a text corpus. I created a class for this:

    import numpy as np

    # w2v_model and w2v_words are defined elsewhere (not shown here)

    class W2V:
        final_vectors = []

        def __init__(self, review):
            self.review = review

        def w2v_vec(self):
            for sent in self.review:  # for each review/sentence
                word_vec = np.zeros(50)
                words_count = 0  # num of words with a valid vector in the sentence/review
                for word in sent:  # for each word in a review/sentence
                    if word in w2v_words:
                        vec = w2v_model.wv[word]
                        word_vec += vec
                        words_count += 1
                if words_count != 0:
                    word_vec /= words_count
                return W2V.final_vectors.append(word_vec)

I am trying to use it like this:

    x_train_ = W2V(x_train_sent)
    x_train_w2v = x_train_.w2v_vec()
    print(len(x_train_w2v))

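But calling the method ends with the error "object of type 'NoneType' has no len()". I did search and understand that the function is returning None. However, I don't know how to modify it so that I get the text vectors as output when I call this method with a list of tokenized sentences.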
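
For reference, here is a minimal sketch of how the method might be restructured so that it returns the vectors. The None comes from returning the result of `list.append`, which mutates the list in place and returns None. The sketch collects the vectors in a local list and returns that list once the loop has finished. Note the assumptions: `toy_vectors` is a hypothetical plain dict standing in for `w2v_model.wv`, and the `vectors` and `dim` constructor parameters are introduced only for illustration. They are not part of the original class.

```python
import numpy as np

# Hypothetical stand-in for w2v_model.wv: a plain dict of word -> vector.
toy_vectors = {
    "good": np.array([1.0, 0.0]),
    "movie": np.array([0.0, 1.0]),
}


class W2V:
    def __init__(self, review, vectors, dim):
        self.review = review      # list of tokenized sentences
        self.vectors = vectors    # mapping word -> vector (assumed interface)
        self.dim = dim            # vector dimensionality (50 in the question)

    def w2v_vec(self):
        final_vectors = []        # local list, so repeated calls do not accumulate
        for sent in self.review:
            word_vec = np.zeros(self.dim)
            words_count = 0       # words in this sentence that have a vector
            for word in sent:
                if word in self.vectors:
                    word_vec += self.vectors[word]
                    words_count += 1
            if words_count != 0:
                word_vec /= words_count
            final_vectors.append(word_vec)  # append() itself returns None
        return final_vectors      # return the list after all sentences are done


x_train_sent = [["good", "movie"], ["unknown"], ["good"]]
x_train_w2v = W2V(x_train_sent, toy_vectors, dim=2).w2v_vec()
print(len(x_train_w2v))
```

With a real gensim model, the membership test and lookup would presumably go back to `w2v_words` and `w2v_model.wv[word]` as in the question. Sentences with no known words keep an all-zero vector in this sketch.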