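I am using a text dataset to predict whether the sentiment of a review is positive or negative. I have converted the words to vectors using TF-IDF, and I have loaded the pretrained GloVe embedding file. How can I now use TF-IDF together with the GloVe word embeddings to create an embedding matrix? I want to use the embedding matrix in a recurrent neural network.
I get an index error when creating the embedding matrix. Please correct me if I have done something wrong in the code.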
**TFIDF Vectorizer**
```python
from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer_1 = TfidfVectorizer(max_features=10000, sublinear_tf=True,
                               use_idf=True, stop_words='english')
X_vt = vectorizer_1.fit_transform(X_train)
X_vt.shape
# (426340, 10000)
```
**GloVe Embedding**
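```python
import numpy as np

embedding_index = {}
# Each line holds a word followed by its 100 vector components.
with open('C:/Users/User/glove.6B/glove.6B.100d.txt', encoding="utf-8") as f:
    for line in f:
        values = line.split()
        word = values[0]
        # Convert the string components to floats, not an array of strings.
        coefs = np.asarray(values[1:], dtype='float32')
        embedding_index[word] = coefs
```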
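To check the parsing logic without the full GloVe file, here is a minimal sketch that runs the same kind of loop over two made-up lines in the GloVe text format (a word followed by space-separated floats). The sample words, the 3-dimensional values, and the helper name `load_vectors` are all invented for illustration.

```python
import io

import numpy as np


def load_vectors(lines):
    """Parse 'word v1 v2 ...' lines into a dict of float32 vectors."""
    index = {}
    for line in lines:
        values = line.split()
        if not values:
            continue  # skip blank lines
        index[values[0]] = np.asarray(values[1:], dtype="float32")
    return index


# Made-up sample standing in for glove.6B.100d.txt (3 dimensions, not 100).
sample = io.StringIO("good 0.1 0.2 0.3\nbad -0.1 -0.2 -0.3\n")
sample_index = load_vectors(sample)
print(sample_index["good"].dtype, sample_index["good"].shape)
```

Passing `dtype="float32"` matters here: without it, `np.asarray` on the split strings produces an array of strings rather than numbers.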
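```python
# vocab_size is defined elsewhere; it must be greater than every index in
# vectorizer_1.vocabulary_, otherwise the assignment below raises IndexError.
embedding_matrix = np.zeros((vocab_size, 100))
# vocabulary_ maps each term to its column index in the TF-IDF matrix.
for feature, index in vectorizer_1.vocabulary_.items():
    embedding_vector = embedding_index.get(feature)
    if embedding_vector is not None:
        embedding_matrix[index] = embedding_vector
```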
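For reference, this is a minimal self-contained sketch of the matrix construction on toy data. It assumes that each word's row should be its column index in the TF-IDF vocabulary, which is what the fitted vectorizer's `vocabulary_` attribute maps each term to in scikit-learn. The `vocabulary` and `toy_index` dicts and the helper `build_embedding_matrix` are hypothetical stand-ins, not the real data.

```python
import numpy as np


def build_embedding_matrix(vocabulary, embedding_index, dim):
    """Return a (max index + 1, dim) matrix; rows for unknown words stay zero."""
    # Size the matrix from the largest index so every assignment is in range.
    n_rows = max(vocabulary.values()) + 1
    matrix = np.zeros((n_rows, dim), dtype="float32")
    for word, idx in vocabulary.items():
        vector = embedding_index.get(word)
        if vector is not None:
            matrix[idx] = vector
    return matrix


# Toy stand-ins: a term -> index mapping and 3-dimensional "pretrained" vectors.
vocabulary = {"bad": 0, "good": 1, "movie": 2}
toy_index = {
    "good": np.array([0.1, 0.2, 0.3], dtype="float32"),
    "bad": np.array([-0.1, -0.2, -0.3], dtype="float32"),
}
toy_matrix = build_embedding_matrix(vocabulary, toy_index, dim=3)
print(toy_matrix.shape)
```

If the matrix is allocated with fewer rows than the largest index in the vocabulary, the row assignment raises an `IndexError`, so sizing the matrix from the vocabulary itself avoids that mismatch.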