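I'm still quite new to Gensim and am trying to train my first model with word2vec. All the parameters look fairly straightforward, but I don't know how to track the model's loss to see its progress. I would also like to get the embeddings after each epoch, so that I can show that the predictions become more logical from one epoch to the next. How can I do this?
Or is it better to train with iter = 1 each time and save the loss and embeddings after every epoch? That doesn't sound very efficient.
There is not much code to show, but I'm posting it below anyway: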
Answer 0 (score: 4)
from gensim.models import Word2Vec
from gensim.models.callbacks import CallbackAny2Vec


class MonitorCallback(CallbackAny2Vec):
    """Print the training loss and nearest neighbours after each epoch."""

    def __init__(self, test_words):
        self._test_words = test_words

    def on_epoch_end(self, model):
        print("Model loss:", model.get_latest_training_loss())  # print loss
        for word in self._test_words:  # show wv logic changes
            print(model.wv.most_similar(word))


"""
prepare datasets etc.
...
...
"""

monitor = MonitorCallback(["word", "I", "less"])  # monitor with demo words
model = Word2Vec(sentences=trainset,
                 iter=5,  # epochs (renamed `epochs` in Gensim 4.x)
                 min_count=10,
                 size=150,  # renamed `vector_size` in Gensim 4.x
                 workers=4,
                 sg=1,
                 hs=1,
                 negative=0,
                 window=9999,
                 compute_loss=True,  # needed for get_latest_training_loss() to report a value
                 callbacks=[monitor])
Gensim lets us use callbacks for this purpose, as in the example above.
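The question also asks for the embeddings after each epoch. As a minimal sketch (the `EmbeddingSnapshots` class and its `record` method are hypothetical helpers, not part of Gensim), one option is to copy the vectors of a few chosen words every time `on_epoch_end` fires. It only assumes that `model.wv` supports `word in model.wv` and `model.wv[word]`:

```python
class EmbeddingSnapshots:
    """Collect copies of selected word vectors, one snapshot per epoch.

    Hypothetical helper: not part of Gensim.
    """

    def __init__(self, words):
        self.words = list(words)
        self.history = []  # history[i] maps word -> vector after epoch i + 1

    def record(self, model):
        snapshot = {}
        for word in self.words:
            if word in model.wv:  # skip words missing from the vocabulary
                # Copy the values so later training cannot change the snapshot.
                snapshot[word] = [float(x) for x in model.wv[word]]
        self.history.append(snapshot)
        return snapshot
```

Calling `snapshots.record(model)` inside `on_epoch_end` of `MonitorCallback` would then leave one entry per epoch in `snapshots.history`, without restarting training with `iter = 1` each time.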
One caveat: the value reported by `get_latest_training_loss` may be incorrect (bad luck: GitHub is down right now, so I can't check). I tested this code and the loss kept increasing, which looks strange. `logging` is another option, and Gensim has issues about this.
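One possible explanation for the increasing loss, stated here as an assumption rather than something the answer confirms: `get_latest_training_loss()` seems to report a running total accumulated since training started, not a per-epoch value. If so, the loss for a single epoch is the difference between consecutive readings. A small sketch of that bookkeeping (`EpochLossTracker` is a hypothetical helper, not a Gensim class):

```python
class EpochLossTracker:
    """Turn a running (cumulative) loss total into per-epoch losses.

    Hypothetical helper: assumes each reading is a total since training began.
    """

    def __init__(self):
        self._previous_total = 0.0
        self.per_epoch = []

    def update(self, cumulative_loss):
        # The loss of the epoch that just ended is the increase since the last reading.
        epoch_loss = cumulative_loss - self._previous_total
        self._previous_total = cumulative_loss
        self.per_epoch.append(epoch_loss)
        return epoch_loss
```

Inside `on_epoch_end`, one could call `tracker.update(model.get_latest_training_loss())` and print the returned value. If the per-epoch numbers shrink while the raw total grows, the behaviour above would be expected rather than a sign of a broken model.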