Question

I am using the LSTM autoencoder code from here.

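My main goal is to extract topics from this model. However, I don't know which layer is the right one to use for that. What I do at the moment is extract 600 features (sequence length 60, 10 topics), which make no sense to me.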

If I ask for 10 topics, I expect to see 10 topics, each with its own distribution over the data.
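To make the shapes concrete, here is a minimal sketch of what I think is happening. It assumes the encoder returns one 10-dimensional vector per timestep, so each document is encoded as a (60, 10) array. The `encoded` array is random stand-in data, and the mean pooling and softmax steps are illustrative choices of mine, not something the linked code does.

```python
import numpy as np

seq_len, n_topics = 60, 10

# Hypothetical encoder output for 3 documents: one n_topics vector per timestep
rng = np.random.default_rng(0)
encoded = rng.normal(size=(3, seq_len, n_topics))

# Flattening each document gives 60 * 10 = 600 features
flat = encoded.reshape(len(encoded), -1)

# Averaging over the time axis keeps one score per topic and document
pooled = encoded.mean(axis=1)

# Softmax turns the scores into a distribution over the 10 topics
shifted = pooled - pooled.max(axis=1, keepdims=True)
doc_topic = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

print(flat.shape, pooled.shape, doc_topic.shape)
```

In Keras, an LSTM layer with `return_sequences=False` returns a single vector per sequence, which would avoid the extra time axis without any pooling. Whether that applies here depends on how the encoder in the linked code is built.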

Any ideas on how I should implement this and proceed?

So far I have implemented the following code:

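    import numpy as np
    from keras.models import load_model

    def get_topics_strength(model, vocab, topn=10):
        """Return the topn highest-weighted (word, weight) pairs per output unit."""
        topics = []
        # First weight array of the model; rows are assumed to index vocabulary tokens
        weights = model.get_weights()[0]
        for idx in range(model.output_shape[1]):
            # argsort is ascending, so reverse it and keep the topn token indices
            token_idx = np.argsort(weights[:, idx])[::-1][:topn]
            topics.append([(vocab[x], weights[x, idx]) for x in token_idx])

        return topics

    def revdict(d):
        """Invert a dict, mapping values back to keys."""
        return dict((v, k) for (k, v) in d.items())

    ae_lstm = load_model('./Data/simple_ae_to_compare.hdf5')
    # word_freqs is defined elsewhere (not shown)
    topics_strength = get_topics_strength(ae_lstm, revdict(word_freqs), topn=10)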

This code yields 600 features, because the sequence length is 60 and I want 10 topics.
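For reference, the ranking step of `get_topics_strength` can be checked in isolation on a toy matrix. This is only a sketch: it assumes the weights have shape (vocab_size, n_topics), which may not hold for the first weight array of an LSTM model. The names `top_words_per_topic`, `toy_vocab` and `toy_weights` are made up for illustration.

```python
import numpy as np

def top_words_per_topic(weights, vocab, topn=10):
    """Return, for each column (topic), the topn (word, weight) pairs with the largest weights.

    Assumes weights has shape (vocab_size, n_topics) and vocab maps row index -> word.
    """
    topics = []
    for topic_idx in range(weights.shape[1]):
        # argsort is ascending, so reverse it and keep the first topn row indices
        token_idx = np.argsort(weights[:, topic_idx])[::-1][:topn]
        topics.append([(vocab[int(i)], float(weights[i, topic_idx])) for i in token_idx])
    return topics

# Hypothetical toy data: 4 words, 2 topics
toy_vocab = {0: "cat", 1: "dog", 2: "stock", 3: "bond"}
toy_weights = np.array([
    [0.9, 0.1],
    [0.8, 0.0],
    [0.1, 0.7],
    [0.0, 0.6],
])

toy_topics = top_words_per_topic(toy_weights, toy_vocab, topn=2)
print(toy_topics)
```

Here the number of topics is read from the weight matrix itself rather than from `model.output_shape[1]`, so the loop always matches the matrix being ranked.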

Could you point out which part I am missing?

Thanks.

Extracting topics from an LSTM autoencoder model

0 answers: