Convert a generator expression (<genexpr>) to a list?

Time: 2016-07-22 21:02:01

Tags: python generator

I am doing some topic modeling, and I would like to store some of the results of my analysis.

import pandas as pd, numpy as np, scipy
import sklearn.feature_extraction.text as text
from sklearn import decomposition

descs = ["You should not go there", "We may go home later", "Why should we do your chores", "What should we do"]

vectorizer = text.CountVectorizer()

dtm = vectorizer.fit_transform(descs).toarray()

vocab = np.array(vectorizer.get_feature_names())  # newer scikit-learn releases: get_feature_names_out()

nmf = decomposition.NMF(3, random_state = 1)

topic = nmf.fit_transform(dtm)

topic_words = []

for t in nmf.components_:
    word_idx = np.argsort(t)[::-1][:20]
    topic_words.append(vocab[i] for i in word_idx)

for t in range(len(topic_words)):
    print("Topic {}: {}\n".format(t, " ".join([word for word in topic_words[t]])))

This prints:

Topic 0: do we should your why chores what you there not may later home go

Topic 1: should you there not go what do we your why may later home chores

Topic 2: we may later home go what do should your you why there not chores

I am trying to write these topics to a file, so I thought storing them in a list might work, like this:

l = []
for t in range(len(topic_words)):
    l.append([word for word in topic_words[t]])
    print("Topic {}: {}\n".format(t, " ".join([word for word in topic_words[t]])))

l just ends up empty: no words are stored in it. How can I store these words in a list?

1 Answer:

Answer 0 (score: 3)

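You are appending generator expressions to the list topic_words. A generator can be iterated only once, so the first print loop exhausts them and nothing is left when you iterate over them again. You can append lists instead: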

topic_words = []

for t in nmf.components_:
    word_idx = np.argsort(t)[::-1][:20]
    topic_words.append([vocab[i] for i in word_idx])
#                      ^                          ^
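
To see why the original loop came up empty, here is a minimal sketch that uses a plain Python list in place of the NumPy vocabulary. The values of vocab and word_idx below are stand-ins chosen for illustration, not the output of the question's code. A generator expression yields its items once and is then exhausted, while a list comprehension builds a list that can be read as often as needed.

```python
# Stand-in data; in the question these come from the vectorizer and NMF.
vocab = ["chores", "do", "go", "home"]
word_idx = [2, 0, 3]

gen = (vocab[i] for i in word_idx)    # generator expression: lazy, single use
first_pass = list(gen)                # consumes the generator
second_pass = list(gen)               # nothing left to yield

words = [vocab[i] for i in word_idx]  # list comprehension: stored, reusable

print(first_pass)   # ['go', 'chores', 'home']
print(second_pass)  # []
print(words)        # ['go', 'chores', 'home']
```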

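Once topic_words holds lists, you obviously don't need a new list, and you can print directly (note that enumerate(topic_words, 1) numbers the topics from 1 rather than 0):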

for t, words in enumerate(topic_words, 1):
    print("Topic {}: {}\n".format(t, " ".join(words)))
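
Since the stated goal was to write the topics to a file, something along these lines might work once topic_words holds lists. This is only a sketch: the helper name write_topics, the file name topics.txt, and the sample data are assumptions made for illustration and do not come from the question.

```python
import os
import tempfile


def write_topics(topic_words, path):
    """Write one 'Topic N: word word ...' line per topic (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        for t, words in enumerate(topic_words):
            f.write("Topic {}: {}\n".format(t, " ".join(words)))


# Stand-in for the lists built from nmf.components_.
sample_topics = [["do", "we", "should"], ["should", "you", "there"]]

# Assumed output location; any writable path would do.
out_path = os.path.join(tempfile.gettempdir(), "topics.txt")
write_topics(sample_topics, out_path)

# Read the file back to check what was written.
with open(out_path, encoding="utf-8") as f:
    written = f.read()
print(written)
```

Numbering starts at 0 here to match the output shown in the question; pass 1 as the second argument to enumerate to start at 1, as the print loop above does.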