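Replacing many print statements at once

Time: 2018-08-08 15:08:23

Tags: python python-3.x

I am trying to print the output of my script, but to do so I have to write a lot of print statements. Is there a way to print every topic without writing out all of those prints by hand?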

import mglearn
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

dataset = pd.read_csv('text.csv', encoding='utf-8')
comments = dataset['comments']
# remove_small_words is not defined in this snippet; it presumably comes from
# a preprocessing step on `comments` that was left out of the question.
comments_list = remove_small_words.values.tolist()

vector = CountVectorizer()
X = vector.fit_transform(comments_list)

lda = LatentDirichletAllocation(n_components=30, learning_method="batch", max_iter=25, random_state=0)

# One row per document, one column per topic.
document_topics = lda.fit_transform(X)
# For each topic, word indices sorted from most to least important.
sorting = np.argsort(lda.components_, axis=1)[:, ::-1]
# On scikit-learn older than 1.0, use vector.get_feature_names() instead.
feature_names = np.array(vector.get_feature_names_out())

# print_topics prints the table itself and returns None, so there is nothing to assign or print.
mglearn.tools.print_topics(topics=range(30), feature_names=feature_names, sorting=sorting, topics_per_chunk=5, n_words=10)

print("Topic 0:")
docs = np.argsort(document_topics[:, 0])[::-1]
for i in docs:
    # In Python 3 the comments are already str, so no .encode('utf-8') is needed
    # (calling bytes.split(",") would raise a TypeError).
    print(" ".join(comments_list[i].split(",")[:2]) + "\n")
print()
print()
print("Topic 1:")
docs = np.argsort(document_topics[:, 1])[::-1]
for i in docs:
    print(" ".join(comments_list[i].split(",")[:2]) + "\n")
print()
print()
...
print("Topic 40:")
docs = np.argsort(document_topics[:, 40])[::-1]
for i in docs:
    print(" ".join(comments_list[i].split(",")[:2]) + "\n")
print()
print()

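For example, could I print everything in a loop instead of writing the same block 40 times? Printing these 40 topics takes 240 lines of code. Imagine if I needed to print 100... This is the output I get, and I want to keep it:

Topic 0:

blabla

blabla

Topic 1:

blabla

blabla

Topic 3:

blabla

blabla

...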

1 answer:

Answer 0 (score: 3)

You can use string formatting to build the header that is printed for each topic:

# document_topics has one column per topic, so its width is the topic count.
for i in range(document_topics.shape[1]):
    print("Topic {}:".format(i))

Then, since you have i, you can add the remaining statements like this:

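for i in range(document_topics.shape[1]):
    print("Topic {}:".format(i))
    # Document indices, highest weight for topic i first.
    docs = np.argsort(document_topics[:, i])[::-1]
    for j in docs:
        # No .encode('utf-8') in Python 3: the comments are already str.
        print(" ".join(comments_list[j].split(",")[:2]) + "\n")
    # Two blank lines between topics, as in the question's output.
    print()
    print()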
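
If you want to try the loop without the CSV file or a fitted model, here is a minimal sketch on made-up data. The helper name format_topic_report, the top_n parameter, and the toy numbers are my own assumptions for illustration; they are not part of the question. It builds the text and returns it rather than printing directly, which makes it easier to check or write to a file.

```python
import numpy as np


def format_topic_report(document_topics, comments_list, top_n=None):
    """Return the per-topic listing as one string (hypothetical helper)."""
    lines = []
    n_topics = document_topics.shape[1]  # one column per topic
    for topic_idx in range(n_topics):
        lines.append("Topic {}:".format(topic_idx))
        # Document indices sorted by descending weight for this topic;
        # top_n=None keeps all of them.
        docs = np.argsort(document_topics[:, topic_idx])[::-1][:top_n]
        for doc_idx in docs:
            # Keep the first two comma-separated pieces, as in the question.
            lines.append(" ".join(comments_list[doc_idx].split(",")[:2]))
            lines.append("")
        lines.append("")
    return "\n".join(lines)


# Toy stand-in for lda.fit_transform(X): 3 documents, 2 topics (made-up values).
document_topics = np.array([[0.9, 0.1],
                            [0.2, 0.8],
                            [0.6, 0.4]])
comments_list = ["good,service,fast", "bad,food", "ok,price,fair"]

print(format_topic_report(document_topics, comments_list, top_n=2))
```

With the real model you would pass the array returned by lda.fit_transform(X) and your own comments_list; the number of topics is read from the array, so nothing changes when you go from 40 topics to 100.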