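I am trying to loop through all the files in a folder and copy the contents of every Excel file among them. However, when merging everything into a single string, I run into a problem splitting the full path inside the loop.
I need a sub-loop for each path or file.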
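One way to picture the loop and sub-loop described above is the minimal sketch below. It assumes only the Python standard library; the names `find_excel_files`, `merge_names`, and `EXCEL_SUFFIXES` are hypothetical and not part of the original code. It only locates the Excel files and splits each full path into its parts. Reading the cell contents would need an extra library such as pandas or openpyxl, which is left out here.

```python
import os
from pathlib import Path

# Hypothetical set of extensions treated as "Excel files" in this sketch.
EXCEL_SUFFIXES = {".xls", ".xlsx", ".xlsm"}


def find_excel_files(root):
    """Return (directory, stem, suffix) for every Excel file under root."""
    found = []
    # Outer loop: one iteration per directory in the tree.
    for dirpath, _dirnames, names in os.walk(root):
        # Sub-loop: one iteration per file in that directory.
        for name in sorted(names):
            full_path = Path(dirpath) / name
            if full_path.suffix.lower() in EXCEL_SUFFIXES:
                # Split the full path into folder, base name and extension.
                found.append(
                    (str(full_path.parent), full_path.stem, full_path.suffix)
                )
    return found


def merge_names(found):
    """Merge the base names of the files found into a single string."""
    return "\n".join(stem for _directory, stem, _suffix in found)
```

Calling `merge_names(find_excel_files("some_folder"))` would give one string with a file name per line; `"some_folder"` is a placeholder path.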
print('\nTopic id, number of documents, list of documents with probability and represented topic words: ')
dic_topic_doc = {}
# open the output file once, so that every document adds a line instead of
# overwriting the file on each iteration
with open("text_file1.txt", "w") as text_file:
    for index, doc in enumerate(doc_clean):
        bow = dictionary.doc2bow(doc)
        # get topic distribution of the ldamodel
        t = ldamodel.get_document_topics(bow)
        # sort by probability in descending order so that the top
        # contributing topic id comes first
        sorted_t = sorted(t, key=lambda x: x[1], reverse=True)
        # write the filename together with its sorted topic distribution
        text_file.write("%s\n" % str((filenames[index], sorted_t)))
        # get the top scoring item
        top_item = sorted_t[0]
        # create dictionary and keep key as topic id and filename
        # and probability in tuple as value
        dic_topic_doc.setdefault(top_item[0], []).append((filenames[index], top_item[1]))
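
The grouping step can be tried without a trained model. The sketch below is an illustration under stated assumptions: `group_by_top_topic` and `format_report` are hypothetical helper names, and the hard-coded distributions stand in for the `(topic_id, probability)` lists that `ldamodel.get_document_topics` returns. The topic words mentioned in the header line are not covered, since they require the model itself.

```python
def group_by_top_topic(filenames, distributions):
    """Map each topic id to the (filename, probability) pairs it dominates."""
    grouped = {}
    for name, dist in zip(filenames, distributions):
        if not dist:
            # Skip documents for which no topic was returned.
            continue
        # Pick the topic with the highest probability for this document.
        topic_id, prob = max(dist, key=lambda pair: pair[1])
        grouped.setdefault(topic_id, []).append((name, prob))
    return grouped


def format_report(grouped):
    """Build one line per topic: topic id, number of documents, documents."""
    lines = []
    for topic_id in sorted(grouped):
        # List the most strongly associated documents first.
        docs = sorted(grouped[topic_id], key=lambda d: d[1], reverse=True)
        lines.append("%s\t%d\t%s" % (topic_id, len(docs), docs))
    return lines


# Made-up sample values, used only to show the shape of the data.
sample_files = ["a.txt", "b.txt", "c.txt"]
sample_dists = [
    [(0, 0.2), (1, 0.8)],
    [(0, 0.9), (1, 0.1)],
    [(1, 0.6), (0, 0.4)],
]

if __name__ == "__main__":
    for line in format_report(group_by_top_topic(sample_files, sample_dists)):
        print(line)
```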