Question

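I'm writing a report and need to count the unique words in a set of text files.

My texts are in D:\shakeall, 42 files in total...

I know a little Python, but I don't know how to approach this.

This is how I think it should work: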

Read the directory
Fill a word list from the texts
Count the total / unique words

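That's about all I know, plus a little about for, while, lists and indexes, variables...

What I'd like to do is build my own library of functions and use it to get the result.

I'd really appreciate any advice on this.

------ P.S.

I know almost nothing about Python. I can only do simple math or print a list of words. The topic I was given is too hard for me. Sorry.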

Answer 1

# Read the whole file and split it on whitespace into a flat list of words
with open('somefile.txt', 'r') as textfile:
    text_list = textfile.read().split()
# A set keeps only one copy of each word
unique_words = set(text_list)
print(len(unique_words))

That's the general gist of it.
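
If you also want the total count, and want "The" and "the," to count as the same word, something like the following might work. It is only a sketch: the names normalize and count_words are made up for illustration, and treating a word as "lower-cased, with surrounding punctuation stripped" is my assumption, not something the question specifies.

```python
import string


def normalize(word):
    """Lower-case a token and strip punctuation from both ends.

    This definition of a "word" is an assumption; adjust it to your needs.
    """
    return word.strip(string.punctuation).lower()


def count_words(text):
    """Return a (total, unique) pair of word counts for a string."""
    words = [normalize(token) for token in text.split()]
    words = [w for w in words if w]  # drop tokens that were only punctuation
    return len(words), len(set(words))


if __name__ == "__main__":
    sample = "To be, or not to be: that is the question."
    print(count_words(sample))  # prints (10, 8)
```

To use it on a file, read the file's contents into a string first and pass that string to count_words.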

Answer 2

import os

uniquewords = set()

# Walk D:\shakeall and its subdirectories, adding every word from every file
for root, dirs, files in os.walk("D:\\shakeall"):
    for name in files:
        with open(os.path.join(root, name)) as f:
            uniquewords.update(f.read().split())

print(list(uniquewords))
print(len(uniquewords))
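
Since you mentioned wanting your own library of functions, here is one way the same idea could be split into reusable pieces that also report the total word count. Treat it as a rough sketch: iter_words and count_directory are names I made up, and the UTF-8 encoding and the lower-casing of words are assumptions about your files and about what should count as the same word.

```python
import os
from collections import Counter


def iter_words(path, encoding="utf-8"):
    """Yield lower-cased, whitespace-separated words from one file.

    The encoding is an assumption; undecodable bytes are skipped.
    """
    with open(path, encoding=encoding, errors="ignore") as handle:
        for line in handle:
            for word in line.split():
                yield word.lower()


def count_directory(directory):
    """Return a Counter mapping each word to its number of occurrences
    across all files under the given directory (subdirectories included)."""
    counts = Counter()
    for root, _dirs, files in os.walk(directory):
        for name in files:
            counts.update(iter_words(os.path.join(root, name)))
    return counts


if __name__ == "__main__":
    counts = count_directory(r"D:\shakeall")
    print("total words: ", sum(counts.values()))
    print("unique words:", len(counts))
```

A Counter is used instead of a plain set so that one pass gives both numbers: the sum of its values is the total, and its length is the number of unique words. counts.most_common(10) would also list the ten most frequent words.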

How can I count the unique words in text files in a specific directory with Python?
