Question

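I'm trying to write some code in Python. I have found answers that would solve my problem, but they all seem to do more than I need. I'm trying to open a text file and list each unique word that occurs in it. Eventually I'll add a counter for the number of times each word appears, but I'm not there yet and am only asking for help with the word list. When I try to call the function, I get the error "builtins.NameError: name 'filename' is not defined", so I can't even see whether the code works. Any help is greatly appreciated.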

    def computeWordFrequencies(filename):
        f = open ('filename.txt','r') # Opens the file as read
        line = f.readlines() # Reads the file
        L[0] = [] # Lists the unique words that occur in the file
        L[1] = [] # Upon completion, this variable will count 
        #the number of appearances of each word
        for line in f:
            L[0].append(line.split())
            L[0] = uniqueExtend(L[0])
        return(L[0])

Answer 1

If you only want the unique words, the following will work:

set( open('filename.txt').read().split() )

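This creates a list of all the words in the file (open('filename.txt').read().split()) and then builds a set from it (set( ... )). A set is like a list but holds only one of each item, so doing this automatically makes all the entries unique.

Note that this does not account for punctuation, capitalization and so on: "Dog", "dog" and "dog." would count as three different words.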
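
If punctuation and case do matter, one possible approach is to lowercase each word and strip punctuation from its ends before building the set. This is only a minimal sketch: the helper name unique_words and the choice to strip just the ASCII characters in string.punctuation are assumptions, not part of the answer above.

```python
import string

def unique_words(text):
    """Return the set of lowercased words in text, with punctuation stripped from both ends."""
    words = set()
    for token in text.split():
        # strip() only removes punctuation at the ends of a token,
        # so the apostrophe inside "don't" is kept.
        word = token.strip(string.punctuation).lower()
        if word:  # skip tokens that consisted only of punctuation
            words.add(word)
    return words

sample = "The cat saw the dog. The dog, startled, ran!"
print(sorted(unique_words(sample)))
# ['cat', 'dog', 'ran', 'saw', 'startled', 'the']
```

For a file, the same helper could be applied to the result of open('filename.txt').read().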

Answer 2

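I think you want to write something like this, passing the filename parameter to open instead of the literal string 'filename.txt':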

def computeWordFrequencies(filename):
    with open(filename, 'r') as f:  # Opens the file named by the argument for reading
        lines = f.readlines()  # Reads the file into a list of lines
    unique_words = []  # Lists the unique words that occur in the file
    # A counter for the number of appearances of each word can be added later
    for line in lines:
        for word in line.split():
            if word not in unique_words:  # Keep only the first occurrence of each word
                unique_words.append(word)
    return unique_words

Answer 3

from collections import Counter

def computeWordFrequencies(filename):
    with open(filename) as f:
        words = [word for line in f for word in line.split()]
    words_count = Counter(words)
    unique_words = words_count.keys()
    return unique_words

computeWordFrequencies('filename.txt')

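As for the NameError: I assume you called the function with a variable named filename that you had never defined, instead of passing the file name as a string.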
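
To make the difference concrete, here is a small sketch of both calls. The function body is a stand-in that opens nothing, and the failing call is an assumption about what the asker typed, since the question does not show it.

```python
def computeWordFrequencies(filename):
    # Stand-in body: it only reports which file name it received.
    return "would open " + filename

# A quoted string is a value, so this call works.
result = computeWordFrequencies('filename.txt')
print(result)

# A bare name is looked up as a variable. Nothing called filename exists
# outside the function, so Python raises NameError before the call happens.
try:
    computeWordFrequencies(filename)
except NameError as err:
    message = str(err)
    print("NameError:", message)
```

Inside the function, filename is a parameter and is perfectly valid; the error comes from the calling code, where no such variable has been assigned.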

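Counter is your friend for counting frequencies. It takes a list and returns a dict-like object with the words as keys and their counts as values, so Counter(['a', 'a', 'b']) returns Counter({'a': 2, 'b': 1}).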
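
A short sketch of what can be done with the resulting Counter, using the same three-item list; the variable names are only illustrative.

```python
from collections import Counter

words = ['a', 'a', 'b']
counts = Counter(words)

print(counts['a'])             # 2
print(counts['missing'])       # 0: an absent key counts as zero instead of raising KeyError
print(counts.most_common(1))   # [('a', 2)]
print(list(counts))            # ['a', 'b']: the unique words, in first-seen order
```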

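The with open syntax is called a context manager. You should prefer it, because it closes the file automatically when the block ends, which your code never did.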
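
The automatic close can be checked through the file object's closed attribute. The sketch below writes a throwaway temporary file first so that it runs on its own; the file contents are made up for illustration.

```python
import os
import tempfile

# Create a small temporary file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as tmp:
    tmp.write("one two\nthree\n")

with open(path) as f:
    words = f.read().split()
    inside = f.closed   # False: the file is still open inside the block

after = f.closed        # True: the file was closed on leaving the block
print(words, inside, after)

os.remove(path)  # clean up the temporary file
```

The file is also closed if an exception is raised inside the with block, which a manual f.close() at the end of the function would not guarantee.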

I'm not sure how you want to use the frequency counts, but you now have a working example, so you can decide that for yourself.

When you run this Python code, make sure the text file is in the same directory as the code.
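
Strictly speaking, a relative name such as 'filename.txt' is looked up in the current working directory, which is often but not always the directory of the script. If that becomes a problem, one option is to build the path from the script's own location. The helper name path_next_to and the example path below are hypothetical.

```python
from pathlib import Path

def path_next_to(script_file, name):
    """Build the path of name inside the directory that contains script_file."""
    return Path(script_file).parent / name

# In a real script you would pass __file__ as the first argument. A made-up
# path is used here so the example also runs in an interactive session.
example = path_next_to('/home/user/project/wordcount.py', 'filename.txt')
print(example.name)          # filename.txt
print(example.parent.name)   # project
```

The result can be passed straight to open, for example computeWordFrequencies(path_next_to(__file__, 'filename.txt')).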

If you need more help, let me know.
