Question

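In this program I build a dictionary from a plain text file: I count how many times each word appears in the document, so the word becomes the key and the number of times it appears is the value. I can create the dictionary, but then I cannot search it. Here is my updated code, incorporating your suggestions. I really appreciate your help.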

from collections import defaultdict
import operator
def readFile(fileHandle):
    d = defaultdict(int)
    with open(fileHandle, "r") as myfile:
        for currline in myfile: 
            for word in currline.split():
                d[word] +=1
    return d

def reverseLookup(dictionary, value):
    for key in dictionary.keys():
        if dictionary[key] == value:
            return key
    return None

afile = raw_input ("What is the absolute file path: ")
print readFile (afile)

choice = raw_input ("Would you like to (1) Query Word Count (2) Print top words to a new document     (3) Exit: ") 
if (choice == "1"):
    query = raw_input ("What word would like to look up? ")
    print reverseLookup(readFile(afile), query)
if (choice == "2"):
    f = open("new.txt", "a")
    d = dict(int)
    for w in text.split():
        d[w] += 1
    f.write(d)
    file.close (f)
if (choice == "3"):
    print "The EXIT has HAPPENED"
else:
    print "Error"

Answer 1

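Your approach is very complicated (and syntactically wrong, at least in the code sample you posted).

Also, you are rebinding the built-in name dict, which is problematic as well.

Besides, this functionality is already built into Python: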

from collections import defaultdict

def readFile(fileHandle):
    d = defaultdict(int)  # Access to an undefined key creates an entry with value 0
    with open(fileHandle, "r") as myfile:   # File will automatically be closed
        for currline in myfile:             # Loop through file line-by-line
            for word in currline.strip().split(): # Loop through words w/o CRLF
                d[word] +=1                 # Increase word counter
    return d

As for your reverseLookup function, see ypercube's answer.
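As an alternative sketch of the same counting step, the standard library's collections.Counter does the tallying in one call. The helper name count_words and the sample text are made up for illustration, and the snippet is written for Python 3, whereas the question's code is Python 2.

```python
from collections import Counter

def count_words(text):
    # Hypothetical helper: split on any whitespace and tally each word
    return Counter(text.split())

counts = count_words("the cat and the hat")
print(counts["the"])   # 2
print(counts["dog"])   # 0, a Counter returns 0 for missing keys
```

To count a file rather than a string, the same function could be fed the result of myfile.read() inside a with open(...) block.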

Answer 2

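Your code returned after looking at only the first (key, value) pair. You have to search the whole dictionary before reporting that the value has not been found.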

def reverseLookup(dictionary, value):
    for key in dictionary.keys():
        if dictionary[key] == value:
            return key
    return None

You also should not return "error", because it could be a word and therefore a key in your dictionary!
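A minimal Python 3 sketch of that idea: scan every pair and collect all the words that match, so that "nothing found" is an empty list rather than a string that could itself be a word. The name words_with_count and the sample counts are illustrative only. Note that a count read from user input is a string and would have to be converted with int() before the comparison can succeed.

```python
def words_with_count(counts, n):
    # Look at every (word, count) pair; several words can share one count
    return [word for word, count in counts.items() if count == n]

counts = {"the": 2, "cat": 1, "hat": 1}
print(words_with_count(counts, 1))          # ['cat', 'hat']
print(words_with_count(counts, 5))          # []
print(words_with_count(counts, int("2")))   # ['the']
```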

Answer 3

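Depending on how you intend to use this reverseLookup() function, you might find your code much happier with two dictionaries: build the first dictionary as you already do, then build a second dictionary that maps each number of occurrences to the words that occurred that many times. Then your reverseLookup() would not need to run the for k in d.keys() loop on every query. That loop would happen only once, and every lookup after that would be significantly faster.

I have cobbled together some code (but not tested it) that shows what I am talking about. I stole Tim's readFile() routine because I like the look of it more :) but moved his nice function-local dictionary d to global scope, just to keep the functions short and sweet. In a "real project" I would probably wrap the whole thing in a class, to allow an arbitrary number of dictionaries at run time and to provide reasonable encapsulation. This is just demo code. :)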

import operator
from collections import defaultdict

d = defaultdict(int)
numbers_dict = {}

def readFile(fileHandle):
    with open(fileHandle, "r") as myfile:
        for currline in myfile:
            for word in currline.split():
                d[word] +=1
    return d


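def prepareReverse():
    # Build the count -> [words] mapping once, after readFile() has filled d
    for (k, v) in d.items():
        numbers_dict.setdefault(v, []).append(k)

def reverseLookup(v):
    # Return every word that occurred exactly v times (empty list if none)
    return numbers_dict.get(v, [])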

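If you intend to do two or more lookups, this code trades memory for execution speed. You iterate over the dictionary only once (iterating over all elements is not a dictionary's strong point), but at the cost of duplicated data in memory.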
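The class-based variant mentioned above might look roughly like the following Python 3 sketch. WordIndex, count_of and words_with_count are hypothetical names chosen for illustration, and the class takes a string rather than a file path to keep the example self-contained.

```python
from collections import defaultdict

class WordIndex:
    """Hypothetical wrapper holding both mappings for one text."""

    def __init__(self, text):
        self.counts = defaultdict(int)       # word -> number of occurrences
        for word in text.split():
            self.counts[word] += 1
        self.by_count = defaultdict(list)    # number of occurrences -> words
        for word, n in self.counts.items():
            self.by_count[n].append(word)

    def count_of(self, word):
        # .get avoids creating an entry for a word that was never seen
        return self.counts.get(word, 0)

    def words_with_count(self, n):
        # Constant-time lookup; the full scan happened once in __init__
        return self.by_count.get(n, [])

index = WordIndex("the cat and the hat")
print(index.count_of("the"))        # 2
print(index.words_with_count(1))    # ['cat', 'and', 'hat']
```

Each WordIndex instance keeps its own pair of dictionaries, so several texts can be indexed side by side without sharing global state.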

Answer 4

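The search does not work because your dictionary maps each word to its count, so getting the number of occurrences of a word should be just dictionary[word]. You do not really need reverseLookup(); dict already has a .get(key, default_value) method: dictionary.get(value, None)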
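A short Python 3 sketch of the direct lookup, with made-up sample data and a hypothetical lookup helper. Using 0 as the default here is an illustrative choice; None works the same way.

```python
counts = {"the": 2, "cat": 1}

def lookup(counts, word):
    # dict.get returns the default instead of raising KeyError
    return counts.get(word, 0)

print(lookup(counts, "the"))   # 2
print(lookup(counts, "dog"))   # 0
```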

Why can't I search the dictionary I created (Python)?

4 answers: