Question

I have a text file that lists a thousand words from a to z. It looks like this example:

a
aaoo
aloor
azur
black
blue
church
croccoli
dark
den
...
zic
zip

I need to build a dictionary whose keys are lowercase letters and whose values are the sets of words for the given letter. For example:

myDict={'a':['aaoo','aloor','azur'], 'b':['black','blue'], 'c': ['church', 'croccoli'],'d':['dark','den'], and so on}

Then I need to prompt the user to enter a word and print all the words in the file that contain all the characters of that word.

wordFind=("Enter word: ")
wordFind=wordFind.lower()
wordFind=set(wordFind) #I convert to set to use intersection

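For example, if I enter the word "abc", then 'a', 'b', 'c' in wordFind will intersect with the keys in myDict, and the printed result will contain all the values stored under the keys 'a', 'b', 'c' of myDict.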

Edit: by "intersection" here I mean the intersection between the two sets (wordFind and myDict). I hope that is clear enough.

So far, my code is:

n=open("a7.txt","r")

line=n.readlines()

myDict={}
def myDict():
    for word in line:
        word=word.strip().lower()
        if word in myDict:
            myDict[word[0]].append(word)
        else:
            myDict[word[0]]=word


wordFind=("Enter word: ")
wordFind=wordFind.lower()
wordFind=set(wordFind)

# I get stuck at this second part, which requires me to use intersection
temp={}
for word in wordFind:
    temp= word.intersection(myDict)
    print(temp)

    n.close()

But I get this error:

Traceback (most recent call last):
  File "/Users/annie_mabu/Documents/CS/bt2.py", line 21, in <module>
    temp= word.intersection(myDict)
AttributeError: 'str' object has no attribute 'intersection'

Can anyone tell me where I made a mistake and how to fix it?

Answer 1

Just read the words into a list under each first-letter key:

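with open(ur_file) as f:   # ur_file: path to your word list
    d={}
    for word in f:
        word = word.strip()   # drop the trailing newline
        if word:              # skip blank lines
            d.setdefault(word[0].lower(), []).append(word)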

Then you have a dictionary like this:

>>> d
{'a': ['a', 'aaoo', 'aloor', 'azur'], 'c': ['church', 'croccoli'], 'b':  ['black', 'blue'], 'd': ['dark', 'den'], 'z': ['zic', 'zip']}

Then you can write a simple function to find your word:

>>> def f(s): return s in d[s[0]]
... 
>>> f('art')
False
>>> f('aaoo')
True

Or, if you know the file contains all 26 letters, you can initialize all 26 letters to empty lists:

d={k:list() for k in 'abcdefghijklmnopqrstuvwxyz'}
with open(ur_file) as f:
    for word in f:
        d[word[0].lower()].append(word.strip())   # strip the trailing newline
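
A third option, not used in this answer, is collections.defaultdict from the standard library, which creates the empty list the first time a key is touched. This is only a minimal sketch: the helper name group_by_first_letter and the sample lines are made up for illustration and stand in for the real file.

```python
from collections import defaultdict

def group_by_first_letter(lines):
    """Group words by their lowercase first letter (hypothetical helper)."""
    groups = defaultdict(list)  # missing keys start out as empty lists
    for line in lines:
        word = line.strip().lower()
        if word:  # ignore blank lines
            groups[word[0]].append(word)
    return groups

# Sample lines standing in for the contents of the word file (assumption)
sample = ["a\n", "aaoo\n", "black\n", "blue\n", "\n", "zip\n"]
d = group_by_first_letter(sample)
print(d["b"])  # ['black', 'blue']
```

Passing an open file object instead of the sample list should work the same way, since iterating a file yields its lines.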

By '.intersection' you are probably thinking of sets:

>>> set(['a', 'aaoo', 'aloor', 'azur']).intersection(set(['art']))
set([])
>>> set(['a', 'aaoo', 'aloor', 'azur']).intersection(set(['aaoo']))
set(['aaoo'])

However, whether you have a list, dict, set, or string, the in keyword is the best way to test membership of a single element:

>>> 'art' in set(['a', 'aaoo', 'aloor', 'azur'])
False
>>> 'azur' in set(['a', 'aaoo', 'aloor', 'azur'])
True
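
If the goal is really to intersect the letters of the input with the dictionary's keys, as the edit in the question suggests, the intersection has to be taken on the whole set of letters. Calling .intersection on one character fails because each character is a str, which is the AttributeError shown in the question. A rough sketch, assuming a small hand-written myDict in place of the one built from the file; words_for_letters is a hypothetical name:

```python
def words_for_letters(word, word_dict):
    """Return the words stored under every key that is also a letter of `word`."""
    letters = set(word.lower())
    common = letters.intersection(word_dict)  # iterating a dict yields its keys
    result = []
    for key in sorted(common):  # sorted only to get a stable output order
        result.extend(word_dict[key])
    return result

# Hand-written stand-in for the dictionary built from the file (assumption)
myDict = {'a': ['aaoo', 'aloor', 'azur'], 'b': ['black', 'blue'],
          'c': ['church', 'croccoli'], 'd': ['dark', 'den']}
print(words_for_letters("abc", myDict))
```

Letters of the input that are not keys of the dictionary simply drop out of the intersection, so no KeyError can occur.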

Answer 2

"I need to convert that text file into a dictionary, like: myDict = {'a': all words starting with 'a', 'b': all words starting with 'b', and so on}"

I'd start with this function:

def make_dict():
    d = {}
    with open("a7.txt","r") as wordfile:
        for word in wordfile:
            word = word.strip().lower()
            first = word[0]
            if first not in d: d[first] = []
            d[first].append(word)
    return d

myDict = make_dict()

"Then I use wordFind =("Enter word: "), and when I enter a wordFind that intersects with the keys in myDict, the result will give all the values in that intersection"

If you just want a list of the words that start with the same letter as the word you entered, then something like:

wordFind = raw_input("Enter word: ")   # raw_input for Python2, input for Python3
wordFind = wordFind.lower()
find_first = wordFind[0]
matches = myDict[find_first]
print(matches)

should work.

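If you want a more elaborate match, say the matching words need to start with the same sequence of characters you entered, then something like: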

wordFind = raw_input("Enter word: ")   # raw_input for Python2, input for Python3
wordFind = wordFind.lower()
find_first = wordFind[0]
matches = [w for w in myDict[find_first] if w.startswith(wordFind)]  # This is different
print(matches)

should work.
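
Both snippets raise a KeyError when no word in the file starts with the entered letter, and an IndexError on empty input. A minimal sketch of the prefix match with those two cases guarded, assuming a small hand-written myDict; prefix_matches is a hypothetical helper name:

```python
def prefix_matches(prefix, word_dict):
    """Return the words in word_dict that start with prefix (hypothetical helper)."""
    prefix = prefix.strip().lower()
    if not prefix:  # nothing typed: there is no first letter to look up
        return []
    candidates = word_dict.get(prefix[0], [])  # [] when no word has that first letter
    return [w for w in candidates if w.startswith(prefix)]

# Hand-written stand-in for the dictionary returned by make_dict() (assumption)
myDict = {'a': ['aaoo', 'aloor', 'azur'], 'b': ['black', 'blue']}
print(prefix_matches("bl", myDict))
print(prefix_matches("x", myDict))
```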

Edit (per the comments):

For the input "abc", if you want a list of all the words that start with "a", "b", or "c", then something like the following should work:

wordFind = raw_input("Enter word: ")   # raw_input for Python2, input for Python3
wordFind = wordFind.lower()
matches = []
for c in wordFind: matches.extend(myDict[c])
print(matches)

If you need them kept separate, rather than in a single list, you can do the following:

wordFind = raw_input("Enter word: ")   # raw_input for Python2, input for Python3
wordFind = wordFind.lower()
matches = {}
for c in wordFind: matches[c] = myDict[c]
print(matches)

Answer 3

You may want to expand on what you mean by "intersection", but try the following:

words = [w.strip().lower() for w in open("a7.txt").read().splitlines()]

word_dict = {}
for word in words:
    if word[0] in word_dict:
        word_dict[word[0]].append(word)
    else:
        word_dict[word[0]] = [word]

wordFind = input("Enter word: ").strip().lower()   # use raw_input on Python 2
print('\n'.join(word_dict[wordFind[0]]))
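
If "intersection" instead refers to the requirement stated in the question, printing every word that contains all the characters of the input, a set comparison does it without the per-letter dictionary. This is a sketch under that reading only; words_containing_all and the sample list are assumptions for illustration:

```python
def words_containing_all(letters, words):
    """Return the words that contain every character of `letters`."""
    needed = set(letters.lower())
    # issubset accepts any iterable, so each word is checked character by character
    return [w for w in words if needed.issubset(w)]

# Sample words taken from the question's example list
words = ['aaoo', 'aloor', 'azur', 'black', 'blue',
         'church', 'croccoli', 'dark', 'den']
print(words_containing_all("lo", words))  # ['aloor', 'croccoli']
```

The same test can be written as `set(letters) & set(w) == set(letters)`, which is the explicit intersection form.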

Python 3: Build a dictionary from a text file, then search for words in the dict

3 answers: