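I'm trying to write a function in Python that uses a dictionary to print out the anagrams of the words in a text file. I've looked at what feels like hundreds of similar questions, so I apologize if this is a duplicate, but I can't seem to find a solution that fits my problem.
I understand what I need to do (at least, I think I do), but I'm stuck on the final part.
This is what I have so far: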
with open('words.txt', 'r') as fp:
    line = fp.readlines()

def make_anagram_dict(line):
    dict = {}
    for word in line:
        key = ''.join(sorted(word.lower()))
        if key in dict.keys():
            dict[key].append(word.lower())
        else:
            dict[key] = []
            dict[key].append(word.lower())
    if line == key:
        print(line)

make_anagram_dict(line)
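I think I need something that compares the key of each value with the keys of the other values and prints them when they match, but I can't get that to work.
At the moment, the best I can do is print out all the keys and values in the file, but ideally I would be able to print out all the anagrams in the file.
Output: I don't have a specific required output, but something along the lines of [cat: act, tac] for each anagram.
Apologies again if this is a duplicate; any help would be greatly appreciated.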
Answer 0: (score: 3)
I'm not sure about the output format. In my implementation, all the anagrams are printed at the end.
with open('words.txt', 'r') as fp:
    line = fp.readlines()

def make_anagram_dict(line):
    d = {}  # avoid using 'dict' as variable name
    for word in line:
        # readlines() keeps the trailing newline, so strip it; call lower() only once
        word = word.strip().lower()
        key = ''.join(sorted(word))
        if key in d:  # no need to call keys()
            d[key].append(word)
        else:
            d[key] = [word]  # you can initialize list with the initial value
    return d  # just return the mapping to process it later

if __name__ == '__main__':
    d = make_anagram_dict(line)
    for words in d.values():
        if len(words) > 1:  # several anagrams in this group
            print('Anagrams: {}'.format(', '.join(words)))
Also, consider using defaultdict. It is a dictionary that creates a value of the specified type whenever a new key is accessed.
from collections import defaultdict

with open('words.txt', 'r') as fp:
    line = fp.readlines()

def make_anagram_dict(line):
    d = defaultdict(list)  # argument is the default constructor for value
    for word in line:
        # readlines() keeps the trailing newline, so strip it; call lower() only once
        word = word.strip().lower()
        key = ''.join(sorted(word))
        d[key].append(word)  # now d[key] is always list
    return d  # just return the mapping to process it later

if __name__ == '__main__':
    d = make_anagram_dict(line)
    for words in d.values():
        if len(words) > 1:  # several anagrams in this group
            print('Anagrams: {}'.format(', '.join(words)))
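The question mentions output along the lines of [cat: act, tac]. Here is a minimal sketch of that format, assuming the first word read for each group serves as the label and the remaining words follow it. The helper name format_anagram_groups and the sample word list are illustrative only and are not part of the answer above.

```python
from collections import defaultdict


def make_anagram_dict(lines):
    """Group words by their sorted letters, ignoring case and blank lines."""
    d = defaultdict(list)
    for word in lines:
        word = word.strip().lower()
        if word:
            d[''.join(sorted(word))].append(word)
    return d


def format_anagram_groups(d):
    """Hypothetical helper: render each group as '[first: rest, ...]'."""
    rendered = []
    for words in d.values():
        if len(words) > 1:  # only groups that actually contain anagrams
            head, rest = words[0], words[1:]
            rendered.append('[{}: {}]'.format(head, ', '.join(rest)))
    return rendered


# Made-up sample standing in for the lines of words.txt
sample = ['cat\n', 'act\n', 'dog\n', 'tac\n', 'god\n', 'bird\n']
for text in format_anagram_groups(make_anagram_dict(sample)):
    print(text)
```

With this sample, the groups come out in the order their first word was read, because dictionaries preserve insertion order in Python 3.7 and later.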
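Answer 1: (score: 1)
I'm going to assume that you are grouping the words within the file.
If, on the other hand, you are being asked to find all the English anagrams of a list of words in a file, you will need a way of determining what is or isn't a word. This means you either need an actual "dictionary", as in a set(<of all english words>), or possibly a very sophisticated predicate method.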
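For that second reading, one possible sketch is to index a reference vocabulary by sorted letters and look each input word up in it. The tiny hard-coded ENGLISH_WORDS set below is only a stand-in for a real vocabulary, and the names build_index and find_english_anagrams are hypothetical.

```python
from collections import defaultdict

# Stand-in for a real English vocabulary; an assumption for illustration only.
ENGLISH_WORDS = {'act', 'cat', 'dog', 'god', 'listen', 'silent', 'enlist', 'tree'}


def build_index(vocabulary):
    """Map each sorted-letter key to the set of vocabulary words sharing it."""
    index = defaultdict(set)
    for word in vocabulary:
        index[''.join(sorted(word))].add(word)
    return index


def find_english_anagrams(word, index):
    """Return the vocabulary words that are anagrams of `word`, excluding itself."""
    word = word.lower()
    key = ''.join(sorted(word))
    return sorted(index.get(key, set()) - {word})


index = build_index(ENGLISH_WORDS)
for w in ['cat', 'listen', 'tree']:
    print(w, find_english_anagrams(w, index))
```

Building the index once means each lookup costs a single sort of the query word rather than a scan of the whole vocabulary.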
In any case, here is a relatively simple solution that assumes your words.txt is small enough to be read fully into memory:
with open('words.txt', 'r') as infile:
    words = infile.read().split()

anagram_dict = {word : list() for word in words}

for k, v in anagram_dict.items():
    k_anagrams = (othr for othr in words if (sorted(k) == sorted(othr)) and (k != othr))
    anagram_dict[k].extend(k_anagrams)

print(anagram_dict)
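This isn't the most efficient way to do it, since every word is compared against every other word, but hopefully it gets across the power of filtering.
Arguably, the most important thing here is the if (sorted(k) == sorted(othr)) and (k != othr) filter in the definition of k_anagrams. It is a filter that only lets through words made of the same letters, while weeding out exact matches.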
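If the pairwise comparison becomes too slow, the same per-word mapping can be built by grouping on the sorted letters first. This is only a sketch of one alternative, not the approach used above; the function name anagrams_per_word and the sample words are made up for illustration.

```python
from collections import defaultdict


def anagrams_per_word(words):
    """Map each word to the other words that share its letters."""
    groups = defaultdict(list)
    for word in words:
        groups[''.join(sorted(word))].append(word)
    # Same filter idea as before: same letters, but not the word itself.
    return {
        word: [other for other in groups[''.join(sorted(word))] if other != word]
        for word in words
    }


# Made-up sample standing in for the contents of words.txt
words = 'cat act dog tac god bird'.split()
print(anagrams_per_word(words))
```

Each word is now only compared with the members of its own group instead of with every word in the file.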
Answer 2: (score: 0)
Your code is almost there; it just needs a few tweaks:
import re

def make_anagram_dict(words):
    d = {}
    for word in words:
        word = word.lower()  # call lower() only once
        key = ''.join(sorted(word))  # make the key
        if key in d:  # check if it's in dictionary already
            if word not in d[key]:  # avoid duplicates
                d[key].append(word)
        else:
            d[key] = [word]  # initialize list with the initial value
    return d  # return the entire dictionary

if __name__ == '__main__':
    filename = 'words.txt'
    with open(filename) as file:
        # Use regex to extract words. You can adjust to include/exclude
        # characters, numbers, punctuation...
        # This returns a list of words
        words = re.findall(r"([a-zA-Z\-]+)", file.read())
    # Now process them
    d = make_anagram_dict(words)
    # Now print them
    for group in d.values():
        if len(group) > 1:  # we found anagrams
            # one placeholder for one argument; two placeholders raise IndexError
            print('Anagram group: {}'.format(', '.join(group)))
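To see what the regular expression in this answer extracts, it can be tried on a short string. The sample text below is made up; the pattern is the one used above.

```python
import re

# Made-up sample text containing punctuation, a digit and a hyphenated word
sample = "Listen, the well-known cat (act 2) stayed silent!"

# Same pattern as in the answer: runs of ASCII letters and hyphens
words = re.findall(r"([a-zA-Z\-]+)", sample)
print(words)
```

Hyphenated words are kept whole, while digits and punctuation are dropped. Case is preserved at this stage, which is why make_anagram_dict lowercases each word before building its key.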