Question

Given some letters, for example:

letters = 'hutfb'

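I have been given a file containing a list of words.

I need to write a recursive function that checks every possibility the letters can make. If a possibility is in the word list from the file, I need to print that particular word.

So for the given letters

they can create the words:

a
cat
ac
act
cab

and so on.

For every combination the letters make, I need to check the file to see whether it is a valid word. If it is, I need to print it.

I don't know how to start writing this function.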

Answer 1

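Unfortunately I can't help with the recursive function right now. But given that a few more letters/characters can easily explode into billions of potential combinations if nothing is filtered out while they are generated, I have a quirky alternative: iterate over the known words instead. Those have to be in memory anyway.

[Edit] Removed the sorting, since it did not really provide any benefit, and fixed an issue where I was mistakenly setting found to True while iterating.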

# Some letters, separated by space
letters = 'c a t b'
# letters = 't t a c b'

# # Assuming a word per line, this is the code to read it
# with open("words_on_file.txt", "r") as words:
#     words_to_look_for = [x.strip() for x in words]
#     print(words_to_look_for)

# Alternative for quick test
known_words = [
    'cat',
    'bat',
    'a',
    'cab',
    'superman',
    'ac',
    'act',
    'grumpycat',
    'zoo',
    'tab'
]

# Create a list of chars by splitting
list_letters = letters.split(" ")

for word in known_words:
    # Create a list of chars
    list_word = list(word)
    if len(list_word) > len(list_letters):
        # We cannot have longer words than we have count of letters
        # print(word, "too long, skipping")
        continue

    # Now iterate over the chars in the word and see if we have
    # enough chars/letters
    temp_letters = list_letters[:]

    # This was originally False as default, but if we iterate over each
    # letter of the word and succeed we have a match
    found = True
    for char in list_word:
        # print(char)
        if char in temp_letters:
            # Remove char so it cannot match again
            # list.remove() takes only the first found
            temp_letters.remove(char)
        else:
            # print(char, "not available")
            found = False
            break

    if found is True:
        print(word)
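The same letter-by-letter check can be sketched more compactly with collections.Counter. This is only an alternative illustration, not part of the original answer; the helper name can_build is made up for this example, and the word list is the quick-test list from above.

```python
from collections import Counter

def can_build(word, letters):
    """Return True if `word` uses no letter more often than `letters` supplies it."""
    # Counter subtraction keeps only positive counts, so an empty result
    # means every letter the word needs is available often enough.
    return not (Counter(word) - Counter(letters))

known_words = ['cat', 'bat', 'a', 'cab', 'superman',
               'ac', 'act', 'grumpycat', 'zoo', 'tab']

matches = [word for word in known_words if can_build(word, 'catb')]
print(matches)  # ['cat', 'bat', 'a', 'cab', 'ac', 'act', 'tab']
```

Because the counts are compared, a word such as 'tact' only matches when the letters contain two t's.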

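You can copy and paste the product function from the itertools documentation and use the code provided by ExtinctSpecie. It has no further dependencies, but I found that without tweaking it returns every possible option, including ones that repeat characters, which I did not understand right away.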

def product(*args, repeat=1):
    # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
    # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
    pools = [tuple(pool) for pool in args] * repeat
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)
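To make the repetition point concrete, here is a small sketch comparing itertools.product with itertools.permutations on the same four letters. The letters and the candidate length of 3 are chosen only for illustration.

```python
from itertools import permutations, product

letters = 'catb'

# product() draws from the full pool at every position, so letters can repeat.
with_repeats = {''.join(p) for p in product(letters, repeat=3)}

# permutations() uses each position of `letters` at most once per candidate.
without_repeats = {''.join(p) for p in permutations(letters, 3)}

print(len(with_repeats), len(without_repeats))          # 64 24
print('ccc' in with_repeats, 'ccc' in without_repeats)  # True False
```

If each letter may only be used as often as it was given, permutations is probably the closer fit.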

Answer 2

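I agree with @dparolin about processing the word file to see whether each word fits the letters, rather than generating possible words and checking whether they are in the file. This also means we don't have to read the file into memory, since we only need to check one word at a time. And it can be done with a recursive test: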

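letters = 'catbt'

def is_match(letters, word):
    # Both arguments are lists and are consumed (mutated) by the recursion

    if not word:
        # Every character of the word has been matched
        return True

    if not letters:
        # Out of letters while the word still has unmatched characters
        return False

    letter = letters.pop()

    if letter in word:
        # Use this letter for one occurrence in the word
        word.remove(letter)

    return is_match(letters, word)

with open('words.txt') as words:
    for word in words:
        word = word.strip()

        # Skip blank lines, which would otherwise match trivially;
        # pass fresh lists because is_match() empties them
        if word and is_match(list(letters), list(word)):
            print(word)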

Example usage

% python3 test.py
act
at
bat
cab
cat
tab
tact
%

This should handle a large number of letters without any problem.

Answer 3

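import itertools

# Avoid naming the variable `str`, which would shadow the built-in type
letter_string = "c a t b"
letters = list(letter_string.replace(" ", ""))
words_to_look_for = []

# Build every candidate of length 1 up to len(letters);
# note that product() allows a letter to repeat within a candidate
for length in range(1, len(letters) + 1):
    keywords = [''.join(i) for i in itertools.product(letters, repeat=length)]
    words_to_look_for.extend(keywords)

print(words_to_look_for)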

https://stackoverflow.com/questions/7074051/....

Answer 4

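As mentioned above, this will not be usable for more letters than you can count on two hands. There are simply too many possibilities to check. But if you want to try it anyway, this is what the code would look like.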

import itertools

letters = ['a', 'b', 'c']

def powerset(letters):
    # Start with the empty set, then for each letter add a copy of every
    # subset built so far with that letter included
    output = [set()]
    for x in letters:
        output.extend([y.union({x}) for y in output])
    return output

for subset in powerset(letters):
    for potential_word in map(''.join, itertools.permutations(list(subset))):
        # Check if potential_word is a word
        pass

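This will not try words with repeated letters (that would be another level of insanity), but it will try every potential word that can be formed from any subset of the letters you provide, in any order.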
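To get a feel for how fast the number of candidates grows, here is a rough counting sketch. It assumes n distinct letters and counts every ordered arrangement of 1 to n of them; the function name candidate_count is made up for this example, and math.perm requires Python 3.8 or newer.

```python
from math import perm

def candidate_count(n):
    """Count the ordered arrangements of 1..n letters drawn from n distinct letters."""
    # perm(n, k) is n! / (n - k)!, the number of ways to arrange k of n letters.
    return sum(perm(n, k) for k in range(1, n + 1))

for n in (3, 5, 10, 13):
    print(n, candidate_count(n))
```

Ten letters already give close to ten million candidates, and thirteen letters give more than ten billion.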

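[edit] Just realized you asked for a recursive solution. Dunno whether that is required or not, but the powerset function could be changed to be recursive. I think that would make it harder to understand, though.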
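For what it's worth, a recursive powerset could look roughly like the sketch below. The find_words helper and the small known_words set are made up for illustration and stand in for the words read from the file. Like the iterative version, it works on sets, so repeated letters are collapsed.

```python
from itertools import permutations

def powerset(letters):
    """Recursively build every subset of `letters` as a list of sets."""
    if not letters:
        # Base case: the only subset of nothing is the empty set.
        return [set()]
    first, rest = letters[0], letters[1:]
    without_first = powerset(rest)
    # Every subset either leaves out the first letter or includes it.
    with_first = [subset | {first} for subset in without_first]
    return without_first + with_first

def find_words(letters, known_words):
    """Return the sorted known words that some arrangement of a subset spells."""
    found = set()
    for subset in powerset(letters):
        for candidate in map(''.join, permutations(sorted(subset))):
            if candidate in known_words:
                found.add(candidate)
    return sorted(found)

known_words = {'cat', 'bat', 'a', 'cab', 'superman', 'ac', 'act', 'zoo', 'tab'}
print(find_words(['c', 'a', 't', 'b'], known_words))
# ['a', 'ac', 'act', 'bat', 'cab', 'cat', 'tab']
```

Keeping known_words as a set makes each membership check cheap, but the number of candidates still grows very quickly with the number of letters.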
