Question

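I'm trying to write a function that takes two user inputs: a word and a maximum length. The function reads from a text file (opened earlier in the program), looks at all the words that fit within the given maximum length, and returns a list of the words in the file that contain all the letters of the word the user gave. Here is my code so far: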

def comparison():
    otherWord = input("Enter word: ")
    otherWord = list(otherWord)
    maxLength = input("What is the maximum length of the words you want: ")
    listOfWords = []
    for line in file:
        line = line.rstrip()
        letterCount = 0
        if len(line) <= int(maxLength):
            for letter in otherWord:
                if letter in line:
                    letterCount += 1
            if letterCount == len(otherWord):
                listOfWords.append(line)
    return listOfWords

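This code works, but my problem is that it doesn't account for repeated letters in the words read from the file. For example, if I enter "GREEN" as otherWord, the function returns a list of words containing the letters G, R, E, and N. I want it to return only words that contain two E's. I imagine I'll also have to adjust the letterCount part, since duplicates affect it, but for now I'm more concerned with recognizing the duplicates. Any help would be much appreciated.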

Answer 1

You can use a Counter for otherWord, like this:

>>> from collections import Counter
>>> otherWord = 'GREEN'
>>> otherWord = Counter(otherWord)
>>> otherWord
Counter({'E': 2, 'R': 1, 'N': 1, 'G': 1})

Then your check would look like this:

if len(line) <= int(maxLength):
    match = True
    for l, c in otherWord.items():
        if line.count(l) < c:
            match = False
            break
    if match:
        listOfWords.append(line)

You can also write it without the match variable by using Python's for..else construct:

if len(line) <= int(maxLength):
    for l, c in otherWord.items():
        if line.count(l) < c:
            break
    else:
        listOfWords.append(line)
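
For reference, here is one way the pieces above might fit together into a complete function. This is only a sketch under a few assumptions: the name words_containing and its parameters are made up for illustration, the lines are passed in as an argument (any iterable of strings, such as an open file) rather than read from a global file object, and the maximum length is already an int.

```python
from collections import Counter


def words_containing(other_word, lines, max_length):
    """Return the lines no longer than max_length that contain every
    letter of other_word at least as many times as other_word does."""
    needed = Counter(other_word)  # e.g. Counter('GREEN') has E: 2
    matches = []
    for line in lines:
        line = line.rstrip()  # drop the trailing newline
        if len(line) > max_length:
            continue
        for letter, count in needed.items():
            if line.count(letter) < count:
                break  # a required letter is missing or too rare
        else:
            # the loop finished without break: every letter was covered
            matches.append(line)
    return matches


# hypothetical sample data standing in for the lines of the file
sample = ["GREEN\n", "GENRE\n", "RENEGE\n", "GRIN\n", "EVERGREEN\n"]
print(words_containing("GREEN", sample, 6))  # ['GREEN', 'GENRE', 'RENEGE']
```

The comparison is case-sensitive, so you may want to normalize both the input word and the lines with upper() or lower() first.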

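Edit: if you want an exact match on the character counts, check for equality instead, and additionally check whether there are any extra characters (which is the case whenever the line length differs).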
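
A minimal sketch of that exact-match variant, assuming a hypothetical helper named is_exact_match: comparing the lengths first rules out extra characters, and the per-letter check then requires equal counts. Comparing the two Counter objects directly should give the same answer in one step.

```python
from collections import Counter


def is_exact_match(other_word, line):
    """True if line uses exactly the letters of other_word, with the
    same number of occurrences of each (i.e. it is an anagram)."""
    if len(line) != len(other_word):
        return False  # a different length means extra or missing characters
    needed = Counter(other_word)
    return all(line.count(letter) == count for letter, count in needed.items())


print(is_exact_match("GREEN", "GENRE"))   # True
print(is_exact_match("GREEN", "RENEGE"))  # False, one E too many
# Equivalent one-step check: the two multisets of letters are equal
print(Counter("GENRE") == Counter("GREEN"))  # True
```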

Answer 2

You can use collections.Counter to perform (multi)set operations:

In [1]: from collections import Counter

In [2]: c = Counter('GREEN')

In [3]: l = Counter('GGGRREEEENN')

In [4]: c & l  # find intersection
Out[4]: Counter({'E': 2, 'R': 1, 'G': 1, 'N': 1})

In [5]: c & l == c  # are all letters in "GREEN" present in "GGGRREEEENN"?
Out[5]: True

In [6]: c == l  # Or if you want, test for equality
Out[6]: False

So your function might become:

def word_compare(inputword, wordlist, maxlength):
    c = Counter(inputword)
    # keep words no longer than maxlength that contain every letter of inputword
    return [word for word in wordlist if len(word) <= maxlength
                                      and c & Counter(word) == c]
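
As a side note, the same containment test can also be expressed with Counter subtraction, which discards zero and negative counts: the difference is empty exactly when the candidate word covers every letter of the input word. The sketch below assumes a made-up helper name, contains_all_letters, and a hypothetical word list.

```python
from collections import Counter


def contains_all_letters(inputword, word):
    """True if word has every letter of inputword at least as often."""
    # Counter subtraction keeps only positive counts, so the result is
    # empty (falsy) when nothing from inputword is left uncovered.
    return not (Counter(inputword) - Counter(word))


# hypothetical word list and maximum length, for illustration only
words = ["GREEN", "GENRE", "GRIN", "EVERGREEN"]
print([w for w in words if len(w) <= 6 and contains_all_letters("GREEN", w)])
# ['GREEN', 'GENRE']
```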

Comparing two strings, including repeated letters?
