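I need the output to contain only the words that match the pattern exactly: the same letters in the same positions, and those letters must not appear anywhere else in the word. For example: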
words = ['hatch','catch','match','chat','mates']
pattern = '_atc_'
Desired output:
['hatch','match']
I tried using nested loops, but it doesn't work for patterns that begin and end with '_':
def filter_words_list(words, pattern):
    relevant_words = []
    for word in words:
        if len(word) == len(pattern):
            for i in range(len(word)):
                for j in range(len(pattern)):
                    if word[i] != pattern[i]:
                        break
                    if word[i] == pattern[i]:
                        relevant_words.append(word)
thx!
Answer 0 (score: 1)
You can use a regular expression:
import re
words = ['hatch','catch','match','chat','mates']
pattern = re.compile('[^atc]atc[^atc]')
result = list(filter(pattern.fullmatch, words))
print(result)
Output
['hatch', 'match']
The pattern '[^atc]atc[^atc]' matches one character that is not a, t, or c ([^atc]), followed by 'atc', followed by another character that is not a, t, or c.
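The regex above is hard-coded for '_atc_'. Here is a minimal sketch of building it from any underscore pattern. The helper name pattern_to_regex and its wildcard parameter are my own, not part of the re module, and I'm assuming '_' is the only wildcard character.

```python
import re

def pattern_to_regex(pattern, wildcard='_'):
    """Hypothetical helper: turn an underscore pattern into a compiled regex."""
    # Letters that are fixed by the pattern (everything except the wildcard).
    letters = ''.join(sorted(set(pattern) - {wildcard}))
    # A wildcard position may hold any character except the pattern's own letters.
    any_other = f'[^{re.escape(letters)}]' if letters else '.'
    parts = [any_other if ch == wildcard else re.escape(ch) for ch in pattern]
    return re.compile(''.join(parts))

words = ['hatch', 'catch', 'match', 'chat', 'mates']
regex = pattern_to_regex('_atc_')
print(regex.pattern)                              # [^act]atc[^act]
print([w for w in words if regex.fullmatch(w)])   # ['hatch', 'match']
```

Using fullmatch means words of a different length are rejected without a separate length check.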
As an alternative, you can write your own matching function, which works with any given pattern:
from collections import Counter
def full_match(word, pattern='_atc_'):
    if len(pattern) != len(word):
        return False
    pattern_letter_counts = Counter(e for e in pattern if e != '_')  # count characters that are not wild card
    word_letter_counts = Counter(word)  # count letters
    if any(count != word_letter_counts.get(ch, 0) for ch, count in pattern_letter_counts.items()):
        return False
    return all(p == w for p, w in zip(pattern, word) if p != '_')  # the word must match in all characters that are not wild card
words = ['hatch', 'catch', 'match', 'chat', 'mates']
result = list(filter(full_match, words))
print(result)
Output
['hatch', 'match']
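If you would rather keep the loop style from the question, the same rule can probably be written as a single pass over paired characters. This is only a sketch: it reuses the question's function name, and the wildcard parameter is my own addition. The attempt in the question compares '_' directly with the word's letter, so it breaks on the first wildcard, and it appends the word once per matching character instead of once per word.

```python
def filter_words_list(words, pattern, wildcard='_'):
    # Letters fixed by the pattern; they may not appear at wildcard positions.
    fixed_letters = set(pattern) - {wildcard}
    relevant_words = []
    for word in words:
        if len(word) != len(pattern):
            continue
        for p, w in zip(pattern, word):
            if p == wildcard:
                if w in fixed_letters:
                    break  # a pattern letter showed up in a wildcard slot
            elif p != w:
                break      # a fixed position does not match
        else:
            # The inner loop finished without break: every position is fine.
            relevant_words.append(word)
    return relevant_words

words = ['hatch', 'catch', 'match', 'chat', 'mates']
print(filter_words_list(words, '_atc_'))  # ['hatch', 'match']
```

The for/else appends each word at most once, and only after all positions have been checked.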
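Answer 1 (score: 1)
You should use a regular expression and replace each underscore with '.', which matches any single character. The input then looks like this: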
words = ['hatch','catch','match','chat','mates']
pattern = '.atc.'
and the code is:
import re
def filter_words_list(words, pattern):
    ret = []
    for word in words:
        # fullmatch requires the whole word to match, so longer words are rejected
        if re.fullmatch(pattern, word):
            ret.append(word)
    return ret
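A quick check of how the '.' wildcard behaves may be useful here. In this sketch the word 'hatchet' is added purely for illustration, and I'm assuming the pattern contains only letters and underscores. re.match anchors only at the start of the string, while re.fullmatch requires the whole word to match. In both cases '.' also accepts the pattern's own letters, so 'catch' gets through.

```python
import re

# 'hatchet' is not in the original list; it is added to show the anchoring difference.
words = ['hatch', 'catch', 'match', 'chat', 'mates', 'hatchet']
dot_pattern = '_atc_'.replace('_', '.')  # '.atc.'

with_match = [w for w in words if re.match(dot_pattern, w)]
with_fullmatch = [w for w in words if re.fullmatch(dot_pattern, w)]

print(with_match)      # ['hatch', 'catch', 'match', 'hatchet']
print(with_fullmatch)  # ['hatch', 'catch', 'match']
```

So if the pattern's letters must not appear at the wildcard positions, the '.' version needs an extra check or a negated character class in place of '.'.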
Hope this helps.