Finding semordnilaps (reverse anagrams) of words in a string

Time: 2018-12-21 04:34:32

Tags: python python-3.x string anagram

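I am trying to take a string as input, for example a sentence, and find every word whose reverse also appears in the sentence. This is what I have so far: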

s = "Although he was stressed when he saw his desserts burnt, he managed to stop the pots from getting ruined"

def semordnilap(s):
    s = s.lower()
    b = "!@#$,"
    for char in b:
        s = s.replace(char,"")
    s = s.split(' ')

    dict = {}
    index=0
    for i in range(0,len(s)):
        originalfirst = s[index]
        sortedfirst = ''.join(sorted(str(s[index])))
        for j in range(index+1,len(s)):
            next = ''.join(sorted(str(s[j])))
            if sortedfirst == next:
                dict.update({originalfirst:s[j]})
        index+=1

    print (dict)

semordnilap(s)

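This works for the most part, but if you run it you will see that it also pairs "he" with "he" as an anagram, which is not what I want. Any suggestions on how to fix this, and on whether the run time can be made faster for a large text file, would be appreciated.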

1 Answer:

Answer 0: (score: 1)

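You can split the string into a list of words and then compare the lowercase versions of every combination of two words, keeping the pairs in which one word is the reverse of the other. The following example uses re.findall() to split the string into a list of words and itertools.combinations() to generate the pairs to compare: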

import itertools
import re

s = "Although he was stressed when he saw his desserts burnt, he managed to stop the pots from getting ruined"

words = re.findall(r'\w+', s)
pairs = [(a, b) for a, b in itertools.combinations(words, 2) if a.lower() == b.lower()[::-1]]

print(pairs)
# OUTPUT
# [('was', 'saw'), ('stressed', 'desserts'), ('stop', 'pots')]
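The reason the code in the question pairs "he" with "he" is that comparing sorted letters tests whether two words are anagrams, and any word is an anagram of itself. Comparing against the reversed word is a stricter test. A minimal sketch of the difference follows; the helper names is_anagram and is_reversal are hypothetical and used only for illustration:

```python
def is_anagram(a, b):
    # Same letters in any order: this is what comparing sorted letters tests.
    return sorted(a) == sorted(b)

def is_reversal(a, b):
    # b is exactly a written backwards.
    return a == b[::-1]

print(is_anagram("he", "he"))           # True: a word is an anagram of itself
print(is_reversal("he", "he"))          # False: "he" reversed is "eh"
print(is_reversal("was", "saw"))        # True
print(is_anagram("listen", "silent"))   # True
print(is_reversal("listen", "silent"))  # False: an anagram, but not a reversal
```

Note that a palindrome such as "level" that appears twice in the text would still be paired with itself by the reversal test, so you may want to exclude that case separately.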

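Edit: I still prefer the solution above, but given your comment about not importing any packages, see below. Note, however, that depending on the nature of the text, str.translate() used this way may have unintended consequences (for example, stripping the @ from email addresses); in other words, you may need to handle punctuation more carefully than this. Also, I would normally import string and use string.punctuation rather than spelling out the punctuation characters as a literal string, but that is avoided below in order to do this without imports.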

s = "Although he was stressed when he saw his desserts burnt, he managed to stop the pots from getting ruined"

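# Python 3: str.translate() takes a single table, built here with str.maketrans()
# (the third argument lists the characters to delete).
words = s.translate(str.maketrans('', '', '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')).split()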
length = len(words)
pairs = []
for i in range(length - 1):
    for j in range(i + 1, length):
        if words[i].lower() == words[j].lower()[::-1]:
            pairs.append((words[i], words[j]))

print(pairs)
# OUTPUT
# [('was', 'saw'), ('stressed', 'desserts'), ('stop', 'pots')]
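On the run-time part of the question: both versions above compare every pair of words, so the work grows quadratically with the number of words. For a large text file, one possible alternative is to remember the words seen so far in a set and look up each word's reverse, which takes roughly linear time. The sketch below is an assumption on my part, not a drop-in replacement: the function name find_semordnilaps is hypothetical, and unlike the code above it lowercases the words, skips palindromes, and reports each pair only once.

```python
import re

def find_semordnilaps(text):
    """Return (earlier_word, later_word) pairs where one is the other reversed."""
    words = re.findall(r'\w+', text.lower())
    seen = set()    # every word encountered so far
    paired = set()  # words that already belong to a reported pair
    pairs = []
    for word in words:
        rev = word[::-1]
        # Skip palindromes (rev == word) and pairs that were already reported.
        if rev != word and rev in seen and word not in paired:
            pairs.append((rev, word))
            paired.update((word, rev))
        seen.add(word)
    return pairs

s = "Although he was stressed when he saw his desserts burnt, he managed to stop the pots from getting ruined"
print(find_semordnilaps(s))
# OUTPUT
# [('was', 'saw'), ('stressed', 'desserts'), ('stop', 'pots')]
```

If you need the original capitalisation or every occurrence of a pair, the set would have to be replaced with a dict that maps each lowercase word to its occurrences.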