Question

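I'm trying to take a string as input, for example a sentence, and find all the words in the sentence whose reversed form also appears in it. This is what I have so far: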

s = "Although he was stressed when he saw his desserts burnt, he managed to stop the pots from getting ruined"

def semordnilap(s):
    s = s.lower()
    b = "!@#$,"
    for char in b:
        s = s.replace(char,"")
    s = s.split(' ')

    dict = {}
    index=0
    for i in range(0,len(s)):
        originalfirst = s[index]
        sortedfirst = ''.join(sorted(str(s[index])))
        for j in range(index+1,len(s)):
            next = ''.join(sorted(str(s[j])))
            if sortedfirst == next:
                dict.update({originalfirst:s[j]})
        index+=1

    print (dict)

semordnilap(s)

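This works in most cases, but if you run it you will see that it also pairs "he" with "he" as an anagram, which is not what I want. Any suggestions on how to fix this, and on whether the run time can be made faster if I were to input a large text file?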

Answer 1

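You can split the string into a list of words and then compare the lowercase versions of all combinations, looking for pairs in which one word is the reverse of the other. The following example uses re.findall() to split the string into a list of words and itertools.combinations() to compare them: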

import itertools
import re

s = "Although he was stressed when he saw his desserts burnt, he managed to stop the pots from getting ruined"

words = re.findall(r'\w+', s)
pairs = [(a, b) for a, b in itertools.combinations(words, 2) if a.lower() == b.lower()[::-1]]

print(pairs)
# OUTPUT
# [('was', 'saw'), ('stressed', 'desserts'), ('stop', 'pots')]
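
As a side note on why the code in the question pairs "he" with "he": comparing sorted letters tests whether two words are anagrams, and every word is an anagram of itself, whereas the reversal check only matches a word against its mirror image. A small illustration (the helper names is_anagram and is_reversal are made up for this sketch):

```python
def is_anagram(a, b):
    # Same letters in any order; this is what the sorted() comparison tests.
    return sorted(a.lower()) == sorted(b.lower())

def is_reversal(a, b):
    # b must be exactly a spelled backwards.
    return a.lower() == b.lower()[::-1]

print(is_anagram("he", "he"), is_reversal("he", "he"))      # True False
print(is_anagram("was", "saw"), is_reversal("was", "saw"))  # True True
```

Note that a palindrome that occurs twice in the text (for example "did" and "did") would still pass the reversal check, so you may want to filter those out separately if they matter for your input.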

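Edit: I still prefer the itertools solution above, but given your comment about not importing any packages, see below. Note, however, that depending on the nature of your text, str.translate() used this way may have unintended consequences (for example, stripping the @ from email addresses). In other words, you may need to handle punctuation more carefully than this. Also, I would normally import string and use string.punctuation rather than the literal string of punctuation characters used with str.translate(), but the code below avoids that in keeping with your request to do this without imports.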

s = "Although he was stressed when he saw his desserts burnt, he managed to stop the pots from getting ruined"

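# Python 3: str.translate() takes a single table argument, so build one with
# str.maketrans() (a built-in, no import needed). The Python 2 form was s.translate(None, chars).
punctuation = '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
words = s.translate(str.maketrans('', '', punctuation)).split()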
length = len(words)
pairs = []
for i in range(length - 1):
    for j in range(i + 1, length):
        if words[i].lower() == words[j].lower()[::-1]:
            pairs.append((words[i], words[j]))

print(pairs)
# OUTPUT
# [('was', 'saw'), ('stressed', 'desserts'), ('stop', 'pots')]
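
On the run-time part of the question: comparing every pair of words is quadratic in the number of words, which can get slow for a large text file. One possible alternative, sketched here under the assumption that each distinct pair only needs to be reported once and that palindromes should be skipped, is to put the lowercased words in a set and look up each word's reversal directly. The function name find_semordnilaps is made up for this example.

```python
import re

def find_semordnilaps(text):
    # Hypothetical helper: returns each reversed pair once, in lowercase.
    words = re.findall(r'\w+', text.lower())
    unique = set(words)   # set membership tests are O(1) on average
    seen = set()          # words already reported as part of a pair
    pairs = []
    for word in words:
        rev = word[::-1]
        # Skip palindromes, words already reported, and words with no reversed partner.
        if rev == word or word in seen or rev not in unique:
            continue
        seen.update((word, rev))
        pairs.append((word, rev))
    return pairs

s = "Although he was stressed when he saw his desserts burnt, he managed to stop the pots from getting ruined"
print(find_semordnilaps(s))
# [('was', 'saw'), ('stressed', 'desserts'), ('stop', 'pots')]
```

Unlike the nested-loop versions, this reports a pair only once even if the words repeat, and it returns the lowercased forms rather than the original casing.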

Finding semordnilaps (reversed anagrams) of words in a string
