I'm trying to take a string, such as a sentence, and find every word whose reverse also appears in that sentence. This is what I have so far:
s = "Although he was stressed when he saw his desserts burnt, he managed to stop the pots from getting ruined"
def semordnilap(s):
    s = s.lower()
    b = "!@#$,"
    for char in b:
        s = s.replace(char,"")
    s = s.split(' ')
    dict = {}
    index=0
    for i in range(0,len(s)):
        originalfirst = s[index]
        sortedfirst = ''.join(sorted(str(s[index])))
        for j in range(index+1,len(s)):
            next = ''.join(sorted(str(s[j])))
            if sortedfirst == next:
                dict.update({originalfirst:s[j]})
        index+=1
    print (dict)
semordnilap(s)
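This works for the most part, but if you run it you'll see that it also pairs "he" with "he" as an anagram, which is not what I want. Any suggestions on how to fix this, and on whether the run time can be made faster if I feed in a large text file?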
Answer 0 (score: 1)
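You can split the string into a list of words and then compare the lowercase versions of every combination of two words, with one of the pair reversed. The following example uses re.findall() to split the string into a list of words and itertools.combinations() to do the comparison: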
import itertools
import re
s = "Although he was stressed when he saw his desserts burnt, he managed to stop the pots from getting ruined"
words = re.findall(r'\w+', s)
pairs = [(a, b) for a, b in itertools.combinations(words, 2) if a.lower() == b.lower()[::-1]]
print(pairs)
# OUTPUT
# [('was', 'saw'), ('stressed', 'desserts'), ('stop', 'pots')]
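EDIT: I still prefer the solution above, but per your comment about not importing any packages, see below. Note, however, that str.translate() used this way may have unintended consequences depending on the nature of the text (for example, stripping the @ from email addresses). In other words, you may need to handle punctuation more carefully than this. Also, I would normally import string and use string.punctuation rather than the literal string of punctuation characters I am passing to str.translate(), but I avoided that below in keeping with your request to do this without imports.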
s = "Although he was stressed when he saw his desserts burnt, he managed to stop the pots from getting ruined"
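# Python 3: str.translate() takes a single table, built here with the built-in str.maketrans() (no import needed).
# The two-argument form s.translate(None, chars) only works on Python 2.
words = s.translate(str.maketrans('', '', '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')).split()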
length = len(words)
pairs = []
for i in range(length - 1):
    for j in range(i + 1, length):
        if words[i].lower() == words[j].lower()[::-1]:
            pairs.append((words[i], words[j]))
print(pairs)
# OUTPUT
# [('was', 'saw'), ('stressed', 'desserts'), ('stop', 'pots')]
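On the run-time part of the question: both versions above compare every word with every other word, so the work grows quadratically with the number of words. For a large text file, one possible alternative is to put the distinct lowercase words into a set and look up each word's reverse directly. The sketch below is not part of the original answer. The function name find_semordnilaps is made up for illustration, and it assumes that palindromes such as "level" should not be paired with themselves and that each pair should be reported once, in lowercase.

```python
import re

def find_semordnilaps(text):
    """Return (word, reversed_word) pairs in order of first appearance."""
    # Lowercase the words and drop duplicates while keeping first-seen order.
    words = list(dict.fromkeys(w.lower() for w in re.findall(r"\w+", text)))
    vocabulary = set(words)  # set membership tests take constant time on average
    pairs = []
    seen = set()
    for word in words:
        rev = word[::-1]
        # Skip palindromes and pairs already reported from the other side.
        if rev != word and rev in vocabulary and rev not in seen:
            pairs.append((word, rev))
        seen.add(word)
    return pairs

s = "Although he was stressed when he saw his desserts burnt, he managed to stop the pots from getting ruined"
print(find_semordnilaps(s))
# OUTPUT
# [('was', 'saw'), ('stressed', 'desserts'), ('stop', 'pots')]
```

Each word is visited once, so the running time is roughly proportional to the number of words rather than to its square. Unlike the combinations version, repeated words produce a single pair, and the original capitalization is not preserved.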