Question

I'm implementing a string replacer with conversions like these in mind:

'thou sittest' → 'you sit'
'thou walkest' → 'you walk'
'thou liest' → 'you lie'
'thou risest' → 'you rise'

Naively, I could use a regular expression for find and replace, such as thou [a-z]+est

But the trouble comes with English verbs that end in e: depending on the verb, I need to trim est in some cases and only st in the rest.
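To make the problem concrete, here is a minimal sketch of the naive substitution using Python's re module. The name naive_replace and the exact pattern (with added word boundaries) are assumptions for illustration, not part of the original attempt.

```python
import re

# Naive pattern: capture everything before a trailing "est".
NAIVE = re.compile(r"\bthou ([a-z]+)est\b")

def naive_replace(text):
    """Replace 'thou <verb>est' with 'you <verb>', always cutting 'est'."""
    return NAIVE.sub(r"you \1", text)

for phrase in ["thou sittest", "thou walkest", "thou liest", "thou risest"]:
    print(phrase, "->", naive_replace(phrase))
# thou sittest -> you sitt
# thou walkest -> you walk
# thou liest -> you li
# thou risest -> you ris
```

Only walkest comes out right: verbs ending in e lose a letter too many, and sittest keeps its doubled consonant.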

What is a quick and dirty solution to achieve this?

Answer 1

Probably the quickest and dirtiest:

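import nltk

# Requires the NLTK "words" corpus: run nltk.download('words') once beforehand.
words = set(nltk.corpus.words.words())
for old in 'sittest walkest liest risest'.split():
    new = old[:-2]  # drop the trailing "st"
    # Keep trimming letters until what is left is a known English word.
    while new and new not in words:
        new = new[:-1]
    print(old, new)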

Output:

sittest sit
walkest walk
liest lie
risest rise

Update. Slightly less quick and dirty (it works, for example, for rotest → the verb rot rather than the noun rote):

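from nltk.corpus import wordnet as wn

# Requires the NLTK "wordnet" corpus: run nltk.download('wordnet') once beforehand.
for old in 'sittest walkest liest risest rotest'.split():
    new = old[:-2]  # drop the trailing "st"
    # Keep trimming letters until WordNet knows the remainder as a verb.
    while new and not wn.synsets(new, pos='v'):
        new = new[:-1]
    print(old, new)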

Output:

sittest sit
walkest walk
liest lie
risest rise
rotest rot
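The trimming loop above can be plugged into the thou ... est regular expression from the question to rewrite whole sentences. The following is only a sketch under stated assumptions: KNOWN_VERBS is a tiny hypothetical stand-in for a real lexicon (in practice the WordNet check could be passed in as is_verb), and the names strip_est and modernize are made up for illustration. Unlike the loops above, it falls back to the original word when no verb is found.

```python
import re

# Hypothetical stand-in for a real lexicon lookup such as the WordNet check.
KNOWN_VERBS = {"sit", "walk", "lie", "rise", "rot"}

def strip_est(word, is_verb=KNOWN_VERBS.__contains__):
    """Drop 'st', then trim letters until a known verb remains."""
    stem = word[:-2]
    while stem and not is_verb(stem):
        stem = stem[:-1]
    return stem or word  # assumption: leave unknown words unchanged

PATTERN = re.compile(r"\bthou ([a-z]+est)\b")

def modernize(text, is_verb=KNOWN_VERBS.__contains__):
    """Rewrite every 'thou <verb>est' in text as 'you <verb>'."""
    return PATTERN.sub(lambda m: "you " + strip_est(m.group(1), is_verb), text)

print(modernize("thou liest and thou risest"))  # you lie and you rise
```

To use WordNet instead of the hardcoded set, one could pass is_verb=lambda w: bool(wn.synsets(w, pos='v')).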

Handling English verbs that end in "e"
