Question

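Basically, what I want to do is create a program that takes a sentence/paragraph as user input, looks up a synonym for each word, and replaces the word with that synonym. The program I've written so far works perfectly, but it has a few kinks/human errors/logic errors. This is what I have right now: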

import myFuncs  # my own module; myFuncs.replace(word) looks the word up on thesaurus.com and returns a synonym

badWords = ["the", "be"]  # no-synonym words to leave alone (only two example entries shown here)

response = input("Enter what you want to thesaurize: ")
orig = response.split()  # turns the sentence into a list of its words
new = []  # separate list for the new words, in case I want to go back to the original sentence
for word in orig:
    if word not in badWords:  # makes sure the word is not a no-synonym word like "the" or "be"
        new.append(myFuncs.replace(word))  # replaces the word with a synonym
    else:
        new.append(word)  # an excluded word is left alone and kept in the new sentence

final = " ".join(new)  # turns the list of new words into a string that can be printed
print(final)

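Again, this works perfectly, but there are some problems. I've narrowed them down to 4 basic issues:

1) A word has no synonym but still isn't in the list of excluded words;

2) The wrong sense of a word is looked up, or a meaning is returned that makes no sense in the context of the user's input;

3) The wrong tense of a verb is returned;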

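4) When a noun is entered, a verb form of the word is returned, and vice versa (i.e. "I will grill the chicken" becomes "I will add fuel to the fire the chicken" or something like that).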

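I could fix all of these manually by having the user go through every word that doesn't make sense and then using nested if-else and other control structures to guide them to the right word, but I think that would be tedious for the user and would defeat the whole point, especially if they enter a lot of text.

So what I'm asking is: which of these problems can I automate? That is, is there a way to write code so that the computer recognizes these problems itself? Fixing them is easy; the hard part is getting the program to recognize the logic errors instead of leaving that to the user.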

Answer 1

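You should look into NLP (natural language processing), and POS tagging (part-of-speech tagging) in particular. POS tagging labels each word in a text corpus with its word class (verb, noun, and so on) and its grammatical form. A good Python library for this is NLTK, the Natural Language Toolkit.

Here is a small example from the project's website.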

>>> import nltk
>>> sentence = """At eight o'clock on Thursday morning
... Arthur didn't feel very good."""
>>> tokens = nltk.word_tokenize(sentence)
>>> tokens
['At', 'eight', "o'clock", 'on', 'Thursday', 'morning',
'Arthur', 'did', "n't", 'feel', 'very', 'good', '.']
>>> tagged = nltk.pos_tag(tokens)
>>> tagged[0:6]
[('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'), ('on', 'IN'),
('Thursday', 'NNP'), ('morning', 'NN')]

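After tagging your sentence, extract only the word classes you want synonyms for, normalize each word to its present tense (base form), look up the synonym, convert the result back to the correct tense, and replace the original word.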
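The tag-then-replace idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not a finished implementation: the input is a list of (word, tag) pairs in the shape that nltk.pos_tag returns, and `TOY_THESAURUS`, `coarse_pos`, `lookup_synonym` and `thesaurize` are hypothetical names, with the toy dictionary standing in for a real thesaurus query. Words whose class is skipped, or for which the lookup finds nothing, are left unchanged, so a word with no synonym no longer has to be on the excluded list.

```python
# Hypothetical sketch: replace only words whose part of speech we care about.
# Input is a list of (word, tag) pairs in the shape nltk.pos_tag returns
# (Penn Treebank tags such as NN, VB, VBD, JJ).

# Toy stand-in for a real thesaurus, keyed by (word, coarse category).
TOY_THESAURUS = {
    ("grill", "verb"): "broil",
    ("grill", "noun"): "barbecue",
    ("good", "adjective"): "fine",
}


def coarse_pos(tag):
    """Map a Penn Treebank tag to a coarse category, or None to skip the word."""
    if tag.startswith("NN") and not tag.startswith("NNP"):
        return "noun"  # common nouns only; proper nouns are left alone
    if tag.startswith("VB"):
        return "verb"
    if tag.startswith("JJ"):
        return "adjective"
    return None


def lookup_synonym(word, category):
    """Return a synonym for the word in the given category, or None if unknown."""
    return TOY_THESAURUS.get((word.lower(), category))


def thesaurize(tagged):
    """Replace each replaceable word with a synonym of the same category."""
    result = []
    for word, tag in tagged:
        category = coarse_pos(tag)
        synonym = lookup_synonym(word, category) if category else None
        # Keep the original word when it is skipped or has no synonym.
        result.append(synonym if synonym is not None else word)
    return result


tagged = [("I", "PRP"), ("will", "MD"), ("grill", "VB"),
          ("the", "DT"), ("chicken", "NN")]
print(" ".join(thesaurize(tagged)))
```

Because the lookup is keyed on the word class as well as the word, "grill" tagged as a verb and "grill" tagged as a noun get different synonyms, which is the point of tagging first.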

Resolving a word's tense can also be done with NLP techniques, as can converting a word from its base form into a specific tense.
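For the normalization step, one option is a lemmatizer such as NLTK's WordNetLemmatizer. The reverse step, going from a base form back to a specific tense, could look something like the sketch below. It is a deliberately naive illustration: `inflect` is a hypothetical helper, it is driven by Penn Treebank verb tags, and it only covers regular verbs. Irregular verbs ("go" to "went") and consonant doubling ("stop" to "stopped") are not handled and would need a dictionary or a dedicated library.

```python
# Naive sketch: re-inflect a REGULAR English verb from its base form,
# guided by the Penn Treebank tag of the word it replaces.
# Irregular verbs and consonant doubling are NOT handled.

VOWELS = "aeiou"


def ends_in_consonant_y(base):
    """True for verbs like 'carry' (consonant + y), False for 'play'."""
    return len(base) > 1 and base.endswith("y") and base[-2] not in VOWELS


def inflect(base, tag):
    """Return the base verb inflected for the given tag (regular verbs only)."""
    if tag in ("VBD", "VBN"):  # past tense / past participle
        if base.endswith("e"):
            return base + "d"
        if ends_in_consonant_y(base):
            return base[:-1] + "ied"
        return base + "ed"
    if tag == "VBG":  # gerund / present participle
        if base.endswith("ie"):
            return base[:-2] + "ying"
        if base.endswith("e") and not base.endswith("ee"):
            return base[:-1] + "ing"
        return base + "ing"
    if tag == "VBZ":  # third person singular present
        if base.endswith(("s", "sh", "ch", "x", "z", "o")):
            return base + "es"
        if ends_in_consonant_y(base):
            return base[:-1] + "ies"
        return base + "s"
    return base  # VB, VBP and non-verb tags: leave the base form as is


# The original word was tagged VBD, so its synonym is put back into the past tense.
print(inflect("broil", "VBD"))
```

In a full pipeline the tag of the original word would be remembered before lemmatizing, and the same tag would then be passed to the inflection step for the synonym.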

Python program that replaces each word in a sentence with a synonym
