I'm trying to create an algorithm that converts a sentence into a question. Here is the code:
from difflib import SequenceMatcher

def sentence_to_question(arg):
    # Helping (auxiliary) verbs that can be moved to the front of the sentence
    hverbs = ["is", "have", "had", "was", "could", "would", "will", "do", "did", "should", "shall", "can", "are"]
    words = arg.split(" ")
    # Best match so far: (similarity in percent, helping verb, matched word)
    zen_sim = (0, "", "")
    for hverb in hverbs:
        for word in words:
            similarity = SequenceMatcher(None, word, hverb).ratio() * 100
            if similarity > zen_sim[0]:
                zen_sim = (similarity, hverb, word)
    if zen_sim[0] < 30:
        raise ValueError("unable to create question.")
    # Drop the matched word and lowercase the first letter of what remains
    words.remove(zen_sim[2])
    rest = " ".join(words)
    rest = rest[0].lower() + rest[1:]
    # Put the (correctly spelled) helping verb at the front
    return "{0} {1}?".format(zen_sim[1].capitalize(), rest)
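Explanation: There is a prepared list of helping verbs, and every word of the sentence is compared with each of them. The word with the highest similarity to a helping verb is chosen. The Ratcliff/Obershelp pattern recognition algorithm behind difflib.SequenceMatcher is used to measure the similarity of two strings. If the best similarity is below 30%, I assume it is very unlikely that the user merely misspelled a helping verb and far more likely that the sentence contains none, so no question can be formed. Finally, the chosen helping verb is placed at the beginning of the string. (I'm aware that the "algorithm" needs more optimisation.)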
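To make the 30% cutoff concrete, here is a minimal sketch of the comparison step on its own. The helper name `similarity` is only for illustration and is not part of the function above; it simply wraps the same `SequenceMatcher(...).ratio()` call.

```python
from difflib import SequenceMatcher

def similarity(word, hverb):
    """Illustrative helper: Ratcliff/Obershelp similarity as a percentage."""
    return SequenceMatcher(None, word, hverb).ratio() * 100

# "waz" and "was" share the block "wa": 2 * 2 matched chars / 6 total chars
typo_score = similarity("waz", "was")    # about 66.7, above the 30% cutoff

# "Euclid" and "is" share only the letter "i": 2 * 1 matched char / 8 total chars
weak_score = similarity("Euclid", "is")  # 25.0, below the cutoff

print(round(typo_score, 1), weak_score)
```

A small typo in a short verb still scores well above the threshold, while an unrelated word that happens to share one letter falls below it.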
Test examples:
>>> sentence_to_question("Euclid was a prominent mathematician")
'Was euclid a prominent mathematician?'
>>> sentence_to_question("Euclid waz a prominent mathematician") # does work with small typos
'Was euclid a prominent mathematician?'
>>> sentence_to_question("The company name, logo and slogan are vital elements of the house style of a company and important elements of corporate design")
'Are the company name, logo and slogan vital elements of the house style of a company and important elements of corporate design?'
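Now, we know that several questions can be generated from a single sentence, but at the moment my "algorithm" can only produce one. What is the most accurate way to generate questions like these:
What are the vital elements of the house style of a company and important elements of corporate design?
Who was Euclid?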
Note the difference between these two questions: the first opens with "What", while the second opens with "Who".
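I'm not asking for code (though I would appreciate examples): what is the most accurate way to generate multiple questions from a sentence?
Thanks!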