I have a simple Python program that checks whether a sentence is a question.
from nltk.tokenize import word_tokenize
from nltk.stem.wordnet import WordNetLemmatizer
a = ["what are you doing","are you mad?","how sad"]
question =["what","why","how","are","am","should","do","can","have","could","when","whose","shall","is","would","may","whoever","does"];
word_list =["i","me","he","she","you","it","that","this","many","someone","everybody","her","they","them","his","we","am","is","are","was","were","should","did","would","does","do"];
def f(paragraph):
    sentences = paragraph.split(".")
    result = []
    for i in range(len(sentences)):
        token = word_tokenize(sentences[i])
        change_tense = [WordNetLemmatizer().lemmatize(word, 'v') for word in token]
        input_sentences = [item.lower() for item in change_tense]
        if input_sentences[-1]=='?':
            result.append("question")
        elif input_sentences[0] in question:
            find_question = [input_sentences.index(qestion) for qestion in input_sentences if qestion in question]
            if len(find_question) > 0:
                for a in find_question:
                    if input_sentences[a + 1] in word_list:
                        result.append("question")
                    else:
                        result.append("not a question")
        else:
            result.append("not a question")
    return result
my_result = [f(paragraph) for paragraph in a]
print my_result
But it fails with the following error.
if input_sentences[a + 1] in word_list:
IndexError: list index out of range
I think the problem comes from looking up the element after index a. Can anyone help me solve this?
Answer 0 (score: 1)
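The problem is that input_sentences.index(qestion) can return the last index of input_sentences. In that case a + 1 equals len(input_sentences), one past the last valid index, so the line if input_sentences[a + 1] in word_list: raises an IndexError because it tries to access a list element that does not exist.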
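The off-by-one can be reproduced without NLTK. The snippet below is only a sketch: the token list is an assumption about what the lemmatizer produces for "what are you doing" (with "doing" reduced to "do"), not output captured from the original program.

```python
# Assumed tokens after lemmatizing "what are you doing" with pos='v';
# the exact lemmas are an assumption made for illustration.
tokens = ["what", "be", "you", "do"]

# "do" is in the question-word list, and it is the last token.
a = tokens.index("do")
print(a, len(tokens))  # 3 4

# a + 1 == len(tokens), so there is no element to look at.
try:
    tokens[a + 1]
    raised = False
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```

Any sentence whose last token is in the question list triggers the same failure.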
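Your logic for checking the "next element" is therefore incorrect: the last element of a list has no "next element". Looking at your word lists, a question such as What should I do will fail, because do is picked up as a question word but nothing comes after it (assuming you strip punctuation). You therefore need to rethink how you detect questions.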
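One possible way to avoid reading past the end is to pair each token with its successor, so the last token is never used as a starting point. This is a minimal sketch, not the answerer's method: looks_like_question, QUESTION_WORDS and FOLLOW_WORDS are hypothetical names, the word sets are shortened versions of the lists in the question, and tokens are assumed to be already lower-cased (no NLTK is used here).

```python
# Hypothetical, shortened versions of the word lists from the question.
QUESTION_WORDS = {"what", "why", "how", "are", "should", "do", "can", "is", "does"}
FOLLOW_WORDS = {"i", "you", "he", "she", "it", "we", "they",
                "are", "is", "should", "do", "does"}


def looks_like_question(tokens):
    """Return True if tokens end with '?' or a question word is
    directly followed by one of the follow words."""
    if not tokens:
        # An empty sentence (e.g. after a trailing '.') is not a question.
        return False
    if tokens[-1] == "?":
        return True
    if tokens[0] not in QUESTION_WORDS:
        return False
    # zip stops at the shorter sequence, so the final token is only ever
    # inspected as a "next" word and never indexed past.
    return any(cur in QUESTION_WORDS and nxt in FOLLOW_WORDS
               for cur, nxt in zip(tokens, tokens[1:]))


print(looks_like_question("what should i do".split()))
print(looks_like_question("how sad".split()))
```

With this shape, a trailing do in What should I do is simply skipped instead of raising an IndexError. Whether the heuristic itself is a good question detector is a separate matter.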