Getting this error: IndexError: list index out of range

Time: 2020-09-06 12:32:16

Tags: python nlp nltk similarity wordnet

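This program is meant to find similarities between a sentence and words, and how similar they are in terms of synonyms. I had already downloaded nltk when I first wrote the code, and it ran without any errors. A few days later, running the same program gives the error below.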

import nltk
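nltk.download('punkt')  # tokenizer data needed by word_tokenize (newer NLTK releases use 'punkt_tab')
nltk.download('stopwords')
nltk.download('wordnet')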
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.corpus import wordnet as wn

filtered_uploaded_sentences = []
uploaded_sentence_synset = []
database_word_synset = []

uploaded_doc_sentence=" The issue of text semantics, such as word semantics and sentence semantics has received increasing attentions in recent years. However, rare research focuses on the document-level semantic matching due to its complexity. Long documents usually have sophisticated structure and massive information, which causes hardship to measure their semantic similarity. The semantic similarity between words, sentences, texts, and documents is widely studied in various fields, including natural language processing, document semantic comparison, artificial intelligence, semantic web, and semantic search engines. "
database_word=["car","complete",'focus',"semantics"]

# use a separate name so the imported stopwords module is not overwritten
stop_words = stopwords.words('english')
uploaded_sentence_words_tokenized = word_tokenize(uploaded_doc_sentence)

#filtering the sentence and synset

for word in uploaded_sentence_words_tokenized:
    if word not in stop_words:
        filtered_uploaded_sentences.append(word)
print(filtered_uploaded_sentences)

for sentences_are in filtered_uploaded_sentences:
    uploaded_sentence_synset.append(wn.synsets(sentences_are))
    
print(uploaded_sentence_synset)

#for finding similarity in the words

for databasewords in database_word:
    database_word_synset.append(wn.synsets(databasewords)[0])
    
print(database_word_synset)

IndexError: list index out of range
This error appears when a long sentence is used for uploaded_doc_sentence rather than a short one.

check.append(wn.wup_similarity(data, sen[0]))

I want to compare the sentence with the words and store the results, like this:

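#the similarity main function for words

check = []  # collects one similarity score per (database word, sentence word) pair

for data in database_word_synset:
    for sen in uploaded_sentence_synset:
        # sen is the list of synsets for one token; it is empty for tokens WordNet does not know
        check.append(wn.wup_similarity(data, sen[0]))
print(check)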

2 Answers:

Answer 0 (score: 0)

The problem is that uploaded_sentence_synset contains empty lists. I'm not sure what you are trying to do, but change the last block of code to:

for data in database_word_synset:
    for sen in uploaded_sentence_synset:
        if sen:
            check.append(wn.wup_similarity(data, sen[0]))

Without the if check, you end up indexing the first element of an empty list, which raises an IndexError.
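
The failure can be reproduced without downloading WordNet. The sketch below is only an illustration: FAKE_SYNSETS, lookup and first_synsets are hypothetical names standing in for wn.synsets and the question's loop, and the table entries are made up rather than real WordNet output. Tokens such as punctuation map to an empty list, and that is what makes indexing with [0] fail.

```python
# Hypothetical stand-in for wn.synsets(): maps a token to a list of synset names.
# The entries are illustrative only, not real WordNet output.
FAKE_SYNSETS = {
    "issue": ["issue.n.01", "issue.n.02"],
    "text": ["text.n.01"],
}


def lookup(token):
    """Return the list of (fake) synsets for a token, or [] if none are known."""
    return FAKE_SYNSETS.get(token, [])


def first_synsets(tokens):
    """Collect the first synset of every token that has at least one."""
    firsts = []
    for token in tokens:
        synsets = lookup(token)
        if synsets:  # an empty list is falsy, so such tokens are skipped
            firsts.append(synsets[0])
    return firsts


tokens = ["issue", ",", "text", "."]

# Unguarded indexing fails on the first token with no synsets (the comma).
error_message = None
try:
    unguarded = [lookup(t)[0] for t in tokens]
except IndexError as exc:
    error_message = str(exc)

guarded = first_synsets(tokens)
print(error_message)
print(guarded)
```

Skipping empty lists means the results no longer line up one-to-one with the tokens. If that alignment matters, appending None for tokens without synsets is one alternative.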

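Answer 1 (score: 0)

Solved the problem by removing the empty [] lists from the list and converting the multidimensional list into a one-dimensional list.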

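# main_sen is the nested list of synsets (not defined in this answer;
# presumably uploaded_sentence_synset from the question)
list2 = [x for x in main_sen if x != []]
print(list2)
result = []
for t in list2:
    for x in t:
        result.append(x)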
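
The same filter-and-flatten step can be written more compactly with the standard library. This is a minimal sketch under the assumption that the input is a list of lists like the one built in the question; the name nested and its contents are illustrative placeholders, not real WordNet output.

```python
from itertools import chain

# Illustrative nested list shaped like the result of calling wn.synsets per token:
# one inner list per token, empty when the token has no synsets.
nested = [["issue.n.01", "issue.n.02"], [], ["text.n.01"], []]

# chain.from_iterable concatenates the inner lists; empty ones contribute nothing,
# so no separate filtering step is needed.
flat = list(chain.from_iterable(nested))
print(flat)
```

Flattening keeps every synset of each token, not only the first one, so a comparison loop over the flat list would handle individual synsets directly instead of indexing with [0].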
