IndexError: index -2 is out of bounds for axis 0 with size 1

Asked: 2021-01-29 11:45:09

Tags: python django scikit-learn nltk

I created a chatbot in Django using nltk and sklearn.

When a message is sent, I get this error:

C:\Users\xxx\django\project\env\lib\site-packages\sklearn\feature_extraction\text.py:388: UserWarning: Your stop_words may be inconsistent with your preprocessing. Tokenizing the stop words generated 
tokens ['ha', 'le', 'u', 'wa'] not in stop_words.
  warnings.warn('Your stop_words may be inconsistent with '
Internal Server Error: /addmsg
Traceback (most recent call last):
  File "C:\Users\xxx\django\project\env\lib\site-packages\django\core\handlers\exception.py", line 34, in inner
    response = get_response(request)
  File "C:\Users\xxx\django\project\env\lib\site-packages\django\core\handlers\base.py", line 115, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "C:\Users\xxx\django\project\env\lib\site-packages\django\core\handlers\base.py", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "C:\Users\xxx\django\project\home\views.py", line 19, in addmsg        
    bot_msg(c)
  File "C:\Users\xxx\django\project\home\views.py", line 23, in bot_msg       
    c = give_response(msg)
  File "C:\Users\xxx\django\project\home\nlp.py", line 69, in give_response   
    return find_response(input_given)
  File "C:\Users\xxx\django\project\home\nlp.py", line 42, in find_response   
    idx = values.argsort()[0][-2]
IndexError: index -2 is out of bounds for axis 0 with size 1

Here is the code that builds the response to a message.

I am not sure why the file somefile.txt is used, or what data should be stored in it.

import nltk
import numpy
import random
import string

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def find_response(response):
    # Read the chatbot's knowledge base. If this file is empty, the only
    # sentence left in sentence_tokens is the user's message, which is what
    # makes values.argsort()[0][-2] raise the IndexError below.
    chatbots_file = open(r'somefile.txt', 'r', errors='ignore')
    info_data = chatbots_file.read().lower()
    chatbots_file.close()

    nltk.download('punkt')
    nltk.download('wordnet')
    sentence_tokens = nltk.sent_tokenize(info_data)
    word_tokens = nltk.word_tokenize(info_data)

    bot_response = ''
    # Append the user's message so it can be compared against every sentence
    sentence_tokens.append(response)

    # 'normalize' is a tokenizer defined elsewhere in nlp.py (not shown here)
    TfidfVec = TfidfVectorizer(tokenizer=normalize, stop_words='english')
    tfidf = TfidfVec.fit_transform(sentence_tokens)

    # Similarity of the user's message (the last row) against all sentences
    values = cosine_similarity(tfidf[-1], tfidf)
    # Second-highest score, i.e. the best match other than the message itself.
    # This indexing fails when sentence_tokens contains only one entry.
    idx = values.argsort()[0][-2]
    flat = values.flatten()
    flat.sort()
    req_tfidf = flat[-2]

    if req_tfidf == 0:
        bot_response = bot_response + "I am sorry! I don't understand you"
    else:
        bot_response = bot_response + sentence_tokens[idx]
    return bot_response
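The snippet passes tokenizer=normalize to TfidfVectorizer, but normalize itself is not shown. The UserWarning about tokens ['ha', 'le', 'u', 'wa'] suggests it lemmatizes words (e.g. 'was' becomes 'wa'), so here is a minimal sketch of what such a tokenizer might look like, assuming WordNet lemmatization and punctuation stripping; the names and details are illustrative, not the asker's actual code:

import string
import nltk

# Hypothetical tokenizer: lowercase, strip punctuation, lemmatize each token.
lemmer = nltk.stem.WordNetLemmatizer()
remove_punct = dict((ord(p), None) for p in string.punctuation)

def normalize(text):
    tokens = nltk.word_tokenize(text.lower().translate(remove_punct))
    return [lemmer.lemmatize(token) for token in tokens]

Lemmatizing stop words like 'was' and 'has' is also exactly what triggers the stop_words inconsistency warning in the log above.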

1 answer:

Answer 0: (score: 1)

Hey, I solved my problem by adding text to the file.

Make sure somefile.txt is not empty.

Add some text to the file.
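Besides populating somefile.txt, a small guard stops the crash when the corpus happens to be empty. This is a sketch of such a variant (find_response_safe, the corpus_path parameter, and the early return are my additions, not the answerer's code; the custom tokenizer is omitted for brevity):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import nltk

def find_response_safe(response, corpus_path='somefile.txt'):
    # Hypothetical variant of find_response with an empty-corpus guard.
    # Assumes the 'punkt' tokenizer data has already been downloaded.
    with open(corpus_path, 'r', errors='ignore') as f:
        info_data = f.read().lower()

    sentence_tokens = nltk.sent_tokenize(info_data)
    if not sentence_tokens:
        # The file produced no sentences: bail out early instead of letting
        # values.argsort()[0][-2] raise an IndexError.
        return "I am sorry! I don't understand you"

    sentence_tokens.append(response)
    vec = TfidfVectorizer(stop_words='english')
    tfidf = vec.fit_transform(sentence_tokens)

    values = cosine_similarity(tfidf[-1], tfidf)
    idx = values.argsort()[0][-2]   # safe now: at least two sentences exist
    flat = values.flatten()
    flat.sort()
    if flat[-2] == 0:
        return "I am sorry! I don't understand you"
    return sentence_tokens[idx]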