Python error message when loading a pickle file

Time: 2020-08-26 17:34:48

Tags: python nlp

I have the following code:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.utils import shuffle
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from nltk.tokenize import sent_tokenize, word_tokenize
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score
import sklearn.metrics as metrics
import pickle

#%matplotlib inline
import warnings
warnings.filterwarnings('ignore')

stemmer = PorterStemmer()
words = stopwords.words("english")

from sklearn.feature_extraction.text import TfidfVectorizer
vectorizer_tfidf = TfidfVectorizer(stop_words='english', max_df=0.7)

# call and load pickle here
content = pickle.load(open("vectorizer.pk",'rb'))

vectorizer_tfidf = [vectorizer_tfidf]
test_tfIdf = vectorizer_tfidf.transform('processedtext')
test_tfIdf2 = vectorizer_tfidf.transform('processedtext2')

testdata = pd.read_csv('C:\\Users\\joyce\\Desktop\\CR_Summary 08052020.csv', delimiter = ',')
content = pickle.load(open("Pickle_RL_Model.pkl",'rb'))
 
##print (content)    
testdata=testdata.fillna(value='test')

#Array to return prediction
content.predict(testdata)

Error message:

File "C:/Users/joyce/nltk CR data v3.py", line 42, in <module>
    test_tfIdf = vectorizer_tfidf.transform('processedtext')
AttributeError: 'list' object has no attribute 'transform'

How can I fix this error?

2 answers:

Answer 0 (score: 1)

The error means that transform cannot be called on a list.
In your case, vectorizer_tfidf is a list, not a vectorizer, which is why the error appears.

vectorizer_tfidf = [vectorizer_tfidf]

This line rebinds the name to a one-element list containing the vectorizer.
Try removing it.
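
To see why the list wrapper breaks the call, here is a minimal sketch using only the standard library. ToyVectorizer is a hypothetical stand-in for the real vectorizer, not an sklearn class; it only illustrates that a list does not forward attribute lookups to the object it contains.

```python
class ToyVectorizer:
    """Hypothetical stand-in for a fitted vectorizer (not sklearn)."""

    def transform(self, docs):
        # Return one word count per document, just to have some output.
        return [len(doc.split()) for doc in docs]


vectorizer = ToyVectorizer()
wrapped = [vectorizer]  # same mistake as `vectorizer_tfidf = [vectorizer_tfidf]`

print(hasattr(vectorizer, "transform"))  # True
print(hasattr(wrapped, "transform"))     # False

try:
    wrapped.transform(["some text"])
except AttributeError as exc:
    message = str(exc)
    print(message)

# The object inside the list is untouched, so indexing still works,
# but simply not creating the list is the cleaner fix.
result = wrapped[0].transform(["some text"])
```

Removing the wrapping line leaves vectorizer_tfidf bound to the vectorizer itself, so .transform resolves normally.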

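Answer 1 (score: 1)

See the Python docs: list objects have no transform() method, so calling it on a list is not supported. See the sklearn docs for more on correct usage.

At the very least, you can remove this line: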

vectorizer_tfidf = [vectorizer_tfidf]
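
Beyond removing that line, two other things in the question's code may also need attention: the object loaded from vectorizer.pk is never used, and transform is given a bare string rather than a collection of documents. The sketch below shows one way it could look. It assumes that vectorizer.pk contains a fitted TfidfVectorizer, which the question does not confirm. The training corpus and the in-memory pickle are made up for illustration so that the example runs on its own.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical training corpus; the asker's real training data is not shown.
train_docs = [
    "the server crashed after the update",
    "update installed without any error",
    "user reported a login error",
]

# Fit and pickle in memory. This stands in for whatever script
# originally wrote vectorizer.pk (an assumption).
fitted = TfidfVectorizer(stop_words="english").fit(train_docs)
blob = pickle.dumps(fitted)

# Use the unpickled object itself as the vectorizer, instead of a
# freshly constructed (and therefore unfitted) TfidfVectorizer.
vectorizer_tfidf = pickle.loads(blob)

# transform expects an iterable of documents, so pass a list of strings.
new_docs = ["login error after update"]
test_tfidf = vectorizer_tfidf.transform(new_docs)

print(test_tfidf.shape)  # (number of documents, vocabulary size)
```

With a real file, pickle.loads(blob) would become pickle.load on the opened vectorizer.pk. The resulting matrix, not the raw DataFrame, is what a model trained on TF-IDF features would normally expect in predict.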