Number of features of the model must match the input (Python)

Time: 2019-05-20 04:38:28

Tags: python text-classification

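I built a text classifier model by following this tutorial: https://stackabuse.com/text-classification-with-python-and-scikit-learn/ I then tried to load the saved model and use it on my own data (not the tutorial's dataset). However, prediction fails with an error saying the number of features does not match. Here is my code: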

import pickle 
with open('text_classifier', 'rb') as training_model:  
    model = pickle.load(training_model)

f = open(r".\descriptions_dataset\image\0.txt", "r")

test = f.read()
print(test)
f.close()

import re
from nltk.stem import WordNetLemmatizer
stemmer = WordNetLemmatizer()
document = re.sub(r'\W', ' ', str(test))
document = re.sub(r'\s+[a-zA-Z]\s+', ' ', document)
document = re.sub(r'\^[a-zA-Z]\s+', ' ', document) 
document = re.sub(r'\s+', ' ', document, flags=re.I)
document = re.sub(r'^b\s+', '', document)
document = document.lower()
document = document.split()
document = [stemmer.lemmatize(word) for word in document]
print(document)

import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords

from sklearn.feature_extraction.text import CountVectorizer  
vectorizer = CountVectorizer(max_features=1500, min_df=1, max_df=0.7, stop_words=stopwords.words('english'))  
X = vectorizer.fit_transform(document).toarray()

from sklearn.feature_extraction.text import TfidfTransformer  
tfidfconverter = TfidfTransformer()  
X = tfidfconverter.fit_transform(X).toarray()  

y_pred = model.predict(X)

ValueError                                Traceback (most recent call last)
----> 1 y_pred = model.predict(X)

~\Anaconda3\lib\site-packages\sklearn\ensemble\forest.py in predict(self, X)
    541             The predicted classes.
    542         """
--> 543         proba = self.predict_proba(X)
    544
    545         if self.n_outputs_ == 1:

~\Anaconda3\lib\site-packages\sklearn\ensemble\forest.py in predict_proba(self, X)
    581         check_is_fitted(self, 'estimators_')
    582         # Check data
--> 583         X = self._validate_X_predict(X)
    584
    585         # Assign chunk of trees to jobs

~\Anaconda3\lib\site-packages\sklearn\ensemble\forest.py in _validate_X_predict(self, X)
    360                                  "call `fit` before exploiting the model.")
    361
--> 362         return self.estimators_[0]._validate_X_predict(X, check_input=True)
    363
    364     @property

~\Anaconda3\lib\site-packages\sklearn\tree\tree.py in _validate_X_predict(self, X, check_input)
    386                              "match the input. Model n_features is %s and "
    387                              "input n_features is %s "
--> 388                              % (self.n_features_, n_features))
    389
    390         return X

ValueError: Number of features of the model must match the input. Model n_features is 1500 and input n_features is 86
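A note on what the error seems to point at: the code above creates a new CountVectorizer and calls fit_transform on the new text, so the vocabulary (and therefore the number of columns) is learned from that text alone, not from the training data. It also passes a list of words, so each word is treated as a separate document. The sketch below is only an illustration under assumptions: the toy corpus, labels, and variable names are hypothetical, and it presumes the vectorizer and TF-IDF transformer fitted at training time are still available (for example, pickled alongside the model), which the question does not confirm.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy corpus standing in for the tutorial's training data.
train_docs = [
    "the movie was great and the acting was wonderful",
    "a wonderful film with a great story",
    "the plot was boring and the acting was terrible",
    "a terrible film with a boring story",
]
train_labels = [1, 1, 0, 0]

# Fit the vectorizer and the TF-IDF transformer once, on the training data only.
vectorizer = CountVectorizer()
tfidf = TfidfTransformer()
X_train = tfidf.fit_transform(vectorizer.fit_transform(train_docs))

model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(X_train, train_labels)

# New text is passed as ONE document (a single string inside a list).
new_doc = "great story and wonderful acting"

# Refitting on the new text learns a different vocabulary,
# so the number of columns no longer matches the model.
X_refit = CountVectorizer().fit_transform([new_doc])

# Reusing the already fitted objects with transform() keeps the training columns.
X_new = tfidf.transform(vectorizer.transform([new_doc]))

print(X_train.shape[1], X_refit.shape[1], X_new.shape[1])
y_pred = model.predict(X_new)
print(y_pred)
```

If this reading is right, the same idea would apply to the code in the question: save the fitted vectorizer and transformer together with the model, load all three, join the lemmatized words back into one string with ' '.join(document), and call transform rather than fit_transform before model.predict.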

0 answers:

No answers