Flask model deployment: "module '__main__' has no attribute 'clean_transformer'" error?

Time: 2020-08-08 19:43:39

Tags: python flask deployment scikit-learn web-applications

I am trying to deploy an sklearn text classification pipeline with Flask. Whenever I try to load the pickled model, I get the error mentioned in the title. Here is my file structure:

webapp/
    ├── model/
    │   └── book_classifier.pkl
    ├── templates/
    │   └── main.html
    ├── app.py
    └── preprocessing.py

book_classifier.pkl is the pickled version of the following pipeline, fitted on my data:

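import joblib
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline

# clean_transformer and tfidf_vector are the preprocessing objects shown in
# this question; X and y are the training texts and labels.
classifier = KNeighborsClassifier()

pipe = Pipeline([('clean_transformer', clean_transformer()),
                 ('vectorizer', tfidf_vector),
                 ('classifier', classifier)])

fitted_pipe = pipe.fit(X, y)

joblib.dump(fitted_pipe, 'book_classifier.pkl', compress=1)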

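Below is the code for preprocessing.py, which contains the necessary text preprocessing steps (i.e. tokenization, plus the tfidf_vector and clean_transformer used in the pipeline):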

import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.base import TransformerMixin
from sklearn.pipeline import Pipeline

import string

from spacy.lang.en import English
from spacy.lang.en.stop_words import STOP_WORDS

parser = English()
stop_words = STOP_WORDS
punctuations = string.punctuation


# tokenizer
def spacy_tokenizer(sentence):
    mytokens = parser(sentence)
    mytokens = [ word.lemma_.lower().strip() if word.lemma_ != "-PRON-" else word.lower_ for word in mytokens ]
    mytokens = [ word for word in mytokens if word not in stop_words and word not in punctuations ]

    return mytokens


# vectorizers
bow_vector = CountVectorizer(tokenizer = spacy_tokenizer, ngram_range=(1,1))

tfidf_vector = TfidfVectorizer(tokenizer = spacy_tokenizer)


# transformer
def clean_text(text):
    return text.strip().lower()

class clean_transformer(TransformerMixin):
    def transform(self, X, **transform_params):
        return [clean_text(text) for text in X]

    def fit(self, X, y=None, **fit_params):
        return self

    def get_params(self, deep=True):
        return {}
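
To make the input contract of clean_transformer concrete: transform iterates over X, so it expects an iterable of documents rather than a single string. The sketch below is a stand-in, not the code from preprocessing.py. It copies the two methods into a plain class (the name CleanTransformerSketch is made up for illustration) so that it runs without scikit-learn installed.

```python
def clean_text(text):
    # Same normalisation as in preprocessing.py: trim whitespace, lowercase.
    return text.strip().lower()


class CleanTransformerSketch:
    """Hypothetical stand-in for clean_transformer, without TransformerMixin."""

    def fit(self, X, y=None, **fit_params):
        # Nothing to learn; fitting is a no-op.
        return self

    def transform(self, X, **transform_params):
        # One cleaned string per element of X.
        return [clean_text(text) for text in X]


cleaner = CleanTransformerSketch().fit([])

# A list of documents yields one cleaned string per document.
as_list = cleaner.transform(["  A Tale of Two Cities  "])

# A bare string is iterated character by character.
as_string = cleaner.transform("Ab ")

print(as_list)    # ['a tale of two cities']
print(as_string)  # ['a', 'b', '']
```

With a first pipeline step that behaves like this, a single document would presumably need to be wrapped in a one-element list to be treated as one sample.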

Finally, here is the code for app.py:

import flask
import joblib
import pandas as pd
from preprocessing import *

model = joblib.load(open('model/book_classifier.pkl', 'rb'))

app = flask.Flask(__name__, template_folder='templates')

@app.route('/', methods=['GET', 'POST'])
def main():
    if flask.request.method == 'GET':
        return(flask.render_template('main.html'))

    if flask.request.method == 'POST':
        title = flask.request.form['title']
        booktext = flask.request.form['booktext']
        prediction = model.predict(booktext)
        return flask.render_template('main.html', original_input={'Book Title':title}, result=prediction), print(prediction)

if __name__ == '__main__':
    app.run()

As mentioned, the error is raised by the following line:

model = joblib.load(open('model/book_classifier.pkl', 'rb'))

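I am new to Flask deployment and I am not sure what is going on here. I am having trouble interpreting the error message. Note that the problem persists whether I import from preprocessing.py or put the code directly in app.py. Any help would be greatly appreciated.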
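
For context on the message itself: pickle (which joblib builds on) stores an instance's class by reference, that is, as a module name plus a class name, and looks that name up again at load time. The sketch below is only an illustration under stated assumptions. It uses a made-up module name, train_script, as a stand-in for whatever module defined the class when the model was dumped, and a plain class instead of the real transformer, so it runs with the standard library alone.

```python
import pickle
import sys
import types

# Hypothetical module standing in for the script that trained the model.
train_mod = types.ModuleType("train_script")
sys.modules["train_script"] = train_mod


class clean_transformer:
    def transform(self, X):
        return [text.strip().lower() for text in X]


# Pretend the class was defined inside train_script.
clean_transformer.__module__ = "train_script"
train_mod.clean_transformer = clean_transformer

# The pickle records "train_script" + "clean_transformer", not the class body.
payload = pickle.dumps(clean_transformer())
recorded_by_name = b"train_script" in payload and b"clean_transformer" in payload

# Simulate a loading process whose train_script module lacks the class.
del train_mod.clean_transformer
try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = exc

# Once the class is reachable under the recorded name, loading works again.
train_mod.clean_transformer = clean_transformer
restored = pickle.loads(payload)

print(type(load_error).__name__)
print(restored.transform(["  Hello  "]))
```

If the pipeline was dumped from a script that was run directly, the recorded module name would be __main__, which would match the error in the title. That is an assumption about how the model was trained, since the question does not say. Under that assumption, a commonly suggested remedy is to recreate the pickle from a script that imports clean_transformer from preprocessing, so that the recorded module name is one the Flask app can also import.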

0 answers:

No answers yet