How do I integrate a trained scikit-learn Naive Bayes model into a Python Flask prediction web application?

Time: 2016-07-26 15:54:24

Tags: python flask scikit-learn prediction naivebayes

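I am trying to build a prediction tool using scikit-learn's Naive Bayes classifier and the Python Flask microframework. From what I found on Google, I can pickle the model and then unpickle it when the application loads in the browser, but how do I actually do that?

My app should receive input values from the user, pass those values to the model, and then display the predicted values to the user (as a d3 chart, so the predictions need to be converted to JSON).

This is what I have tried so far: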

Pickling the model:

from sklearn.naive_bayes import GaussianNB
import numpy as np
import csv

def loadCsv(filename):
   lines = csv.reader(open(filename,"rb"))
   dataset = list(lines)
   for i in range(len(dataset)):
      dataset[i] = [float(x) for x in dataset[i]]
   return dataset

datasetX = loadCsv("pollutants.csv")
datasetY = loadCsv("acute_bronchitis.csv")

X = np.array(datasetX)
Y = np.array(datasetY).ravel()

model = GaussianNB()
model.fit(X,Y)

#import pickle
from sklearn.externals import joblib
joblib.dump(model,'acute_bronchitis.pkl')

HTML form that collects the user input:

<form class = "prediction-options" method = "post" action = "/prediction/results">
  <input type = "range" class = "prediction-option" name = "aqi" min = 0 max = 100 value = 0></input>
  <label class = "prediction-option-label">AQI</label>
  <input type = "range" class = "prediction-option" name = "pm2_5" min = 0 max = 100 value = 0></input>
  <label class = "prediction-option-label">PM2.5</label>
  <input type = "range" class = "prediction-option" name = "pm10" min = 0 max = 100 value = 0></input>
  <label class = "prediction-option-label">PM10</label>
  <input type = "range" class = "prediction-option" name = "so2" min = 0 max = 100 value = 0></input>
  <label class = "prediction-option-label">SO2</label>
  <input type = "range" class = "prediction-option" name = "no2" min = 0 max = 100 value = 0></input>
  <label class = "prediction-option-label">NO2</label>
  <input type = "range" class = "prediction-option" name = "co" min = 0 max = 100 value = 0></input>
  <label class = "prediction-option-label">CO</label>
  <input type = "range" class = "prediction-option" name = "o3" min = 0 max = 100 value = 0></input>
  <label class = "prediction-option-label">O3</label>
  <input type = "submit" class = "submit-prediction-options" value = "Get Patient Estimates" />
</form>

Python Flask app.py:

from flask import Flask, render_template, request
import json
from sklearn.naive_bayes import GaussianNB
import numpy as np
import pickle as pkl
from sklearn.externals import joblib

model_acute_bronchitis = pkl.load(open('data/acute_bronchitis.pkl'))

@app.route("/prediction/results", methods = ['POST'])
def predict():
    input_aqi = request.form['aqi']
    input_pm2_5 = request.form['pm2_5']
    input_pm10 = request.form['pm10']
    input_so2 = request.form['so2']
    input_no2 = request.form['no2']
    input_co = request.form['co']
    input_o3 = request.form['o3']
    input_list = [[input_aqi,input_pm2_5,input_pm10,input_so2,input_no2,input_co,input_o3]]
    output_acute_bronchitis = model_acute_bronchitis.predict(input_list)
    prediction = json.dumps(output_acute_bronchitis)
    return prediction

However, this raises an error. From what I found, it may be because I used scikit-learn's joblib to pickle the model.

So I tried loading the model in Flask with joblib's load function instead, and got this error message:

TypeError: 'NDArrayWrapper' object does not support indexing

What am I doing wrong? Is there a simpler alternative for achieving what I want?

1 Answer:

Answer 0 (score: 2)

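I think the problem with your code is that the form data is read as strings. For example, in input_aqi = request.form['aqi'], input_aqi holds a string. So in output_acute_bronchitis = model_acute_bronchitis.predict(input_list), you end up passing predict an array of strings, which is why you see this error. You can fix this by simply converting all of the form inputs to floats, like this: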

input_aqi = float(request.form['aqi'])

You have to do this for every form input that goes into input_list.

Hope that helps.
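To avoid repeating the conversion seven times, the same idea could be written as a small helper. This is only a sketch: form_to_features and FEATURE_NAMES are hypothetical names not in the original code, the field names are copied from the HTML form in the question, and it assumes the columns of pollutants.csv are in that same order. A plain dict stands in for request.form so the snippet runs without Flask.

```python
# Hypothetical helper, not part of the original post: converts the string
# values that Flask exposes in request.form into floats, in a fixed order.

# Field names copied from the HTML form in the question. Assumption: this is
# also the column order the model was trained on.
FEATURE_NAMES = ["aqi", "pm2_5", "pm10", "so2", "no2", "co", "o3"]


def form_to_features(form, names=FEATURE_NAMES):
    """Return a single-row 2D list of floats built from the form values."""
    row = [float(form[name]) for name in names]
    # predict() expects a 2D structure: one inner list per sample.
    return [row]


# Stand-in for request.form: browsers submit every value as a string.
example_form = {
    "aqi": "42", "pm2_5": "17.5", "pm10": "30", "so2": "4",
    "no2": "12", "co": "1", "o3": "25",
}

features = form_to_features(example_form)
print(features)
```

Inside the route, input_list = form_to_features(request.form) would then replace the seven separate assignments. Note also that predict returns a NumPy array, which json.dumps cannot serialize directly; calling .tolist() on the result first is one common way around that.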