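I am building a prediction (machine learning) web application with Django. I use a linear regression model because I have to predict a quantitative variable, "sales". One of the input features is a categorical (dummy) variable, so I encode it with handle_non_numerical_data(). The problem is that when users enter a value for that feature in the application, they must be able to type the category name, not its numeric code. Is there a way to do this? The error is: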
ValueError at /
could not convert string to float: 'cafe'
Request Method: POST
Request URL: http://127.0.0.1:8000/
Django Version: 2.2
Exception Type: ValueError
Exception Value:
could not convert string to float: 'cafe'
Exception Location: C:\Users\hp\AppData\Local\Programs\Python\Python36\dj\f\lib\site-packages\sklearn\utils\validation.py in check_array, line 448
Python Executable: C:\Users\hp\AppData\Local\Programs\Python\Python36\dj\f\Scripts\python.exe
Python Version: 3.6.5
Python Path:
['C:\\Users\\hp\\AppData\\Local\\Programs\\Python\\Python36\\dj\\appweb pred',
'C:\\Users\\hp\\AppData\\Local\\Programs\\Python\\Python36\\dj\\f\\Scripts\\python36.zip',
'C:\\Users\\hp\\AppData\\Local\\Programs\\Python\\Python36\\dj\\f\\DLLs',
'C:\\Users\\hp\\AppData\\Local\\Programs\\Python\\Python36\\dj\\f\\lib',
'C:\\Users\\hp\\AppData\\Local\\Programs\\Python\\Python36\\dj\\f\\Scripts',
'c:\\users\\hp\\appdata\\local\\programs\\python\\python36\\Lib',
'c:\\users\\hp\\appdata\\local\\programs\\python\\python36\\DLLs',
'C:\\Users\\hp\\AppData\\Local\\Programs\\Python\\Python36\\dj\\f',
'C:\\Users\\hp\\AppData\\Local\\Programs\\Python\\Python36\\dj\\f\\lib\\site-packages']
Server time: Fri, 26 Apr 2019 10:20:37 +0000
Traceback
C:\Users\hp\AppData\Local\Programs\Python\Python36\dj\f\lib\site-packages\django\core\handlers\exception.py in inner
response = get_response(request) …
C:\Users\hp\AppData\Local\Programs\Python\Python36\dj\f\lib\site-packages\django\core\handlers\base.py in _get_response
response = self.process_exception_by_middleware(e, request) …
C:\Users\hp\AppData\Local\Programs\Python\Python36\dj\f\lib\site-packages\django\core\handlers\base.py in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs) …
C:\Users\hp\AppData\Local\Programs\Python\Python36\dj\appweb pred\predict\views.py in product_describe_view
predicted_sales = operate_function(product_detail) …
C:\Users\hp\AppData\Local\Programs\Python\Python36\dj\appweb pred\predict\views.py in operate_function
vente_2015, vente_2016, vente_2017 …
C:\Users\hp\AppData\Local\Programs\Python\Python36\dj\appweb pred\ml_code\ml_process\server_predictor.py in get_prediction
vente_2014, ventes_2015, ventes_2016, ventes_2017 …
C:\Users\hp\AppData\Local\Programs\Python\Python36\dj\f\lib\site-packages\sklearn\linear_model\base.py in predict
return self._decision_function(X) …
C:\Users\hp\AppData\Local\Programs\Python\Python36\dj\f\lib\site-packages\sklearn\linear_model\base.py in _decision_function
X = check_array(X, accept_sparse=['csr', 'csc', 'coo']) …
C:\Users\hp\AppData\Local\Programs\Python\Python36\dj\f\lib\site-packages\sklearn\utils\validation.py in check_array
array = array.astype(np.float64) …
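The last frame of the traceback can be reproduced outside Django. The following is a minimal sketch, assuming the category reaches the model as the raw string the user typed; the row values are hypothetical and only illustrate the failing conversion.

```python
import numpy as np

# Hypothetical input row: the category is still the string typed by the user.
row = np.array([["cafe", 120.0, 135.0]], dtype=object)

try:
    # check_array performs this same cast to float64 before predicting.
    row.astype(np.float64)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The cast fails on the first non-numeric cell, which is why the message names the category string rather than the model.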
My model is:
# Libraries
import numpy as np
import pandas as pd
import pickle
from matplotlib import pyplot as plt
from sklearn import metrics
from sklearn import model_selection
#from sklearn import preprocessing
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
#from sklearn.linear_model import Ridge
from sklearn.externals import joblib
# Importing Dataset
data = pd.read_csv('ml_code/ml_process/test.csv')
data.fillna(0, inplace=True)
def handle_non_numerical_data(df):
    # Replace every non-numeric column with integer codes, one code per distinct value.
    columns = df.columns.values
    for column in columns:
        text_digit_vals = {}
        def convert_to_int(val):
            return text_digit_vals[val]
        if df[column].dtype != np.int64 and df[column].dtype != np.float64:
            column_contents = df[column].values.tolist()
            unique_elements = set(column_contents)
            x = 0
            # Codes follow set iteration order, so they are not guaranteed
            # to be the same from one run to the next.
            for unique in unique_elements:
                if unique not in text_digit_vals:
                    text_digit_vals[unique] = x
                    x = x + 1
            df[column] = list(map(convert_to_int, df[column]))
    return df
data = handle_non_numerical_data(data)
data = data.values  # as_matrix() is deprecated; .values returns the same NumPy array
# X: matrix of explanatory variables
X = data[:, 0:9]
# y: vector of the variable to predict
y = data[:, 9]
X2_train, X2_test, y2_train, y2_test = train_test_split(X, y, test_size=0.3, random_state=0)
lreg = LinearRegression()
lreg.fit(X2_train, y2_train)
# score() returns the R^2 coefficient for a regressor, not a classification accuracy
print('R^2 of linear regression on training set: {:.2f}'.format(lreg.score(X2_train, y2_train)))
print('R^2 of linear regression on test set: {:.2f}'.format(lreg.score(X2_test, y2_test)))
y_pred2 = lreg.predict(X2_test)
print("Predicted Sales: %.3f" % (y_pred2[0]))
#fig,ax = plt.subplots()
#ax.scatter(data2['quarter'], data2['sales'], marker='*', s=300, c='#050505')
# Serializing the linear regression model
linear_regression_model = pickle.dumps(lreg)
# Saving the model to a file
#with open('ml_code/linear_regression_model.pkl','wb') as f:
joblib.dump(linear_regression_model, 'ml_code/linear_regression_model.pkl')
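
One possible direction, sketched under assumptions: put the encoding inside the saved model so that it accepts the category name directly. The sketch below uses scikit-learn's ColumnTransformer and OneHotEncoder (available from scikit-learn 0.20; the environment shown above may be older) in place of handle_non_numerical_data(). The column names and toy values are hypothetical and are not taken from test.csv.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy training data; column names and values are made up for illustration.
train = pd.DataFrame({
    "category": ["cafe", "the", "cafe", "jus", "the", "jus"],
    "vente_2016": [100.0, 80.0, 110.0, 60.0, 85.0, 65.0],
    "vente_2017": [120.0, 90.0, 125.0, 70.0, 95.0, 72.0],
    "sales": [130.0, 95.0, 140.0, 75.0, 99.0, 78.0],
})
feature_columns = ["category", "vente_2016", "vente_2017"]

# One-hot encode the text column; pass the numeric columns through unchanged.
preprocess = ColumnTransformer(
    transformers=[("cat", OneHotEncoder(handle_unknown="ignore"), ["category"])],
    remainder="passthrough",
)
model = Pipeline([("preprocess", preprocess), ("regress", LinearRegression())])
model.fit(train[feature_columns], train["sales"])

# At prediction time the row can carry the name exactly as the user typed it.
user_row = pd.DataFrame(
    [{"category": "cafe", "vente_2016": 105.0, "vente_2017": 122.0}]
)
predicted_sales = float(model.predict(user_row[feature_columns])[0])
print("Predicted sales: %.3f" % predicted_sales)
```

Because the encoder is fitted together with the regression, saving `model` with joblib.dump would keep the name-to-column mapping alongside the coefficients. With `handle_unknown="ignore"`, a name never seen in training is encoded as all zeros instead of raising an error.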