New error in previously working Python machine learning code

Asked: 2019-01-30 22:54:59

Tags: python validation parallel-processing

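This machine learning code used to work, but now it raises an error every time I run it.

The code runs line by line until it reaches:

cv_results = model_selection.cross_val_score(model, X_train, Y_train, cv=kfold, scoring=scoring)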

import pandas
from pandas.plotting import scatter_matrix
import matplotlib.pyplot as plt
from sklearn import model_selection
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Load dataset (contains floats and one boolean)
url = "\\File\\Path.csv"
names = ['Headers', 'Here', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'T/F']
dataset = pandas.read_csv(url, names=names)

# Split-out validation dataset
array = dataset.values
X = array[:,0:12]
Y = array[:,12]
validation_size = 0.10
seed = 7
X_train, X_validation, Y_train, Y_validation = model_selection.train_test_split(X, Y, test_size=validation_size, random_state=seed)

# Test options and evaluation metric
seed = 7
scoring = 'accuracy'

# Spot check algorithms
models = []
models.append(('LR', LogisticRegression()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('NB', GaussianNB()))
models.append(('SVM', SVC()))

# evaluate each model in turn
results = []
names = []
for name, model in models:
    kfold = model_selection.KFold(n_splits=10, random_state=seed)
    cv_results = model_selection.cross_val_score(model, X_train, Y_train, cv=kfold, scoring=scoring)
    results.append(cv_results)
    names.append(name)
    msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)

# Compare Algorithms
fig = plt.figure()
fig.suptitle('Algorithm Comparison')
ax = fig.add_subplot(111)
plt.boxplot(results)
ax.set_xticklabels(names)
plt.show()

Here is the error:

Warning (from warnings module):
  File "C:\Python\Python37-32\lib\site-packages\sklearn\linear_model\logistic.py", line 433
    FutureWarning)
FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.

Warning (from warnings module):
  File "C:\Python\Python37-32\lib\site-packages\sklearn\model_selection\_validation.py", line 542
    FutureWarning)
FutureWarning: From version 0.22, errors during fit will result in a cross validation score of NaN by default. Use error_score='raise' if you want an exception raised or error_score=np.nan to adopt the behavior from version 0.22.
Traceback (most recent call last):
  File "C:/Users/kgrey/Desktop/test.py", line 46, in <module>
    cv_results = model_selection.cross_val_score(model, X_train, Y_train, cv=kfold, scoring=scoring)
  File "C:\Python\Python37-32\lib\site-packages\sklearn\model_selection\_validation.py", line 402, in cross_val_score
    error_score=error_score)
  File "C:\Python\Python37-32\lib\site-packages\sklearn\model_selection\_validation.py", line 240, in cross_validate
    for train, test in cv.split(X, y, groups))
  File "C:\Python\Python37-32\lib\site-packages\sklearn\externals\joblib\parallel.py", line 917, in __call__
    if self.dispatch_one_batch(iterator):
  File "C:\Python\Python37-32\lib\site-packages\sklearn\externals\joblib\parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "C:\Python\Python37-32\lib\site-packages\sklearn\externals\joblib\parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "C:\Python\Python37-32\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "C:\Python\Python37-32\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "C:\Python\Python37-32\lib\site-packages\sklearn\externals\joblib\parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "C:\Python\Python37-32\lib\site-packages\sklearn\externals\joblib\parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "C:\Python\Python37-32\lib\site-packages\sklearn\model_selection\_validation.py", line 528, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "C:\Python\Python37-32\lib\site-packages\sklearn\linear_model\logistic.py", line 1289, in fit
    check_classification_targets(y)
  File "C:\Python\Python37-32\lib\site-packages\sklearn\utils\multiclass.py", line 171, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y_type)
ValueError: Unknown label type: 'unknown'

The code runs fine until it reaches:

cv_results = model_selection.cross_val_score(model, X_train, Y_train, cv=kfold, scoring=scoring)

1 Answer:

Answer 0 (score: 1)

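Add something like the following after setting the Y_train and Y_validation variables:

Y_train = Y_train.astype('int')
Y_validation = Y_validation.astype('int')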

When you read in the Y variable, it is stored with dtype object, so sklearn does not know how to handle it (hence the ValueError("Unknown label type: %r" % y_type)). Changing Y_train and Y_validation to a float or int type should fix the error.
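
To illustrate the point, here is a minimal sketch of the dtype problem and the cast. The small array is a made-up stand-in for dataset.values (the real CSV is not shown in the question); it assumes, as the question's comment says, a file of floats plus one boolean column, which is what would make pandas hand back an object array. The names y_dtype_before, target_before, Y_fixed and X_fixed are introduced here only for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-in for dataset.values: float columns mixed with a
# boolean column end up in a single array with dtype=object.
array = np.array(
    [[0.1, 1.2, True],
     [0.4, 0.9, False],
     [0.8, 0.3, True],
     [0.5, 0.7, False]],
    dtype=object,
)
X = array[:, 0:2]
Y = array[:, 2]

# Inspect what sklearn sees before the cast. The traceback in the question
# reports the label type as 'unknown' at this stage.
y_dtype_before = Y.dtype
target_before = type_of_target(Y)
print(y_dtype_before, target_before)

# Cast the labels to int (and the features to float) so that sklearn
# receives plain numeric arrays instead of object arrays.
Y_fixed = Y.astype(int)
X_fixed = X.astype(float)
target_after = type_of_target(Y_fixed)
print(Y_fixed.dtype, target_after)

# With numeric labels the estimator can be fitted.
model = LogisticRegression(solver="lbfgs").fit(X_fixed, Y_fixed)
print(model.predict(X_fixed))
```

In the question's script the same cast would presumably go right after the train_test_split call, before the cross-validation loop. Checking Y_train.dtype there is a quick way to confirm whether this is in fact the cause.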