I use Python and scikit-learn for classification.
Is it possible to reuse the parameters a classifier has learned?
For example:
from sklearn.svm import SVC
cl = SVC(...) # create svm classifier with some hyperparameters
cl.fit(X_train, y_train)
params = cl.get_params()
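Let's store these params somewhere as a dictionary of strings, or even write them to a JSON file. Suppose we later want to use this trained classifier to make predictions on some data. Trying to restore it: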
params = ... # retrieve these parameters stored somewhere as a dictionary
data = ... # the data we want to make predictions on
cl = SVC(...)
cl.set_params(**params)
predictions = cl.predict(data)
If I do this, I get a NotFittedError with the following stack trace:
File "C:\Users\viacheslav\Python\Python36-32\lib\site-packages\sklearn\svm\base.py", line 548, in predict
y = super(BaseSVC, self).predict(X)
File "C:\Users\viacheslav\Python\Python36-32\lib\site-packages\sklearn\svm\base.py", line 308, in predict
X = self._validate_for_predict(X)
File "C:\Users\viacheslav\Python\Python36-32\lib\site-packages\sklearn\svm\base.py", line 437, in _validate_for_predict
check_is_fitted(self, 'support_')
File "C:\Users\viacheslav\Python\Python36-32\lib\site-packages\sklearn\utils\validation.py", line 768, in check_is_fitted
raise NotFittedError(msg % {'name': type(estimator).__name__})
sklearn.exceptions.NotFittedError: This SVC instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.
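Is it possible to set the parameters on a classifier and make predictions without fitting it? How can I do that?
Answer 0 (score: 6)
Please read about model persistence in scikit-learn: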
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
joblib.dump(clf, 'filename.pkl')
and later:
clf = joblib.load('filename.pkl')
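The reason the approach in the question fails is that get_params() returns only the constructor hyperparameters (C, kernel, and so on), not the state learned during fit (such as support_). The following is a minimal sketch of the full round trip, assuming the standalone joblib package is installed; the synthetic dataset, the hyperparameter values, and the file name svc_model.pkl are placeholders for illustration, not part of the original question.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic data standing in for X_train / y_train (an assumption).
X, y = make_classification(n_samples=100, n_features=5, random_state=0)

clf = SVC(kernel="rbf", C=1.0)  # hypothetical hyperparameters
clf.fit(X, y)

# get_params() holds only hyperparameters, not the learned support vectors.
params = clf.get_params()
has_learned_state = "support_" in params

# Persist the whole fitted estimator instead of its hyperparameters.
path = os.path.join(tempfile.mkdtemp(), "svc_model.pkl")  # placeholder path
joblib.dump(clf, path)

# Later, possibly in another process: load and predict without refitting.
restored = joblib.load(path)
predictions_match = (restored.predict(X) == clf.predict(X)).all()
```

Note that files written this way are pickles: they should only be loaded from trusted sources, and they are not guaranteed to load correctly under a different scikit-learn version than the one that wrote them.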