I am trying to save a KNN model to PMML in Anaconda, but it does not work.
My script:
import numpy as np
import pandas as pd
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

#### load iris dataset
iris_dt = pd.read_csv('iris.csv', header=0)

#### Create development and evaluation samples
# .iloc replaces the removed .ix indexer; the first four columns are the features
X_train_dev, X_test, y_train_dev, y_test = train_test_split(iris_dt.iloc[:, 0:4],
                                                            iris_dt['Species'],
                                                            test_size=0.05,
                                                            random_state=36851235,
                                                            stratify=iris_dt['Species'])

#### Train KNNClassifier
# tune CV (random_state only takes effect when shuffle=True)
crossv = StratifiedKFold(n_splits=10, shuffle=True, random_state=36851234)
# parameter space for RandomizedSearchCV
param_grid = {'n_neighbors': np.arange(1, 30)}
knn = KNeighborsClassifier()
knn_randomcv = RandomizedSearchCV(knn,
                                  param_grid,
                                  n_iter=15,
                                  scoring='f1_weighted',
                                  cv=crossv,
                                  random_state=36851232)
knn_randomcv = knn_randomcv.fit(X_train_dev, y_train_dev)
# choose best estimator
knn_best_random = knn_randomcv.best_estimator_

#### Save best estimator as PMML
pipeline = PMMLPipeline([("knn_best_estimator", knn_best_random)])
pipeline.active_fields = X_train_dev.columns.values
pipeline.target_field = y_train_dev.name
sklearn2pmml(pipeline, "KNNFit_py.pmml", debug=True)
My debug log:
When I launch the Java converter directly, I get a more detailed error:
SEVERE: Failed to convert
java.lang.ClassCastException: numpy.core.Scalar cannot be cast to java.lang.Number
    at sklearn.neighbors.KNeighborsClassifier.getNumberOfNeighbors(KNeighborsClassifier.java:70)
    at sklearn.neighbors.KNeighborsUtil.encodeNeighbors(KNeighborsUtil.java:130)
    at sklearn.neighbors.KNeighborsClassifier.encodeModel(KNeighborsClassifier.java:57)
    at sklearn.neighbors.KNeighborsClassifier.encodeModel(KNeighborsClassifier.java:32)
    at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:161)
    at org.jpmml.sklearn.Main.run(Main.java:145)
    at org.jpmml.sklearn.Main.main(Main.java:94)
Exception in thread "main" java.lang.ClassCastException: numpy.core.Scalar cannot be cast to java.lang.Number
    at sklearn.neighbors.KNeighborsClassifier.getNumberOfNeighbors(KNeighborsClassifier.java:70)
    at sklearn.neighbors.KNeighborsUtil.encodeNeighbors(KNeighborsUtil.java:130)
    at sklearn.neighbors.KNeighborsClassifier.encodeModel(KNeighborsClassifier.java:57)
    at sklearn.neighbors.KNeighborsClassifier.encodeModel(KNeighborsClassifier.java:32)
    at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:161)
    at org.jpmml.sklearn.Main.run(Main.java:145)
    at org.jpmml.sklearn.Main.main(Main.java:94)
Please help.
Answer 0 (score: 1)
According to the documentation:
n_neighbors : int, optional (default = 5)
Number of neighbors to use by default for kneighbors queries.
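n_neighbors should be a plain int.
When you use np.arange(1, 30), the values it yields are NumPy integer scalars (numpy.int64 on most platforms), not Python's built-in int. As far as I can tell, the JPMML-SkLearn converter behind sklearn2pmml cannot handle numpy.int64 in place of int, which would explain the error: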
numpy.core.Scalar cannot be cast to java.lang.Number
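The type difference is easy to check directly. A minimal sketch, assuming only NumPy is installed; the variable names are illustrative and not taken from the question:

```python
import numpy as np

# Elements taken from a NumPy array are NumPy scalars, not built-in ints.
from_arange = np.arange(1, 30)[0]
# Elements of a built-in range are plain Python ints.
from_range = range(1, 30)[0]
# .tolist() converts every element of the array to a built-in int.
from_tolist = np.arange(1, 30).tolist()[0]

arange_is_builtin_int = isinstance(from_arange, int)
arange_is_numpy_integer = isinstance(from_arange, np.integer)
range_is_builtin_int = isinstance(from_range, int)
tolist_is_builtin_int = isinstance(from_tolist, int)

print(type(from_arange).__name__, type(from_range).__name__)
print(arange_is_builtin_int, range_is_builtin_int, tolist_is_builtin_int)
```

Whichever value the search picks from the grid is stored on the estimator unchanged, so a NumPy scalar in the grid ends up as a NumPy scalar in the pickled model.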
Change the parameter grid to:
param_grid = {'n_neighbors': range(1, 30)}
and the error should go away.
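If the search has already been run with a NumPy-based grid, another option might be to convert the selected values back to built-in types before exporting. This is only a sketch under that assumption: `to_builtin` is a hypothetical helper, and the dictionary below stands in for something like `knn_randomcv.best_params_`.

```python
import numpy as np


def to_builtin(params):
    """Return a copy of params with NumPy scalars converted to Python built-ins."""
    return {
        # np.generic is the base class of all NumPy scalar types;
        # .item() returns the equivalent built-in Python value.
        name: value.item() if isinstance(value, np.generic) else value
        for name, value in params.items()
    }


# Stand-in for the parameters chosen by the search (illustrative values).
best_params = {"n_neighbors": np.int64(7), "weights": "uniform"}
clean_params = to_builtin(best_params)

print(clean_params, type(clean_params["n_neighbors"]).__name__)
```

Calling `knn_best_random.set_params(**clean_params)` before building the PMMLPipeline should leave a plain int on the estimator, though I have not verified this against the converter.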
Edit: I posted a GitHub issue about the problem here.