Question

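The Azure ML Studio environment raises the error below when it uses a pickle file from a custom Python model. The pickle file of the locally built Python model works fine in my local environment, but not in the Azure ML Studio environment.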

Error 0085: The following error occurred during script evaluation, please view the output log for more information:
---------- Start of error message from Python interpreter ----------
Caught exception while executing function: Traceback (most recent call last):
  File "C:\server\invokepy.py", line 199, in batch
    odfs = mod.azureml_main(*idfs)
  File "C:\temp\b1cb10c870d842b9afcf8bb8037155a1.py", line 49, in azureml_main
    return DATA, model.predict_proba(DATA)
  File "C:\pyhome\lib\site-packages\sklearn\ensemble\forest.py", line 540, in predict_proba
    n_jobs, _, _ = _partition_estimators(self.n_estimators, self.n_jobs)
  File "C:\pyhome\lib\site-packages\sklearn\ensemble\base.py", line 101, in _partition_estimators
    n_jobs = min(_get_n_jobs(n_jobs), n_estimators)
  File "C:\pyhome\lib\site-packages\sklearn\utils\__init__.py", line 456, in _get_n_jobs
    if n_jobs < 0:
TypeError: unorderable types: NoneType()

What is missing?

The Python pickle file works fine in the local environment.

# The script MUST contain a function named azureml_main
# which is the entry point for this module.

# imports up here can be used to
import pandas as pd
import sys
import pickle
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
import numpy as np
import os

def azureml_main(DATA = None, dataframe2 = None):

    # Execution logic goes here
    # print('Input pandas.DataFrame #1:\r\n\r\n{0}'.format(DATA))

    # If a zip file is connected to the third input port,
    # it is unzipped under ".\Script Bundle". This directory is added
    # to sys.path. Therefore, if your zip file contains a Python file
    # mymodule.py you can import it using:
    # import mymodule

    sys.path.append('.\\Script Bundle\\MyLocalModel.zip')
    sys.path.insert(0, '.\\Script Bundle')
    model = pickle.load(open('.\\Script Bundle\\MyLocalModel.pkl', 'rb'))

    #result = pd.DataFrame(model.predict_proba(dataframe1), columns=['p0','p1'])

    # Return value must be of a sequence of pandas.DataFrame
    return DATA, model.predict_proba(DATA)

I need to use the custom Python model in Azure ML Studio, deploy it as a web service, and get the same output as the local model.

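Update 1 (April 17):

The Python version, 2.7.11, is the same locally and in Azure ML Studio, but the sklearn version differs: 0.18.x locally versus 0.15.x in Azure ML Studio. For example, train_test_split has to be imported differently, as in the code below: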

##from sklearn.model_selection import train_test_split ## works only with 0.18.x
import sklearn
from sklearn.cross_validation import train_test_split ## works only with 0.15.x
print ('sklearn version {0}'.format(sklearn.__version__))
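
One way to keep a single script working under both sklearn versions is to try each module location in turn. The following is only a minimal sketch: the helper name import_first_available is hypothetical and not part of sklearn or Azure ML Studio, and it relies only on importlib from the standard library.

```python
import importlib


def import_first_available(module_names, attribute):
    """Return `attribute` from the first module in `module_names` that provides it.

    Hypothetical helper for illustration: modules that cannot be imported,
    or that lack the attribute, are skipped.
    """
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # this module does not exist in the installed version
        if hasattr(module, attribute):
            return getattr(module, attribute)
    raise ImportError(
        "none of {0} provides {1}".format(list(module_names), attribute)
    )
```

With sklearn installed, it could be called as train_test_split = import_first_available(["sklearn.model_selection", "sklearn.cross_validation"], "train_test_split"), which picks the 0.18.x location when it exists and otherwise falls back to the 0.15.x one.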

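1) How can I update the sklearn package in Azure ML Studio to the latest version? The alternative is to downgrade my local sklearn to match; I will experiment with that.

2) As another exercise, I created a model in Azure ML Studio using the MDF [Multiclass Decision Forest] algorithm, and a local one using the RFC [RandomForestClassifier] algorithm, but the two outputs are completely different and do not match.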

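The following code is from the local environment, using the RFC algorithm with sklearn 0.18.x:

## Random forest classifier in the local environment with sklearn 0.18.x
from sklearn.ensemble import RandomForestClassifier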

## Random Forest Classifier
rfc = RandomForestClassifier(n_estimators = 550,max_depth = 6,max_features = 30,random_state = 0) 
rfc.fit(X_train,y_train)
print (rfc)

## Accuracy test
accuracy = rfc.score(X_test1,y_test1)
print ("Accuracy is {}".format(accuracy))

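3) I replicated the local Python code in an Azure ML Studio Execute Python Script module with the lower sklearn version, 0.15.x. Apart from a few rows of the test dataset, this produced the same output as the local run. Now, how can I feed the model trained in the Python script into the Train Model module as an untrained model? Or should I write the pickle file to a DataSet and use it as a custom model?

Your input is much appreciated.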

Answer 1

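The most likely cause is that the version of pickle you used to serialize the model differs from the version Azure ML Studio uses to deserialize it. Check the properties of the Execute Python Script module to see which Anaconda / Python versions are available.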
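
To compare the two environments, it may help to run the same small report both locally and inside the Execute Python Script module, then look at the differences. This is a minimal sketch under assumptions: the function name environment_report is hypothetical, and sklearn is treated as optional so the script still runs where it is not installed.

```python
import pickle
import platform


def environment_report():
    """Collect the version details that affect loading a pickled model."""
    report = {
        "python": platform.python_version(),
        # Highest pickle protocol this interpreter can read and write.
        "pickle_highest_protocol": pickle.HIGHEST_PROTOCOL,
    }
    try:
        import sklearn
        report["sklearn"] = sklearn.__version__
    except ImportError:
        report["sklearn"] = None  # sklearn is not installed here
    return report


if __name__ == "__main__":
    for key, value in sorted(environment_report().items()):
        print("{0}: {1}".format(key, value))
```

If the reports differ, aligning the local Python and sklearn versions with the ones Azure ML Studio offers before pickling the model is the more conservative option, since a pickled estimator is generally not guaranteed to load correctly under a different sklearn version.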
