How do I write a prediction function for mlr that can be published to AzureML as a web service?

Asked: 2019-04-02 22:34:07

Tags: r azure-machine-learning-studio mlr

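I am trying to upload an R model to AzureML as a web service. The model uses the mlr package in R and its predict function, and the output of mlr's predict is an object of class "PredictionClassif" "Prediction". For a linear model such as regression, I use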

PredictAction <- function(inputdata){
  predict(RegModel, inputdata, type="response")
}

This works fine in Azure.

When I use the mlr package for classification with predict type probability, I have to write the prediction function as

PredictAction <- function(inputdata){
  require(mlr)
  predict(randomForest,newdata=inputdata)
}

When I call the function

publishWebService(ws, fun, name, inputSchema)

it produces the error

converting `inputSchema` to data frame
Error in convertArgsToAMLschema(lapply(x, class)) : 
  Error: data type "table" not supported

Because the predict function produces a table that I don't know how to convert or modify, I pass an output schema

publishWebService(ws, fun, name, inputSchema,outputschema)

I am not sure how to specify the output schema (see https://cran.r-project.org/web/packages/AzureML/AzureML.pdf).

The output schema is a list, while mlr's predict function produces output of class

class(pred_randomForest)
"PredictionClassif" "Prediction"

and its data element is a data frame

class(pred_randomForest$data)
"data.frame"

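I am looking for help with the syntax of outputSchema in the publishWebService function, or with whether I need to pass any other arguments to it. I am not sure where the problem is: whether AzureML cannot read the wrapped model, or whether mlr's predict function is being executed correctly in AzureML at all.

I get the following error in AzureML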

Execute R Script Piped (RPackage) : The following error occurred during evaluation of R script: R_tryEval: return error: Error in UseMethod("predict") : no applicable method for 'predict' applied to an object of class "c('FilterModel', 'BaseWrapperModel', 'WrappedModel')" 
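
One possible way to address both points in the question is to have the scoring function return only the plain data frame held in the Prediction object, and to load mlr inside the function so that the `predict` method for `WrappedModel` can be found. The "no applicable method" message above is what R reports when no such method is visible in the session. The sketch below is a minimal illustration, not a verified AzureML deployment: the `iris` data, the `classif.rpart` learner, and the names `model` and `predict_action` are assumptions chosen for the example and stand in for the asker's own model.

```r
library(mlr)

# Assumption: a small stand-in model trained on iris, with probability output.
task    <- makeClassifTask(data = iris, target = "Species")
learner <- makeLearner("classif.rpart", predict.type = "prob")
model   <- train(learner, task)

# Hypothetical scoring function: data frame in, plain data frame out.
predict_action <- function(inputdata) {
  library(mlr)                       # makes predict() dispatch on WrappedModel
  pred <- predict(model, newdata = inputdata)
  pred$data                          # a data.frame, not the Prediction object
}

# Local check with feature columns only (no target column).
scored <- predict_action(iris[1:5, 1:4])
print(class(scored))
print(names(scored))
```

A function of this shape takes a data frame and returns a data frame, so it could be passed as `fun` to `publishWebService` with `data.frame = TRUE`, as the answer below does. Whether mlr is actually installed on the AzureML side still has to be checked separately; the `packages` argument described in the AzureML manual linked in the question may help here, though that has not been tested in this sketch.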

1 answer:

Answer 0 (score: 0)

Here is an example that uses the XGBoost library in R:

library("AzureML") # workspace(), download.datasets(), publishWebService()
library("xgboost") # the main algorithm
##Load the Azure workspace. You can find the ID and the pass in your workspace
ws <- workspace(
id = "Your workspace ID",
auth = "Your Auth Pass"
)
##Download the dataset
dataset <- download.datasets(ws, name = "Breast cancer data", quote="\"")
## split the dataset to get train and score data
## 75% of the sample size
smp_size <- floor(0.75 * nrow(dataset))
## set the seed to make your partition reproducible
set.seed(123)
## get index to split the dataset
train_ind <- sample(seq_len(nrow(dataset)), size = smp_size)
##Split train and test data
train_dataset <- dataset[train_ind, ]
test_dataset <- dataset[-train_ind, ]
#Get the features columns
features<-train_dataset[ , ! colnames(train_dataset) %in% c("Class") ]
#get the label column
labelCol <-train_dataset[,c("Class")]
#convert to data matrix
test_gboost<-data.matrix(test_dataset)
train_gboost<-data.matrix(train_dataset)
#train model
bst <- xgboost(data = train_gboost, label = train_dataset$Class, max.depth = 2, eta = 1,
nround = 2, objective = "binary:logistic")
#predict the model
pred <- predict(bst,test_gboost )
#Score model
test_dataset$Scorelabel<-pred
test_dataset$Scoreclasses<- as.factor(as.numeric(pred >= 0.5))
# Create the scoring function
predict_xgboost <- function(new_data){
predictions <- predict(bst, data.matrix(new_data))
output <- data.frame(new_data, ScoredLabels =predictions)
output
}
#Publish the score function
api <- publishWebService(
ws,
fun = predict_xgboost,
name = "xgboost classification",
inputSchema = as.data.frame(as.table(train_gboost)),
data.frame = TRUE)
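
A note on the `inputSchema` line: in base R, `as.data.frame(as.table(x))` applied to a matrix returns a long-format data frame with one row per cell (columns `Var1`, `Var2`, `Freq`), not one column per feature, whereas `as.data.frame(x)` keeps one column per feature. The sketch below uses a small made-up matrix `m` to show the difference. Which form AzureML expects for this particular service is not verified here, so treat it as something to check rather than a confirmed fix.

```r
# Assumption: a tiny stand-in for the numeric matrix built with data.matrix().
m <- cbind(a = c(1, 2, 3), b = c(4, 5, 6))

long <- as.data.frame(as.table(m))  # one row per cell of the matrix
wide <- as.data.frame(m)            # one column per original column

print(names(long))
print(dim(long))
print(names(wide))
print(dim(wide))
```

If the scoring function is meant to receive one column per feature, the wide form matches the shape of `new_data` that `predict_xgboost` passes to `data.matrix`.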