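I am using R's cloudml package together with the tfestimators package to build and deploy a DNN classifier with TensorFlow, so that I can use it to predict a binary response (the "Response" column).
So far I have successfully deployed the model to Google Cloud ML Engine. It was not easy.
I am stuck on the last piece of the puzzle: getting predictions from my cloud model. Specifically, I need help structuring the data for the "instances" argument of the "cloudml_predict" function. This is the only package documentation with a useR example that I have found online, but I don't think it applies to my case.
Example dataset: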
library(dplyr)
my_data <- tibble(
  Response = sample(c("Y", "N"), 1000, replace = TRUE),
  Col1 = sample(c("A", "B"), 1000, replace = TRUE),
  Col2 = sample(c("Happy", "Sad"), 1000, replace = TRUE),
  Col3 = sample(1:100, 1000, replace = TRUE),
  Col4 = sample(500:1000, 1000, replace = TRUE),
  Col5 = runif(1000, 0, 1)
)
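As a quick check on this kind of data, the sketch below lists which columns are numeric and which are character, since the training script picks its numeric feature columns with is.numeric and treats the character columns as categorical. It uses a small base-R data frame called example_data, which is a hypothetical stand-in for the real data so that the snippet runs without dplyr.

```r
# Hypothetical stand-in for my_data (smaller, base R only)
set.seed(1)
example_data <- data.frame(
  Response = sample(c("Y", "N"), 10, replace = TRUE),
  Col1 = sample(c("A", "B"), 10, replace = TRUE),
  Col3 = sample(1:100, 10, replace = TRUE),
  Col5 = runif(10, 0, 1),
  stringsAsFactors = FALSE
)

# TRUE for columns that is.numeric() would select as numeric features
col_is_numeric <- vapply(example_data, is.numeric, logical(1))

numeric_names <- names(example_data)[col_is_numeric]
character_names <- names(example_data)[!col_is_numeric]

print(numeric_names)
print(character_names)
```

Note that the response column is character as well, so it has to be excluded from the features separately.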
Here is my "train.R" script:
library(tfestimators)
library(caret) # for even partitioning of dataset
# Training flags ----
FLAGS <- flags(
  flag_numeric("num_epochs", 100)
)
# Create feature columns ----
numeric_cols <- colnames(my_data)[unlist(lapply(my_data, is.numeric))]
cols <- feature_columns(
  column_numeric(numeric_cols),
  column_indicator(
    column_categorical_with_vocabulary_list(
      "Col1",
      vocabulary_list = unique(my_data$Col1),
      dtype = tf$string
    )
  ),
  column_indicator(
    column_categorical_with_vocabulary_list(
      "Col2",
      vocabulary_list = unique(my_data$Col2),
      dtype = tf$string
    )
  ),
  column_indicator(
    column_crossed(
      keys = c("Col1", "Col2"),
      hash_bucket_size = 20L
    )
  )
)
# Split the data into training & test sets ----
set.seed(3456)
inTrain <- createDataPartition(my_data$Response,
                               p = 0.8,
                               list = FALSE,
                               times = 1)
training <- my_data[inTrain, ]
testing <- my_data[-inTrain, ]
#' Now we need to create an input function that lists the input and output
#' variables.
pred_fn <- function(data, num_epochs = 10) {
  input_fn(data,
           features = setdiff(names(data), "Response"),
           response = "Response",
           num_epochs = num_epochs)
}
#' Build a deep learning classifier
#' We will create 3 hidden layers, with 80, 40 and 30 nodes, respectively
classifier <- dnn_classifier(
  feature_columns = cols,
  hidden_units = c(80, 40, 30),
  n_classes = 2L,
  label_vocabulary = c("N", "Y"),
  model_dir = "./tmp",
  activation_fn = "relu"
)
classifier %>%
  tfestimators::train(input_fn = pred_fn(training, num_epochs = FLAGS$num_epochs))
classifier %>% tfestimators::evaluate(input_fn = pred_fn(testing))
classifier %>% predict(input_fn = pred_fn(testing))
#' Help on serving_input_receiver_fn:
#' https://rdrr.io/github/rstudio/tfdeploy/src/tests/testthat/models/tfestimators-example.R
tfestimators::export_savedmodel(
  classifier,
  export_dir_base = "savedmodel1",
  serving_input_receiver_fn = tf$estimator$export$build_parsing_serving_input_receiver_fn(
    classifier_parse_example_spec(
      feature_columns = cols,
      weight_column = NULL,
      label_key = "label"
    )
  ),
  as_text = FALSE
)
Deployment script:
library(cloudml)
# Train ----
job <- cloudml_train("train.R")
# Find directory of exported trained model
latest_run()$run_dir
# Deploy the model
# Use result from latest_run()$run_dir to populate file path here
cloudml_deploy("runs/cloudml_2018_10_08_113833889/savedmodel1", name = "test_model")
# Prediction (where I need help) ----
instances <- list()
instances[["instances"]] <- list()
instances$instances[["inputs"]] <- apply(testing[1:1, -1], 1, as.list)
cloudml_predict(name = "test_model",
                version = "test_model_1",
                instances = instances,
                verbose = TRUE)
The closest I have gotten is the following error message:
Error in cloudml_predict(name = "test_model", version = "test_model_1", :
Prediction failed: Error processing input: Expected string, got...
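Two things seem worth checking here, and neither is a confirmed fix. First, apply() on a data frame goes through as.matrix(), so with mixed column types every value, including the numeric ones, is converted to character before as.list sees it. Second, as far as I understand, a model exported with build_parsing_serving_input_receiver_fn expects serialized tf.Example protos rather than raw feature values, which would be consistent with "Expected string". The sketch below only illustrates the first point in base R. The names sample_rows and row_to_instance are hypothetical, and whether the resulting structure is what "cloudml_predict" needs for this particular model is an assumption.

```r
# Hypothetical stand-in for a few rows of `testing` without the Response column
sample_rows <- data.frame(
  Col1 = c("A", "B"),
  Col2 = c("Happy", "Sad"),
  Col3 = c(10L, 20L),
  Col5 = c(0.25, 0.75),
  stringsAsFactors = FALSE
)

# apply() first coerces the data frame to a character matrix,
# so the numeric columns arrive in as.list() as strings
coerced <- apply(sample_rows[1, ], 1, as.list)

# Hypothetical helper: one named list per row, keeping each column's type
row_to_instance <- function(df, i) as.list(df[i, , drop = FALSE])

instances <- lapply(
  seq_len(nrow(sample_rows)),
  function(i) row_to_instance(sample_rows, i)
)

str(coerced[[1]])    # every element is character
str(instances[[1]])  # Col3 stays integer, Col5 stays double
```

Building the rows with lapply() keeps one named list per instance, so numeric features are not silently turned into strings before the request is serialized.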