Using the prediction function in the cloudml R package

Time: 2018-10-08 15:45:57

Tags: r tensorflow machine-learning google-cloud-ml

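I am using R's cloudml package together with the tfestimators package to build and deploy a DNN classifier with TensorFlow, so that I can use it to predict a binary response (the "Response" column).

So far I have successfully deployed the model to Google Cloud ML Engine. It wasn't easy.

I'm stuck on the last piece of the puzzle: getting predictions from my cloud model. Specifically, I need help structuring the data for the "instances" argument of the "cloudml_predict" function. This is the only package documentation with examples for useRs that I have found online, but I don't think it applies to my case.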

Example dataset:

library(dplyr)

# tibble() replaces the deprecated data_frame(); dplyr re-exports it
my_data <- tibble(
    Response = sample(c("Y", "N"), 1000, replace = TRUE),
    Col1 = sample(c("A", "B"), 1000, replace = TRUE),
    Col2 = sample(c("Happy", "Sad"), 1000, replace = TRUE),
    Col3 = sample(1:100, 1000, replace = TRUE),
    Col4 = sample(500:1000, 1000, replace = TRUE),
    Col5 = runif(1000, 0, 1)
)

Here is my "train.R" script:

library(tfestimators)
library(caret)        # for even partitioning of dataset

# Training flags ----
FLAGS <- flags(
    flag_numeric("num_epochs", 100)
)

# Create feature columns ----
numeric_cols <- colnames(my_data)[unlist(lapply(my_data, is.numeric))]

cols <- feature_columns(
    column_numeric(numeric_cols),
    column_indicator(
        column_categorical_with_vocabulary_list(
            "Col1",
            vocabulary_list = unique(my_data$Col1),
            dtype = tf$string
        )
    ),
    column_indicator(
        column_categorical_with_vocabulary_list(
            "Col2",
            vocabulary_list = unique(my_data$Col2),
            dtype = tf$string
        )
    ),
    column_indicator(
        column_crossed(
            keys = c("Col1", "Col2"),
            hash_bucket_size = 20L
        )
    )
)



# Split the data into training & test sets ----
set.seed(3456)

inTrain <- createDataPartition(my_data$Response,
                               p = 0.8,
                               list = FALSE,
                               times = 1)

training <- my_data[inTrain, ]
testing  <- my_data[-inTrain, ]

#' Now we need to create an input function with the listing of input and output
#' variables.

pred_fn <- function(data, num_epochs = 10) {
    input_fn(data,
             features = setdiff(names(data), "Response"),
             response = "Response",
             num_epochs = num_epochs)
}

#' Build a deep learning classifier
#' We will create 3 hidden layers, with 80, 40 and 30 nodes, respectively

classifier <- dnn_classifier(
    feature_columns = cols,
    hidden_units = c(80, 40, 30),
    n_classes = 2L,
    label_vocabulary = c("N", "Y"),
    model_dir = "./tmp",
    activation_fn = "relu"
)

classifier %>% tfestimators::train(input_fn = pred_fn(training, num_epochs = FLAGS$num_epochs))
classifier %>% tfestimators::evaluate(input_fn = pred_fn(testing))
classifier %>% predict(input_fn = pred_fn(testing))

#' Help on serving_input_receiver_fn:
#' https://rdrr.io/github/rstudio/tfdeploy/src/tests/testthat/models/tfestimators-example.R
tfestimators::export_savedmodel(classifier,
                                export_dir_base = "savedmodel1",
                                serving_input_receiver_fn = tf$estimator$export$build_parsing_serving_input_receiver_fn(
                                    classifier_parse_example_spec(
                                        feature_columns = cols,
                                        weight_column = NULL,
                                        label_key = "label"
                                    )),
                                as_text = FALSE)
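
The 80/20 split in train.R relies on caret's createDataPartition. As a rough illustration of what a stratified split does, here is a base R sketch that samples 80% of the rows within each class of the response. stratified_split is a hypothetical helper written for this example; it is not caret's implementation, and the rows it selects will differ from caret's.

```r
# Hypothetical helper (not part of caret): returns training row indices,
# sampling a fraction p of the rows within each class of y.
stratified_split <- function(y, p = 0.8) {
    idx_by_class <- split(seq_along(y), y)
    train_idx <- lapply(idx_by_class, function(idx) {
        n_train <- ceiling(p * length(idx))
        # sample(idx, ...) misbehaves when idx has length one, so index explicitly
        idx[sample.int(length(idx), n_train)]
    })
    sort(unlist(train_idx, use.names = FALSE))
}

set.seed(3456)
response <- sample(c("Y", "N"), 1000, replace = TRUE)
in_train <- stratified_split(response, p = 0.8)

# Class proportions in the training rows should be close to the full data
prop.table(table(response))
prop.table(table(response[in_train]))
```

Splitting within each class keeps the share of "Y" and "N" roughly equal in the training and test sets, which is the reason for preferring it over a plain random sample of rows.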

Deployment script:

library(cloudml)

# Train ----
job <- cloudml_train("train.R")

# Find directory of exported trained model
latest_run()$run_dir

# Deploy the model
# Use result from latest_run()$run_dir to populate file path here
cloudml_deploy("runs/cloudml_2018_10_08_113833889/savedmodel1", name = "test_model")


# Prediction (where I need help) ----

instances <- list()
instances[["instances"]] <- list()
instances$instances[["inputs"]] <- apply(testing[1:1, -1], 1, as.list)

cloudml_predict(name = "test_model",
                version = "test_model_1",
                instances = instances,
                verbose = TRUE)

The closest I have gotten is the following error message:

Error in cloudml_predict(name = "test_model", version = "test_model_1",  : 
  Prediction failed: Error processing input: Expected string, got... 
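
One detail that may be worth checking when building the instances: apply() converts a data frame to a matrix before iterating, so with mixed character and numeric columns every value reaches as.list as a character string. The base R sketch below contrasts that with building one named list per row, which keeps numeric columns numeric. rows_to_instances is a hypothetical helper, not part of cloudml, and the small features data frame is a stand-in for the feature columns of testing. Whether the deployed model accepts this shape is an assumption; it depends on the serving input function used at export.

```r
# Small stand-in for the feature columns of `testing` (no Response column)
features <- data.frame(
    Col1 = c("A", "B"),
    Col2 = c("Happy", "Sad"),
    Col3 = c(10L, 20L),
    Col5 = c(0.25, 0.75),
    stringsAsFactors = FALSE
)

# apply() coerces the mixed data frame to a character matrix first
via_apply <- apply(features[1, ], 1, as.list)
str(via_apply[[1]])   # every element is a character string

# Hypothetical helper: one named list per row, column types preserved
rows_to_instances <- function(df) {
    lapply(seq_len(nrow(df)), function(i) as.list(df[i, , drop = FALSE]))
}

instances <- rows_to_instances(features)
str(instances[[1]])   # Col3 stays integer, Col5 stays double
```

The result is a plain list with one element per row, without the extra "instances" and "inputs" nesting used in the deployment script above.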

0 answers:

No answers yet.