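h2o deeplearning checkpoint model

Date: 2015-10-02 20:43:07

Tags: deep-learning h2o checkpointing

Folks,

I ran into a problem when trying to resume h2o deep learning from a checkpointed model while also supplying a validation frame. It says "Validation dataset must be the same as for the checkpointed model", but I believe I am using the same validation dataset. If I leave validation_frame empty, resuming from the checkpoint works fine. My code is below: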

library(h2o)

# Start a local H2O cluster using all available cores
localh2o <- h2o.init(nthreads = -1)

# Load MNIST: columns 1-784 are the predictors, column 785 is the label
train_image.hex <- read.csv("mnist_train.csv", header = FALSE)
train_image.hex[, 785] <- factor(train_image.hex[, 785])
train_image.hex <- as.h2o(train_image.hex)
test_image.hex <- read.csv("mnist_test.csv", header = FALSE)
test_image.hex[, 785] <- factor(test_image.hex[, 785])
test_image.hex <- as.h2o(test_image.hex)

# Initial training run, validated on the test set
mnist_model <- h2o.deeplearning(
  x = 1:784, y = 785,
  training_frame = train_image.hex,
  validation_frame = test_image.hex,
  activation = "RectifierWithDropout", hidden = c(500, 1000),
  input_dropout_ratio = 0.2,
  hidden_dropout_ratios = c(0.5, 0.5), adaptive_rate = TRUE,
  rho = 0.98, epsilon = 1e-7,
  l1 = 1e-8, l2 = 1e-7, max_w2 = 10,
  epochs = 10, export_weights_and_biases = TRUE,
  variable_importances = FALSE
)

# Save the model to disk so it can be reloaded in a later session
h2o.saveModel(mnist_model, path = "/tmp", force = TRUE)

I then shut down h2o, quit R, and restart h2o in a fresh R session to resume training. This is where h2o throws the error:

library(h2o)

localh2o <- h2o.init(nthreads = -1)

# Reload the same training and test data as in the first session
train_image.hex <- read.csv("mnist_train.csv", header = FALSE)
train_image.hex[, 785] <- factor(train_image.hex[, 785])
train_image.hex <- as.h2o(train_image.hex)
test_image.hex <- read.csv("mnist_test.csv", header = FALSE)
test_image.hex[, 785] <- factor(test_image.hex[, 785])
test_image.hex <- as.h2o(test_image.hex)

# Load the model saved by the first session
startmodel <- h2o.loadModel("/tmp/DeepLearning_model_R_1443812402059_20", localh2o)

# Resume from the checkpoint; this call fails when validation_frame is supplied
mnist_model <- h2o.deeplearning(
  x = 1:784, y = 785,
  checkpoint = startmodel@model_id,
  training_frame = train_image.hex,
  validation_frame = test_image.hex,
  activation = "RectifierWithDropout", hidden = c(500, 1000),
  input_dropout_ratio = 0.2,
  hidden_dropout_ratios = c(0.5, 0.5), adaptive_rate = TRUE,
  rho = 0.98, epsilon = 1e-7,
  l1 = 1e-8, l2 = 1e-7, max_w2 = 10,
  epochs = 10, export_weights_and_biases = TRUE,
  variable_importances = FALSE
)
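
For anyone hitting the same error, here is a rough stopgap that follows from the observation above that resuming works when validation_frame is left out: resume without it, then score the test set as a separate step. This is only a sketch, not a fix for the underlying check. `resume_and_score` and its argument names are hypothetical, and the default `epochs = 20` assumes (as far as I understand H2O's checkpoint behaviour) that `epochs` is the total count and must exceed what the checkpoint already trained.

```r
# Sketch only: resume a checkpointed model WITHOUT a validation frame,
# then evaluate on held-out data separately.
# `resume_and_score` is a hypothetical helper, not part of the h2o package.
resume_and_score <- function(checkpoint_id, train, test,
                             x = 1:784, y = 785, epochs = 20, ...) {
  model <- h2o::h2o.deeplearning(
    x = x, y = y,
    checkpoint = checkpoint_id,
    training_frame = train,  # validation_frame deliberately omitted
    epochs = epochs,         # assumption: total epochs, larger than before
    ...                      # remaining hyperparameters, same as the first run
  )
  # Score the test frame explicitly (second argument passed by position)
  perf <- h2o::h2o.performance(model, test)
  list(model = model, performance = perf)
}

# Example call, using the objects from the session above:
# result <- resume_and_score(startmodel@model_id, train_image.hex, test_image.hex,
#                            activation = "RectifierWithDropout",
#                            hidden = c(500, 1000))
```

The drawback is that validation metrics are no longer tracked during training; you only get a score once the resumed run has finished.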

2 Answers:

Answer 0 (score: 0)

Thanks for pointing this out to us. I have filed a JIRA ticket, and you can track its progress here: https://0xdata.atlassian.net/browse/PUBDEV-2182

You should have a fix soon.

Thanks!


Answer 1 (score: 0)

Please retry with the latest version. This should work now.
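
Before retrying, it may be worth confirming which h2o release your R session actually picks up. The answer does not say which release contains the fix, so the sketch below only reports the installed version for comparison against the JIRA ticket. `installed_h2o_version` is a hypothetical helper name; `requireNamespace` and `packageVersion` are standard R.

```r
# Sketch only: report the installed h2o R package version, if any.
# `installed_h2o_version` is a hypothetical helper, not part of h2o.
installed_h2o_version <- function() {
  if (!requireNamespace("h2o", quietly = TRUE)) {
    return(NA_character_)  # h2o package is not installed
  }
  as.character(utils::packageVersion("h2o"))
}

print(installed_h2o_version())
```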