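I want to create a data frame that shows the accuracy for different seeds and different deep learning activation methods. I wrote code with two nested loops (see below), but I get an error. How can I write this loop correctly?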
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page, :
ERROR MESSAGE:
Can only append one column
Here is my code:
library(h2o)
attach(iris)
train <- iris
test <- iris
invisible(capture.output(h2o.init(nthreads = -1))) # initialising with all CPU cores
trainHex <- as.h2o(train[1:200,])
testHex <- as.h2o(test)
x_names <- colnames(trainHex[1:4])
SEED <- c(123456789, 12345678, 1234567)
method <- c("Rectifier", "Tanh", "TanhWithDropout", "RectifierWithDropout", "Maxout", "MaxoutWithDropout")
Res <- data.frame()
for (i in 1:6) {
  for (j in 1:3) {
    system.time(ann <- h2o.deeplearning(
      reproducible = TRUE,
      seed = SEED[j],
      x = x_names,
      y = "Species",
      training_frame = trainHex, epochs = 50,
      standardize = TRUE,
      nesterov_accelerated_gradient = T, # for speed
      activation = method[i]
    ))
    #ann
    testHex$h20 <- ifelse(predict(ann, newdata = testHex) > 0.5, 1, 0)
    testHex <- as.data.frame(testHex)
    s <- xtabs(~Species + h20, data = testHex)
    accuracy <- sum(diag(s)) / sum(s)
    tmp <- data.frame(seed = SEED[j], method = method[i], result = accuracy)
    Res <- rbind(Res, tmp)
  }
}
Answer 0 (score: 1)
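You are doing multinomial classification, i.e. the prediction will be one of three classes. So h2o.predict() returns 4 columns, and trying to assign all of them to the single column testHex$h20 is what fails with "Can only append one column":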
> predict(ann,newdata = testHex)
|========================================================================================================================================================| 100%
  predict    setosa   versicolor    virginica
1  setosa 0.9999930 7.032604e-06 1.891484e-30
2  setosa 0.9998726 1.274161e-04 2.791200e-28
3  setosa 0.9999923 7.679687e-06 1.101218e-29
4  setosa 0.9999838 1.619749e-05 1.593254e-28
5  setosa 0.9999978 2.150244e-06 7.174795e-31
6  setosa 0.9999932 6.844831e-06 5.511857e-29
[150 rows x 4 columns]
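To make the shape of that output concrete, here is a small base-R sketch (no h2o needed) that copies the first three rows of the probability columns above into an ordinary data frame. It assumes, as is usual for multinomial output, that the label in the predict column is simply the class with the largest probability, so thresholding at 0.5 is not needed. The names probs and label are just for illustration.

```r
# First three rows of the per-class probability columns shown above.
probs <- data.frame(
  setosa     = c(0.9999930, 0.9998726, 0.9999923),
  versicolor = c(7.032604e-06, 1.274161e-04, 7.679687e-06),
  virginica  = c(1.891484e-30, 2.791200e-28, 1.101218e-29)
)

# For each row, pick the name of the column holding the largest probability.
label <- colnames(probs)[max.col(as.matrix(probs), ties.method = "first")]
label
```

So a single label column is all you need for an accuracy calculation; the three probability columns are extra detail.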
I'm not entirely sure what you are trying to do, but with that in mind, get the predictions like this:
p = predict(ann,newdata = testHex)
You can then get a 1 for each correct answer and a 0 for each wrong one:
p$predict == testHex$Species
Or, do it client-side:
p = as.data.frame( predict(ann,newdata = testHex) )
p$predict == iris$Species
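If you do want to keep the two loops, one way to structure them is sketched below. This is only a sketch under some assumptions: accuracy_of, run_experiments and fit_predict_h2o are hypothetical helper names (not part of h2o), and fit_predict_h2o assumes that h2o is installed, h2o.init() has been called, and trainHex and testHex already exist. Accuracy is taken as the share of rows where the predicted label equals the true one.

```r
# Accuracy = share of rows where the predicted label equals the true label.
accuracy_of <- function(predicted, actual) {
  mean(as.character(predicted) == as.character(actual))
}

# Run every (method, seed) combination and collect one result row per run.
# `fit_predict` is a hypothetical callback: given a seed and an activation
# name, it must return a vector of predicted labels for the test rows.
run_experiments <- function(seeds, methods, fit_predict, actual) {
  rows <- list()
  for (m in methods) {
    for (s in seeds) {
      predicted <- fit_predict(seed = s, activation = m)
      rows[[length(rows) + 1]] <- data.frame(
        seed = s, method = m, result = accuracy_of(predicted, actual)
      )
    }
  }
  do.call(rbind, rows)
}

# h2o-backed callback. Assumes a running cluster and existing trainHex/testHex.
fit_predict_h2o <- function(seed, activation) {
  ann <- h2o::h2o.deeplearning(
    x = 1:4, y = "Species", training_frame = trainHex,
    activation = activation, seed = seed, reproducible = TRUE, epochs = 50
  )
  pred <- h2o::h2o.predict(ann, newdata = testHex)
  as.data.frame(pred)$predict  # keep only the label column
}

# Usage, with SEED and method as defined in the question:
# Res <- run_experiments(SEED, method, fit_predict_h2o, iris$Species)
```

Note that testHex is never overwritten inside the loop here, so it stays an H2O frame for every iteration.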
More generally, h2o.grid() is better suited to trying alternative parameters. I think this may be closer to what you intended:
parts = h2o.splitFrame(as.h2o(iris), 0.8, seed=123)
trainHex = parts[[1]]
testHex = parts[[2]]
g = h2o.grid("deeplearning",
  hyper_params = list(
    seed = c(123456789,12345678,1234567),
    activation = c("Rectifier", "Tanh", "TanhWithDropout", "RectifierWithDropout", "Maxout", "MaxoutWithDropout")
    ),
  reproducible = TRUE,
  x = 1:4,
  y = 5,
  training_frame = trainHex,
  validation_frame = testHex,
  epochs = 1
  )
g  #Output the grid
(I set epochs to 1 just so it finishes quickly. Set it to 50 if you like.)
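I used splitFrame() to use 80% of the data for training and 20% for testing; by assigning the test data to validation_frame, the grid automatically evaluates each model on that unseen data for us.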
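To get the grid results into a plain data frame (which is what the question was after), something like the following should work. Treat it as a sketch: grid_summary is a hypothetical helper name, it assumes the h2o package is installed and that g is the finished grid from above, and I picked logloss as the sort metric because it is available for multinomial models. The function is only defined here; nothing talks to the cluster until you call it.

```r
# Sketch only: assumes the h2o package is installed, a cluster is running,
# and the argument is the grid object returned by h2o.grid().
# `grid_summary` is a hypothetical helper name, not part of h2o.
grid_summary <- function(grid, metric = "logloss") {
  # Re-fetch the grid sorted by the chosen metric (lower logloss is better).
  sorted <- h2o::h2o.getGrid(grid@grid_id, sort_by = metric, decreasing = FALSE)
  # The summary table has one row per model, including its hyper-parameters.
  as.data.frame(sorted@summary_table)
}

# Usage, once the grid has finished:
# Res <- grid_summary(g)
# head(Res)
```

Because validation_frame was supplied, the metric in that table should be the one computed on the held-out 20%.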