How to correctly create a loop with the h2o package

Time: 2017-01-02 14:19:54

Tags: r loops h2o

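I want to create a data frame that shows the accuracy obtained with different seeds and deep learning activation methods. I wrote code with two nested loops (see below), but I get an error. How can I write this loop correctly?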

Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page,  : 
ERROR MESSAGE:

Can only append one column

My code is attached:

attach(iris)
train<-iris
test<-iris
invisible(capture.output(h2o.init(nthreads = -1))) # initializing with all CPU cores
trainHex <- as.h2o(train[1:200,])
testHex <- as.h2o(test)
x_names  <- colnames(trainHex[1:4])
SEED<-c(123456789,12345678,1234567)
method<-c("Rectifier", "Tanh", "TanhWithDropout", "RectifierWithDropout", "Maxout", "MaxoutWithDropout")
Res<-data.frame()

for(i in 1:6){
    for(j in 1:3){

        system.time(ann <- h2o.deeplearning(
            reproducible = TRUE,
            seed = SEED[j],
            x = x_names,
            y = "Species",
            training_frame = trainHex,epochs = 50,
            standardize = TRUE,
            nesterov_accelerated_gradient = T, # for speed
            activation = method[i] 
        ))
        #ann
        testHex$h20<-ifelse(predict(ann,newdata = testHex)>0.5,1,0)
        testHex<-as.data.frame(testHex)
        s<-xtabs(~Species +h20,data=testHex )
        accuracy<-sum(diag(s))/sum(s)
        tmp<-data.frame(seed=SEED[j],method=method[i],result=accuracy)
        Res<-rbind(Res,tmp)

    }
}

1 answer:

Answer 0 (score: 1)

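You are doing multinomial classification; that is, each prediction is one of three classes. Because of that, h2o.predict() returns 4 columns (the predicted label plus one probability per class), and a 4-column frame cannot be assigned to the single column testHex$h20: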

> predict(ann,newdata = testHex)
  |========================================================================================================================================================| 100%
  predict    setosa   versicolor    virginica
1  setosa 0.9999930 7.032604e-06 1.891484e-30
2  setosa 0.9998726 1.274161e-04 2.791200e-28
3  setosa 0.9999923 7.679687e-06 1.101218e-29
4  setosa 0.9999838 1.619749e-05 1.593254e-28
5  setosa 0.9999978 2.150244e-06 7.174795e-31
6  setosa 0.9999932 6.844831e-06 5.511857e-29

[150 rows x 4 columns]

I'm not entirely sure what you are trying to do, but with that in mind, get the predictions like this:

p = predict(ann,newdata = testHex)

You can then get a 1 for each correct answer and a 0 for each wrong one like this:

p$predict == testHex$Species

Or do it client-side:

p = as.data.frame( predict(ann,newdata = testHex) )
p$predict == iris$Species
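If you still want the explicit loop from the question, here is a rough sketch of how it could be repaired using the client-side comparison above. `accuracy_of()` and `compare_activations()` are hypothetical helper names of my own, not part of h2o, and I am assuming the response column is called "Species". The two changes from the original loop are that only the `predict` column is used, and that `testHex` is never overwritten with a data frame (in the original, the second iteration would have been handed a plain data frame instead of an H2OFrame).

```r
# Share of predictions that match the true labels.
# Comparing as character avoids the "level sets of factors are different"
# error when the two factors do not share the same set of levels.
accuracy_of <- function(predicted, actual) {
  mean(as.character(predicted) == as.character(actual))
}

# Hypothetical helper (not an h2o function): trains one model per
# activation/seed pair and returns one row of results per pair.
# Assumes trainHex and testHex are H2OFrames on a running cluster.
compare_activations <- function(trainHex, testHex, activations, seeds,
                                y = "Species", epochs = 50) {
  x <- setdiff(colnames(trainHex), y)       # every column except the response
  actual <- as.data.frame(testHex)[[y]]     # pull the true labels to the client once
  res <- data.frame()
  for (act in activations) {
    for (s in seeds) {
      ann <- h2o::h2o.deeplearning(
        x = x, y = y, training_frame = trainHex,
        activation = act, seed = s, reproducible = TRUE, epochs = epochs
      )
      # Four columns come back; only `predict` holds the predicted label
      p <- as.data.frame(h2o::h2o.predict(ann, newdata = testHex))
      res <- rbind(res, data.frame(seed = s, method = act,
                                   result = accuracy_of(p$predict, actual)))
    }
  }
  res
}
```

After h2o.init() and once the two frames exist, `Res <- compare_activations(trainHex, testHex, method, SEED)` should give one row per seed/method combination, which is the data frame the question asks for. Note that the question trains and tests on the same iris rows, so the accuracy will look optimistic unless you hold some rows out.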

More generally, h2o.grid() is better suited to trying out alternative parameters. I think this may be closer to what you intended:

parts = h2o.splitFrame(as.h2o(iris), 0.8, seed=123)
trainHex = parts[[1]]
testHex = parts[[2]]

g = h2o.grid("deeplearning",
 hyper_params = list(
   seed = c(123456789,12345678,1234567),
   activation = c("Rectifier", "Tanh", "TanhWithDropout", "RectifierWithDropout", "Maxout", "MaxoutWithDropout")
   ),
 reproducible = TRUE,
 x = 1:4,
 y = 5,
 training_frame = trainHex,
 validation_frame = testHex,
 epochs = 1
 )
g  #Output the grid

(I set epochs to 1 just so it finishes quickly. Set it to 50 if you prefer.)

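I used splitFrame() to take 80% of the data for training and 20% for testing; because the test data is passed as validation_frame, the grid automatically scores every model on that unseen data.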
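To read the ranking back out of the grid, something along these lines should work. `best_from_grid()` is a hypothetical wrapper name of my own; the h2o calls inside it are h2o.getGrid() and h2o.getModel(). I sort by logloss because I am confident it is available for multinomial models; other metric names are an assumption you would need to check against your h2o version.

```r
# Hypothetical wrapper (not an h2o function): re-sorts a finished grid by a
# metric and fetches the top-ranked model.
# Assumes `grid` is the object returned by h2o.grid() on a running cluster.
best_from_grid <- function(grid, metric = "logloss", decreasing = FALSE) {
  # Lower logloss is better, hence decreasing = FALSE by default
  sorted <- h2o::h2o.getGrid(grid@grid_id, sort_by = metric,
                             decreasing = decreasing)
  # model_ids follows the sort order, so the first entry is the best model
  best <- h2o::h2o.getModel(sorted@model_ids[[1]])
  list(sorted = sorted, best = best)
}
```

With the grid above, `out <- best_from_grid(g)` followed by `out$sorted` prints the seed/activation combinations in ranked order, and `h2o.performance(out$best, valid = TRUE)` shows the validation metrics of the winning model.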