Why do I get different R2 values between predicted and actual data on a test set from h2o and caret::R2()?

Asked: 2018-11-30 06:05:55

Tags: r h2o

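I want to compute R2 between the predicted and actual values on a test set. Why does the result from h2o.performance(m, test) differ from the one given by caret::R2() or by an "lm" model?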

'h2o.performance(m, test)' reports 0.733401, while 'caret::R2(p, a)' returns 0.7577784. summary(lmm)$r.squared gives the same value as 'caret::R2(p, a)'.

Sample code:

library(h2o)

h <- h2o.init()
data <- as.h2o(iris)
part <- h2o.splitFrame(data, 0.7, seed = 123)
train <- part[[1]]
test <- part[[2]]

m <- h2o.glm(x = 2:5, y = 1, training_frame = train, nfolds = 10, seed = 123)

summary(m)
predictions <- h2o.predict(m, test)

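# Bring predictions and actual values back into R as data frames
p <- as.data.frame(predictions)
a <- as.data.frame(test[1])
caret::R2(p, a)
# 0.7577784
h2o.performance(m, test)
# the R^2 is 0.733401
# Regress the predictions on the actual Sepal.Length values
df <- data.frame(p = p, a = a)
lmm <- lm(predict ~ Sepal.Length, data = df)
summary(lmm)$r.squared
# the r.squared is 0.7577784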

1 answer:

Answer 0: (score: 1)

You can retrieve the training metrics like this:

m <- h2o.glm(x = 2:5, y = 1, training_frame = train, validation_frame = test)
# We would ideally use a validation set.

h2o.performance(m, test)
m@model$training_metrics
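
One possible reason for the gap in the question, offered as a sketch rather than a confirmed diagnosis: the two numbers may come from different definitions of R2. caret::R2() defaults to the squared correlation between predictions and observations (formula = "corr"), which is also what lm() reports when one is regressed on the other. h2o is assumed here to report 1 - SSE/SST; that assumption should be checked against the h2o documentation. The base R example below needs neither h2o nor caret. It uses a plain lm() fit as a stand-in for the h2o GLM, and the variable names are illustrative only.

```r
# Sketch only: base R, no h2o or caret required. All names are illustrative.
set.seed(123)

# Hold out roughly 30% of iris, similar in spirit to the split in the question
idx   <- sample(nrow(iris), size = round(0.7 * nrow(iris)))
train <- iris[idx, ]
test  <- iris[-idx, ]

# Ordinary linear model standing in for the h2o GLM (an assumption)
fit  <- lm(Sepal.Length ~ ., data = train)
pred <- predict(fit, newdata = test)
obs  <- test$Sepal.Length

# Definition 1: squared Pearson correlation between predictions and actuals
r2_corr <- cor(pred, obs)^2

# Definition 2: 1 - SSE/SST, which penalises bias and mis-scaling in the predictions
r2_trad <- 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)

# Regressing predictions on actuals reproduces definition 1
r2_lm <- summary(lm(pred ~ obs))$r.squared

print(c(corr = r2_corr, traditional = r2_trad, lm = r2_lm))
```

On held-out data the squared correlation is never smaller than 1 - SSE/SST, which would be consistent with 0.7577784 versus 0.733401 in the question. If caret is available, caret::R2(p, a, formula = "traditional") should give the 1 - SSE/SST form for a direct comparison with the h2o figure.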