Question

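I am building a Poisson model with h2o.deeplearning, so I use the log of exposure as an offset (I will need other offset values in the future as well). Because these are logs of numbers less than 1, all of the offset values are negative.

When scoring, I noticed that negative offsets are not being used.

I am using the latest h2o version on CRAN, 3.22.1.1.

R version 3.6.0.

Reproducible example: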

# Setup
set.seed(1234)
install.packages("h2o")
install.packages("insuranceData")
library(h2o)
library(insuranceData)
h2o.init()
h2o.getVersion()  # "3.22.1.1"

data(dataCar)
dataCar$log_exposure <- log(dataCar$exposure)
# log_exposure contains negative values
summary(dataCar$log_exposure)

# Build model
dataCar.h2o <- as.h2o(dataCar)
dl <- h2o.deeplearning(x = c("veh_value", "veh_body", "veh_age", "gender", "area", "agecat"),
                       y = "numclaims",
                       training_frame = dataCar.h2o,
                       nfolds = 5,
                       offset_column = "log_exposure",
                       distribution = "poisson",
                       reproducible = TRUE)

# Prediction Scenarios
dataCar_offsetadj <- dataCar[1,]

# Offset 0
dataCar_offsetadj$log_exposure <- 0
pred1 <- h2o.predict(dl, as.h2o(dataCar_offsetadj))
print(pred1)
# 0.1058487

# Offset 1.5
dataCar_offsetadj$log_exposure <- 1.5
pred2 <- h2o.predict(dl, as.h2o(dataCar_offsetadj))
print(pred2)
# 0.4743808  # Different value, as expected

# Offset -1.5
dataCar_offsetadj$log_exposure <- -1.5
pred3 <- h2o.predict(dl, as.h2o(dataCar_offsetadj))
print(pred3)
# 0.1058487  # Same value as offset 0

# Offset -2
dataCar_offsetadj$log_exposure <- -2
pred4 <- h2o.predict(dl, as.h2o(dataCar_offsetadj))
print(pred4)
# 0.1058487  # again, same value

all.equal(pred1, pred3)  # TRUE
all.equal(pred1, pred4)  # TRUE

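I would expect the predictions for the negative offsets to all be different. The positive offset gives a different score, as expected.

It looks as though h2o is not applying negative offsets. Is that correct? What is the reason for it? Does it also mean that negative offsets are not used when the model is built? How can I work around this? In the future I will need to build more deep learning models with different distributions and negative offsets.

Any advice is appreciated, thank you.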
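To make that expectation concrete, here is a minimal base-R sketch of the values I would expect. It assumes the Poisson model uses a log link, so that a prediction equals exp(offset) times the prediction at offset 0. That is my assumption, not confirmed h2o behaviour, and `expected_pred` is a helper made up for this illustration. The two input numbers are the predictions printed above.

```r
# Predictions reported above for row 1 of dataCar.
pred_offset_0   <- 0.1058487   # scored with log_exposure = 0
pred_offset_pos <- 0.4743808   # scored with log_exposure = 1.5

# Assumption: log link, so prediction = exp(offset) * prediction at offset 0.
# expected_pred is a hypothetical helper, not part of h2o.
expected_pred <- function(pred_at_zero, offset) {
  pred_at_zero * exp(offset)
}

# The positive offset agrees with this relationship up to rounding.
pos_matches <- isTRUE(all.equal(expected_pred(pred_offset_0, 1.5),
                                pred_offset_pos,
                                tolerance = 1e-5))
print(pos_matches)

# Values the negative offsets (-1.5 and -2) would give under the same
# relationship, in contrast to the unchanged 0.1058487 that was returned.
expected_neg <- expected_pred(pred_offset_0, c(-1.5, -2))
print(round(expected_neg, 4))
```

If the assumption holds, the offsets -1.5 and -2 should score at roughly 0.0236 and 0.0143 rather than 0.1058487. This only illustrates the scoring side. It says nothing about whether negative offsets were used during training.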

h2o.deeplearning does not use negative offsets

0 answers: