I have the following data.table:
fixed acidity volatile acidity citric acid residual sugar chlorides free sulfur dioxide total sulfur dioxide density pH
1: 7.0 0.27 0.36 20.7 0.045 45 170 1.00100 3.00
2: 6.3 0.30 0.34 1.6 0.049 14 132 0.99400 3.30
3: 8.1 0.28 0.40 6.9 0.050 30 97 0.99510 3.26
4: 7.2 0.23 0.32 8.5 0.058 47 186 0.99560 3.19
5: 7.2 0.23 0.32 8.5 0.058 47 186 0.99560 3.19
sulphates alcohol quality
1: 0.45 8.8 Bad wine
2: 0.49 9.5 Bad wine
3: 0.44 10.1 Bad wine
4: 0.40 9.9 Bad wine
5: 0.40 9.9 Bad wine
I can run
system.time(model_glm <- h2o.glm(x = 1:11, y = 12, training_frame = wine.train.h2o,
validation_frame = wine.test.h2o, seed = 42,
family = "binomial"))
to train a GLM on this dataset. Afterwards, to get the partial dependence plots, I can use
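# Assumption: glm_pp already holds the list of tables returned by
# h2o.partialPlot(object = model_glm, data = wine.train.h2o, plot = FALSE)
glm_pp <- rbindlist(lapply(glm_pp, function(x){melt(x, id.vars="mean_response")}))
ggplot(glm_pp, aes(x=value, y=mean_response)) + geom_point() + facet_wrap(~variable, scales="free_x") +
geom_smooth(method="loess") + theme_pl() + ggtitle("Partial dependence plot")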
In my case, y is quality, which is a binary variable.
If my dependent variable has 3 or more categories, how can I get the partial dependence plots, that is, when I run the glm with family = "multinomial"?
Answer 0 (score: 2)
Currently, H2O's partial dependence implementation supports binomial and regression models. Multinomial models are not yet compatible.
-Nav
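
Since the built-in partial dependence does not cover multinomial models, one possible workaround is to compute partial dependence by hand: fix one feature at each value on a grid, predict for every row, and average the predicted probability of each class. The following is only a minimal base-R sketch of that idea, not H2O's implementation. The names partial_dependence_multiclass, predict_fun and toy_predict are hypothetical, and the toy coefficients and the three class labels are invented purely so the example runs on its own.

```r
# Manual partial dependence for a multiclass model (base R only).
# predict_fun is a hypothetical user-supplied function: it takes a data.frame
# and returns a matrix of class probabilities with one named column per class.
partial_dependence_multiclass <- function(data, feature, predict_fun, n_grid = 20) {
  # Evenly spaced grid over the observed range of the feature
  grid <- seq(min(data[[feature]]), max(data[[feature]]), length.out = n_grid)
  rows <- lapply(grid, function(v) {
    modified <- data
    modified[[feature]] <- v            # set the feature to v for every row
    avg <- colMeans(predict_fun(modified))  # mean predicted probability per class
    data.frame(variable = feature, value = v,
               class = names(avg), mean_response = as.numeric(avg))
  })
  do.call(rbind, rows)
}

# Toy stand-in for a fitted multinomial model (made-up softmax coefficients).
toy_predict <- function(df) {
  scores <- cbind(Bad = 0,
                  Medium = 0.5 * df$alcohol - 5,
                  Good = 1.0 * df$alcohol - 11)
  exp(scores) / rowSums(exp(scores))    # each row sums to 1
}

# Two columns taken from the rows shown in the question
toy_data <- data.frame(alcohol = c(8.8, 9.5, 10.1, 9.9, 9.9),
                       pH = c(3.00, 3.30, 3.26, 3.19, 3.19))

pd <- partial_dependence_multiclass(toy_data, "alcohol", toy_predict, n_grid = 5)
print(pd)
```

The result is in long format (variable, value, class, mean_response), so it could be plotted much like the binomial case, with one line or facet per class. With a real H2O multinomial model, predict_fun would presumably wrap h2o.predict and keep only the per-class probability columns; that wiring is an assumption and is not shown or tested here.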