On AWS, I launched a g2.2xlarge EC2 instance from the community AMI ami-97591381 (h2o version: 3.13.0.356), following the instructions here.
Here is my code, which you can run now that I've made the S3 links public:
library(h2o)
library(jsonlite)
library(curl)

localH2O = h2o.init()
df.truth <- h2o.importFile("https://s3.amazonaws.com/nw.data.test.us.east/df.truth.zeroed", header = T, sep = ",")
df.truth$isFemale <- h2o.asfactor(df.truth$isFemale)
hotnames.truth <- fromJSON("https://s3.amazonaws.com/nw.data.test.us.east/hotnames.json", simplifyVector = T)

# Training and validation sets
splits <- h2o.splitFrame(df.truth, c(0.9), seed = 1234)
train.truth <- h2o.assign(splits[[1]], "train.truth.hex")
valid.truth <- h2o.assign(splits[[2]], "valid.truth.hex")

# Train a model using non-GPU deeplearning
dl.2 <- h2o.deeplearning(
  training_frame = train.truth, model_id = "dl.2",
  validation_frame = valid.truth,
  x = setdiff(hotnames.truth[1:(length(hotnames.truth)/2)], c("isFemale", "nwtcs")),
  y = "isFemale", stopping_metric = "AUTO", seed = 1,
  sparse = F, mini_batch_size = 20)

# Train a model using GPU-enabled deepwater
dw.2 <- h2o.deepwater(
  training_frame = train.truth, model_id = "dw.2",
  validation_frame = valid.truth,
  x = setdiff(hotnames.truth[1:(length(hotnames.truth)/2)], c("isFemale", "nwtcs")),
  y = "isFemale", stopping_metric = "AUTO", seed = 1,
  sparse = F, mini_batch_size = 20)
When I inspect the two models, I'm surprised to see a large difference in logloss:
print(dl.2)
Model Details:
==============
H2OBinomialModel: deeplearning
Model ID: dl.2
Status of Neuron Layers: predicting isFemale, 2-class classification, bernoulli distribution, CrossEntropy loss, 160,802 weights/biases, 2.0 MB, 1,041,465 training samples, mini-batch size 1
  layer units      type dropout       l1       l2 mean_rate rate_rms momentum mean_weight weight_rms mean_bias bias_rms
1     1   600     Input  0.00 %
2     2   200 Rectifier  0.00 % 0.000000 0.000000  0.104435 0.102760 0.000000    0.018904   0.144034  0.150630 0.415525
3     3   200 Rectifier  0.00 % 0.000000 0.000000  0.031395 0.055490 0.000000   -0.023333   0.081914  0.545394 0.251275
4     4     2   Softmax          0.000000 0.000000  0.001541 0.001438 0.000000    0.029091   0.295439 -0.004396 0.357609
H2OBinomialMetrics: deeplearning
** Reported on training data. **
** Metrics reported on temporary training frame with 9877 samples **
MSE: 0.1213733
RMSE: 0.3483868
LogLoss: 0.388214
Mean Per-Class Error: 0.2563669
AUC: 0.8433182
Gini: 0.6866365
Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
          0    1    Error        Rate
0      6546 1079 0.141508  =1079/7625
1       836 1416 0.371226   =836/2252
Totals 7382 2495 0.193885  =1915/9877
H2OBinomialMetrics: deeplearning
** Reported on validation data. **
** Metrics reported on full validation frame **
MSE: 0.126671
RMSE: 0.3559087
LogLoss: 0.4005941
Mean Per-Class Error: 0.2585051
AUC: 0.8309913
Gini: 0.6619825
Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
           0    1    Error         Rate
0      11746 3134 0.210618  =3134/14880
1       1323 2995 0.306392   =1323/4318
Totals 13069 6129 0.232160  =4457/19198
print(dw.2)
Model Details:
==============
H2OBinomialModel: deepwater
Model ID: dw.2b
Status of Deep Learning Model: MLP: [200, 200], 630.8 KB, predicting isFemale, 2-class classification, 1,708,160 training samples, mini-batch size 20
  input_neurons     rate momentum
1           600 0.000369 0.900000
H2OBinomialMetrics: deepwater
** Reported on training data. **
** Metrics reported on temporary training frame with 9877 samples **
MSE: 0.1615781
RMSE: 0.4019677
LogLoss: 0.629549
Mean Per-Class Error: 0.3467246
AUC: 0.7289561
Gini: 0.4579122
Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
          0    1    Error        Rate
0      4843 2782 0.364852  =2782/7625
1       740 1512 0.328597   =740/2252
Totals 5583 4294 0.356586  =3522/9877
H2OBinomialMetrics: deepwater
** Reported on validation data. **
** Metrics reported on full validation frame **
MSE: 0.1651776
RMSE: 0.4064205
LogLoss: 0.6901861
Mean Per-Class Error: 0.3476629
AUC: 0.7187362
Gini: 0.4374724
Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
          0    1    Error         Rate
0      8624 6256 0.420430  =6256/14880
1      1187 3131 0.274896   =1187/4318
Totals 9811 9387 0.387697  =7443/19198
As shown above, the logloss differs substantially between the non-GPU and GPU models:
Logloss
+-----------------+---------+------+
|                 | non-GPU | GPU  |
+-----------------+---------+------+
| training data   | 0.39    | 0.63 |
| validation data | 0.40    | 0.69 |
+-----------------+---------+------+
I understand that, due to the stochastic nature of training, I'll get somewhat different results from run to run, but I wouldn't expect such a huge difference between non-GPU and GPU.
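For reference, one way to gauge how much of this could be run-to-run noise on the non-GPU side is h2o.deeplearning's reproducible flag, which forces deterministic, single-threaded training (at a large speed cost). A minimal sketch, reusing the frames from above:

# Sketch: quantify run-to-run noise. With reproducible = TRUE and a fixed
# seed, h2o.deeplearning trains deterministically on a single thread (slow).
dl.repro <- h2o.deeplearning(
  training_frame = train.truth, validation_frame = valid.truth,
  x = setdiff(hotnames.truth[1:(length(hotnames.truth)/2)], c("isFemale", "nwtcs")),
  y = "isFemale", seed = 1, reproducible = TRUE)

h2o.logloss(dl.repro, valid = TRUE)  # compare against the 0.40 reported above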
Answer 0 (score: 1)
h2o.deeplearning is H2O's built-in deep learning algorithm. It parallelizes well and works on big data, but it does not use GPUs.
h2o.deepwater is a wrapper around (probably) TensorFlow and is (probably) using your GPU (but it can run on the CPU, and it can use different backends).
In other words, this is not a CPU vs. GPU difference: you are comparing two different deep learning implementations.
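If you want to be explicit about which implementation and device you are testing, you can pin the deepwater backend and GPU usage yourself. A sketch based on the deepwater parameter docs; whether the "tensorflow" backend is actually available depends on how the deepwater jar on that AMI was built:

# Sketch: pin the deepwater backend and device explicitly.
h2o.deepwater.available()   # check that a deepwater backend is present at all

dw.tf <- h2o.deepwater(
  training_frame = train.truth, model_id = "dw.tf",
  validation_frame = valid.truth,
  x = setdiff(hotnames.truth[1:(length(hotnames.truth)/2)], c("isFemale", "nwtcs")),
  y = "isFemale", seed = 1,
  backend = "tensorflow",   # or "mxnet" / "caffe", depending on the build
  gpu = TRUE, device_id = 0,
  mini_batch_size = 20)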
By the way, I'd suggest you increase the number of epochs (from the default of 10 to 200; remember that means roughly 20x the run time) and see whether the difference persists. Or compare the scoring history charts and see whether TensorFlow gets there too, just needing, say, 50% more epochs to reach the same logloss score.
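A minimal sketch of that comparison (epochs, h2o.scoreHistory(), and plot() are standard h2o R API; the frames and predictor list are reused from the question):

# Sketch: retrain both models for 200 epochs, then compare scoring histories.
preds <- setdiff(hotnames.truth[1:(length(hotnames.truth)/2)], c("isFemale", "nwtcs"))

dl.long <- h2o.deeplearning(
  training_frame = train.truth, validation_frame = valid.truth,
  x = preds, y = "isFemale", seed = 1, epochs = 200)

dw.long <- h2o.deepwater(
  training_frame = train.truth, validation_frame = valid.truth,
  x = preds, y = "isFemale", seed = 1, epochs = 200, mini_batch_size = 20)

# Scoring history as a data frame: logloss per epoch on training/validation data.
h2o.scoreHistory(dl.long)
h2o.scoreHistory(dw.long)

# Or plot logloss over epochs for each model.
plot(dl.long, metric = "logloss")
plot(dw.long, metric = "logloss")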