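Environment
OS platform, distribution and version: Red Hat 7.5 ALT ppc64le
Python version (optional): Python 3.6 from DriverlessAI
CUDA/cuDNN version: CUDA 9.2, cuDNN 7.1
GPU model (optional): V100
CPU model: POWER9
Available RAM: 512 GB
R version: 3.4.1
TensorFlow version: 1.8.0 (built from source)
I am trying to use h2o.deepwater, which is included in DriverlessAI, from Python and R rather than through DAI's web GUI. I also want to use TensorFlow as the backend.
To do this, I set the environment variables so that the Python bundled with DriverlessAI is used.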
$ export PATH=/opt/h2oai/dai/python/bin:$PATH
$ export LD_LIBRARY_PATH=/opt/h2oai/dai/python/lib:/opt/h2oai/dai/lib:$LD_LIBRARY_PATH
$ export PYTHONPATH=/opt/h2oai/dai/cuda-9.2/lib/python3.6/site-packages
With this setup, h2o.deeplearning works fine.
gpu_xgb <- h2o.deeplearning(x = c("TemperatureCelcius","ExhaustVacuumHg","AmbientPressureMillibar","RelativeHumidity"),
y = "HourlyEnergyOutputMW",
training_frame = train
)
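However, h2o.deepwater fails with the error
"Unable to initialize the native Deep Learning backend: No backend found. Cannot build a Deep Water model."
Below are the script and the error output from running h2o.deepwater in R with the TensorFlow backend.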
$ cat t4.R
# Package Load
library(reticulate)
use_python("/opt/h2oai/dai/python/bin/python")
library(Metrics)
library(h2o)
h2o.init(max_mem_size = "500g")
# Data Load
df <- read.csv('/data/rpjt/R_script/user/yslee/powerplant_output.csv')
# Randomly sample 80% of the rows for the training set
set.seed(1)
train_idx <- sample(1:nrow(df), 0.8*nrow(df))
# h2o Dataset
train <- df[train_idx,]
test <- df[-train_idx,]
train <- as.h2o(train,col.types=c("string"))
test <- as.h2o(test,col.types=c("string"))
# h2o.deepwater model
gpu_dl <- h2o.deepwater(x = c("TemperatureCelcius","ExhaustVacuumHg","AmbientPressureMillibar","RelativeHumidity"),
y = "HourlyEnergyOutputMW",
training_frame = train,
backend = "tensorflow",
hidden = 10,
standardize = TRUE,
activation = "Tanh",
seed = 1234)
h2o.performance(gpu_dl, newdata = test)
$ Rscript t4.R
...
R is connected to the H2O cluster:
H2O cluster uptime: 16 minutes 30 seconds
H2O cluster timezone: Asia/Seoul
H2O data parsing timezone: UTC
H2O cluster version: 3.20.0.2
H2O cluster version age: 1 month and 22 days
H2O cluster name: dai
H2O cluster total nodes: 1
H2O cluster total memory: 227.37 GB
H2O cluster total cores: 128
H2O cluster allowed cores: 128
H2O cluster healthy: TRUE
H2O Connection ip: localhost
H2O Connection port: 54321
H2O Connection proxy: NA
H2O Internal Security: FALSE
H2O API Extensions: Algos, MLI, MLI-Driver, AutoML, Core V3, Core V4
R Version: R version 3.4.1 (2017-06-30)
|======================================================================| 100%
|======================================================================| 100%
| | 0%
java.lang.RuntimeException: Unable to initialize the native Deep Learning backend: No backend found. Cannot build a Deep Water model.
java.lang.RuntimeException: Unable to initialize the native Deep Learning backend: No backend found. Cannot build a Deep Water model.
at hex.deepwater.DeepWaterModelInfo.setupNativeBackend(DeepWaterModelInfo.java:267)
at hex.deepwater.DeepWaterModelInfo.<init>(DeepWaterModelInfo.java:214)
at hex.deepwater.DeepWaterModel.<init>(DeepWaterModel.java:227)
at hex.deepwater.DeepWater$DeepWaterDriver.buildModel(DeepWater.java:131)
at hex.deepwater.DeepWater$DeepWaterDriver.computeImpl(DeepWater.java:118)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:214)
at hex.deepwater.DeepWater$DeepWaterDriver.compute2(DeepWater.java:111)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1260)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Error: java.lang.RuntimeException: Unable to initialize the native Deep Learning backend: No backend found. Cannot build a Deep Water model.
Execution halted
install.packages("tensorflow") and library(tensorflow) both work fine in R:
$ ls -l /usr/local/lib64/R/library/tensorflow
total 12
-rw-rw-r-- 1 root root 2456 Aug 7 17:45 DESCRIPTION
drwxrwxr-x 5 root root 112 Aug 7 17:45 examples
drwxrwxr-x 2 root root 125 Aug 7 17:45 help
drwxrwxr-x 2 root root 39 Aug 7 17:45 html
-rw-rw-r-- 1 root root 1095 Aug 7 17:45 INDEX
drwxrwxr-x 2 root root 113 Aug 7 17:45 Meta
-rw-rw-r-- 1 root root 2713 Aug 7 17:45 NAMESPACE
drwxrwxr-x 2 root root 84 Aug 7 17:45 R
In addition, TensorFlow is installed in the Python that ships with DriverlessAI.
$ which python
/opt/h2oai/dai/python/bin/python
$ pip list | grep tensorflow
tensorflow 1.8.0
Answer 0 (score: 1)
Deep Water does not support the Power platform.
(Note that using Deep Water is no longer recommended; people are advised to use Keras directly instead.)
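Since the answer hinges on the CPU architecture, a quick check before calling h2o.deepwater can save a debugging session. The following is only a minimal sketch: `classify_arch` is a hypothetical helper (not part of H2O or DriverlessAI), and the list of architecture names it treats as Power is an assumption based on what `uname -m` commonly reports, such as the `ppc64le` from the question.

```shell
# Hypothetical helper (not part of H2O): classify an architecture string.
# Prints "power" for IBM POWER names as reported by `uname -m`, else "other".
classify_arch() {
  case "$1" in
    ppc64le|ppc64|ppc) echo "power" ;;
    *) echo "other" ;;
  esac
}

# Check the machine this script is running on.
arch="$(uname -m)"
if [ "$(classify_arch "$arch")" = "power" ]; then
  echo "Architecture $arch: Deep Water is reported as unsupported on Power."
else
  echo "Architecture $arch: not a Power machine; this check says nothing more."
fi
```

On the machine described in the question, `uname -m` should report `ppc64le`, so the first branch would apply. On other architectures the check is inconclusive, because a missing backend can have other causes.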