Running h2o in a Jupyter Scala notebook

Time: 2016-06-24 07:23:26

Tags: scala jupyter h2o

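I am trying to run h2o in a Jupyter notebook with a Scala kernel, so far without success. Can anyone give me a hint about what might be going wrong? The code I am executing is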

classpath.add("ai.h2o" % "sparkling-water-core_2.10" % "1.6.5")

import org.apache.spark.h2o._
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

val conf = new SparkConf().setAppName("appName").setMaster("local")
val sc = new SparkContext(conf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

val h2oContext = new H2OContext(sc).start()

It fails on the last line with the error

java.lang.NoClassDefFoundError: water/H2O
....

Printing the exception gives

java.lang.RuntimeException: Cannot launch H2O on executors: numOfExecutors=1, executorStatus=(driver,false) (Cannot launch H2O on executors: numOfExecutors=1, executorStatus=(driver,false))
org.apache.spark.h2o.H2OContextUtils$.startH2O(H2OContextUtils.scala:169)
org.apache.spark.h2o.H2OContext.start(H2OContext.scala:214)

1 answer:

Answer 0 (score: 2)

If you are using Toree,

in /usr/local/share/jupyter/kernels/apache_toree_scala/kernel.json

you should add --packages ai.h2o:sparkling-water-core_2.10:1.6.6 to __TOREE_SPARK_OPTS__, something like

"__TOREE_SPARK_OPTS__": "--master local[*] --executor-memory 12g --driver-memory 12g --packages ai.h2o:sparkling-water-core_2.10:1.6.6",

Then sc is created for you when the notebook is created, so you do not need to create sc again.
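
With that change in place, a notebook cell might look like the sketch below. It assumes the Toree kernel has been restarted after editing kernel.json, that --packages has resolved Sparkling Water and its dependencies onto the classpath, and that the kernel exposes its SparkContext as sc. The H2OContext call is the one from the question, reused without the hand-built SparkConf and SparkContext.

```scala
// Run in a Toree Scala notebook cell.
// Assumption: `sc` is the SparkContext that Toree creates at kernel startup,
// so no second SparkContext is constructed here.
import org.apache.spark.h2o._

// Same call as in the question, but on the kernel-provided context.
val h2oContext = new H2OContext(sc).start()
```

The classpath.add(...) line and the new SparkContext(conf) line from the question are dropped here, since the package is expected to come from --packages and the context from the kernel.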