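I tried to solve this by following these two threads (this and this). It works on my own virtual machine but not on Cloud Dataproc. I followed the same procedure on both, yet the error still occurs in the cloud, and it is the same error I previously saw on the virtual machine. What should I do on the cloud side to fix it?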
Answer 0 (score: 1)
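Did you run the full "git clone" steps from those linked threads? Do you actually need to modify jblas? If not, you should use --packages org.jblas:jblas:1.2.4 to pull it from Maven Central without any git clone or mvn install. On a fresh Dataproc cluster, the following works fine: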
$ spark-shell --packages org.jblas:jblas:1.2.4
Ivy Default Cache set to: /home/dhuo/.ivy2/cache
The jars for the packages stored in: /home/dhuo/.ivy2/jars
:: loading settings :: url = jar:file:/usr/lib/spark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.jblas#jblas added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found org.jblas#jblas;1.2.4 in central
downloading https://repo1.maven.org/maven2/org/jblas/jblas/1.2.4/jblas-1.2.4.jar ...
[SUCCESSFUL ] org.jblas#jblas;1.2.4!jblas.jar (605ms)
:: resolution report :: resolve 713ms :: artifacts dl 608ms
:: modules in use:
org.jblas#jblas;1.2.4 from central in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 1 | 1 | 1 | 0 || 1 | 1 |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
confs: [default]
1 artifacts copied, 0 already retrieved (10360kB/29ms)
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
ivysettings.xml file not found in HIVE_HOME or HIVE_CONF_DIR,/etc/hive/conf.dist/ivysettings.xml will be used
Spark context Web UI available at http://10.240.2.221:4040
Spark context available as 'sc' (master = yarn, app id = application_1501548510890_0005).
Spark session available as 'spark'.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.2.0
/_/
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_131)
Type in expressions to have them evaluated.
Type :help for more information.
scala> import org.jblas.DoubleMatrix
import org.jblas.DoubleMatrix
scala> :quit
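Additionally, if you need to submit a job that requires "packages" through Dataproc's job submission API, note that --packages is really syntactic sugar in the various Spark launcher scripts rather than a property of the Spark job itself. In that case you need to use the equivalent spark.jars.packages property instead, as explained in this StackOverflow answer.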
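As a rough sketch of what that could look like with the gcloud CLI: the cluster name, region, main class, and jar path below are hypothetical placeholders, not values from the question. Only the Maven coordinate and the spark.jars.packages property come from the answer above. The script prints the command by default and only runs it when RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical placeholder values; replace with your own.
CLUSTER="my-cluster"
REGION="us-central1"
MAIN_CLASS="com.example.MyJob"
APP_JAR="gs://my-bucket/my-job.jar"

# Maven coordinate taken from the answer above.
PACKAGES="org.jblas:jblas:1.2.4"

# Build the submit command as a string so it can be inspected first.
# spark.jars.packages is passed as a job property instead of --packages.
build_submit_cmd() {
  printf 'gcloud dataproc jobs submit spark --cluster=%s --region=%s --class=%s --jars=%s --properties=spark.jars.packages=%s' \
    "$CLUSTER" "$REGION" "$MAIN_CLASS" "$APP_JAR" "$PACKAGES"
}

CMD="$(build_submit_cmd)"
echo "$CMD"

# Only execute when explicitly requested (needs gcloud and a running cluster).
if [ "${RUN:-0}" = "1" ]; then
  eval "$CMD"
fi
```

Running the script as-is only echoes the command, which makes it easy to check the property string before submitting a real job.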