I'm running into a problem on an HDInsight Spark cluster with an application that needs two jars:
the first is JNR (com.github.jnr:jnr-constants:0.9.0),
the other is JNA (net.java.dev.jna:jna:4.1.0), which is required by the JRuby I use.
The problem is that whenever I run my application I get this error:
[Error] Exception java.lang.NoSuchMethodError : jnr.constants.platform.OpenFlags.defined()Z
If I remove the code that calls JNR, I get the same kind of problem with JNA:
my.process.check.run.checkRun$.main(checkRun.scala:219): [Error] Exception java.lang.NoSuchMethodError : com.sun.jna.Platform.is64Bit()Z
(the is64Bit()Z method does not exist in JNA v3.5.1)
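To check which jar these classes are actually loaded from at runtime, a small probe could be added to the job (a sketch on my side, using only standard JDK reflection; getCodeSource can return null for bootstrap classes):

// Sketch: print the physical location of the loaded JNA and JNR classes,
// to confirm whether the cluster's jna-3.5.1 is shadowing jna-4.1.0.
def jarOf(cls: Class[_]): String = {
  val src = cls.getProtectionDomain.getCodeSource
  if (src == null) "<bootstrap/unknown>" else src.getLocation.toString
}
println("JNA loaded from: " + jarOf(classOf[com.sun.jna.Platform]))
println("JNR loaded from: " + jarOf(Class.forName("jnr.constants.platform.OpenFlags")))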
I checked the worker nodes, and this is all they have:
myClusterUser@wn0-Test:/$ find . -name '*jna*.jar' 2>/dev/null
./usr/lib/hdinsight-scpnet/scp/jvm/jna-3.5.1.jar
./usr/hdp/2.4.2.0-258/storm/extlib/jna-3.5.1.jar
myClusterUser@wn0-Test:/$ find . -name '*jnr*.jar' 2>/dev/null
myClusterUser@wn0-Test:/$
On the head node I have this:
myClusterUser@hn0-Test:/$ find . -name '*jna*.jar' 2>/dev/null
./usr/lib/hdinsight-scpnet/scp/jvm/jna-3.5.1.jar
./usr/hdp/2.4.2.0-258/storm/extlib/jna-3.5.1.jar
myClusterUser@hn0-Test:/$ find . -name '*jnr*.jar' 2>/dev/null
myClusterUser@hn0-Test:/$
I tried using the --packages option:
spark-submit \
--verbose \
--packages net.java.dev.jna:jna:4.1.0,com.github.jnr:jnr-constants:0.9.0,org.jruby:jruby:9.0.1.0,com.databricks:spark-csv_2.10:1.4.0 \
--conf spark.executor.extraClassPath=./ \
--conf spark.driver.maxResultSize=2g \
--conf spark.executor.memory=1500m \
--conf spark.yarn.executor.memoryOverhead=500 \
--conf spark.executor.instances=2 \
--conf spark.sql.shuffle.partitions=4 \
--conf 'spark.executor.extraJavaOptions=-XX:PermSize=512M -XX:MaxPermSize=512M' \
--conf 'spark.driver.extraJavaOptions=-XX:PermSize=512M -XX:MaxPermSize=512M' \
--deploy-mode cluster \
--master yarn-cluster \
--class my.process.check.run.checkRun \
wasb:///checkRun/my-checkRun-1.0.6-SNAPSHOT-jar-with-dependencies.jar \
--nostdin \
--nodb \
--LOG_LEVEL 0
The PermSize options are there to avoid out-of-memory problems, since HDInsight runs on Java 7 rather than Java 8.
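If it is relevant: I have not tried the experimental user-classpath-first switches that exist in Spark 1.x (spark.driver.userClassPathFirst / spark.executor.userClassPathFirst), which are supposed to make the user's jars win over the cluster's. A variant of the command above would add (sketch, untested on this cluster):

--conf spark.driver.userClassPathFirst=true \
--conf spark.executor.userClassPathFirst=true \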
When I do this, I can see that each worker gets a copy of my-checkRun-1.0.6-SNAPSHOT-jar-with-dependencies.jar under yarn/local/filecache:
myClusterUser@wn3-Test:/$ find . -name '*checkRun*.jar' 2>/dev/null
./mnt/resource/hadoop/yarn/local/filecache/10/my-checkRun-1.0.6-SNAPSHOT-jar-with-dependencies.jar
and that folder contains nothing other than this jar.
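To see how that filecache copy translates into the JVM's view, the effective classpath can be printed from inside the job (a sketch using only system properties; run it in main() for the driver container, or inside a task for an executor):

// Sketch: dump the entries of the JVM classpath that mention jna or jnr,
// to make the ordering of jna-3.5.1 vs. the --packages jars visible.
System.getProperty("java.class.path")
  .split(java.io.File.pathSeparator)
  .filter(p => p.contains("jna") || p.contains("jnr"))
  .foreach(println)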
I can also see that spark-submit fetches the jars I specified with the --packages option, stores them in the local Ivy repository (~/.ivy2, as shown below), and then puts them in a temporary WASB (HDFS) staging folder used during the run, together with a Spark conf archive.
In this archive I have __spark_conf__.properties:
#Spark configuration.
#Mon Jul 11 13:18:01 UTC 2016
spark.executor.memory=1500m
spark.yarn.submit.file.replication=3
spark.yarn.jar=local\:///usr/hdp/current/spark-client/lib/spark-assembly.jar
spark.yarn.executor.memoryOverhead=500
spark.yarn.driver.memoryOverhead=384
spark.history.kerberos.keytab=none
spark.submit.deployMode=cluster
spark.yarn.secondary.jars=net.java.dev.jna_jna-4.1.0.jar,com.github.jnr_jnr-constants-0.9.0.jar
spark.yarn.scheduler.heartbeat.interval-ms=5000
spark.yarn.preserve.staging.files=false
spark.eventLog.enabled=true
spark.executor.extraClassPath=./
spark.yarn.queue=default
spark.history.provider=org.apache.spark.deploy.history.FsHistoryProvider
spark.history.ui.port=18080
spark.yarn.historyServer.address=hn0-testr.su4ft5rezscepaqpicvo04xrkb.fx.internal.cloudapp.net\:18080
spark.master=yarn-cluster
spark.yarn.containerLauncherMaxThreads=25
spark.executor.cores=2
spark.yarn.max.executor.failures=3
spark.yarn.services=
spark.history.fs.logDirectory=wasb\:///hdp/spark-events
spark.sql.shuffle.partitions=4
spark.executor.extraJavaOptions=-XX\:PermSize\=512M -XX\:MaxPermSize\=512M
spark.executor.instances=2
spark.app.name=my.process.check.run.checkRun
spark.driver.maxResultSize=2g
spark.history.kerberos.principal=none
spark.driver.extraJavaOptions=-XX\:PermSize\=512M -XX\:MaxPermSize\=512M
spark.eventLog.dir=wasb\:///hdp/spark-events
As you can see, my extra jars are listed in the spark.yarn.secondary.jars parameter.
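Since spark.executor.extraClassPath=./ points at the container working directory, those secondary jars should be localized there as links. To verify, the working directory could be listed from inside a task (sketch; sc is the SparkContext, and the output lands in the executor stdout logs):

// Sketch: on each executor, list the container working directory to check
// whether net.java.dev.jna_jna-4.1.0.jar and jnr-constants were localized.
sc.parallelize(1 to 2, 2).foreachPartition { _ =>
  new java.io.File(".").listFiles().foreach(f => println(f.getName))
}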
After the run, I can find additional JNA and JNR jars on the head node (nothing changed on the worker nodes):
myClusterUser@hn0-Test:/$ find . -name '*jnr*.jar' 2>/dev/null
./home/myClusterUser/.ivy2/cache/com.github.jnr/jnr-netdb/jars/jnr-netdb-1.1.4.jar
./home/myClusterUser/.ivy2/cache/com.github.jnr/jnr-posix/jars/jnr-posix-3.0.15.jar
./home/myClusterUser/.ivy2/cache/com.github.jnr/jnr-x86asm/jars/jnr-x86asm-1.0.2.jar
./home/myClusterUser/.ivy2/cache/com.github.jnr/jnr-enxio/jars/jnr-enxio-0.9.jar
./home/myClusterUser/.ivy2/cache/com.github.jnr/jnr-unixsocket/jars/jnr-unixsocket-0.8.jar
./home/myClusterUser/.ivy2/cache/com.github.jnr/jnr-constants/jars/jnr-constants-0.9.0.jar
./home/myClusterUser/.ivy2/jars/com.github.jnr_jffi-1.2.9.jar
./home/myClusterUser/.ivy2/jars/com.github.jnr_jnr-constants-0.9.0.jar
./home/myClusterUser/.ivy2/jars/com.github.jnr_jnr-enxio-0.9.jar
./home/myClusterUser/.ivy2/jars/com.github.jnr_jnr-x86asm-1.0.2.jar
./home/myClusterUser/.ivy2/jars/com.github.jnr_jnr-netdb-1.1.4.jar
./home/myClusterUser/.ivy2/jars/com.github.jnr_jnr-posix-3.0.15.jar
./home/myClusterUser/.ivy2/jars/com.github.jnr_jnr-unixsocket-0.8.jar
myClusterUser@hn0-Test:/$ find . -name '*jna*.jar' 2>/dev/null
./usr/lib/hdinsight-scpnet/scp/jvm/jna-3.5.1.jar
./usr/hdp/2.4.2.0-258/storm/extlib/jna-3.5.1.jar
./home/myClusterUser/.ivy2/cache/net.java.dev.jna/jna/jars/jna-4.1.0.jar
./home/myClusterUser/.ivy2/jars/net.java.dev.jna_jna-4.1.0.jar
My jar-with-dependencies excludes all of the jars listed in the --packages option, to be sure there is no conflict there. Does anyone have an idea of what I should do so that the run uses the JNA and JNR jars I supply?