I am getting a SparkContext error when starting a SparkSession on EMR 5.3.1 from Scala. The Spark version I am using and the error are below. This works fine on a Windows machine but fails on EMR. Also, is this the correct way to create the SparkSession?
Spark version: 2.1.0
val spark = SparkSession
  .builder
  .master("local[*]")
  .appName("vierweship_test")
  .config("spark.sql.warehouse.dir", "target/spark-warehouse")
  //.enableHiveSupport()
  .getOrCreate()
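For what it's worth, since the job is submitted with `--master yarn` (see the command at the end), hard-coding `.master("local[*]")` in the code conflicts with that; a common pattern is to omit the master in code and let spark-submit supply it. A minimal sketch, where the warehouse path is a placeholder assumption (only the app name comes from the original):

```scala
import org.apache.spark.sql.SparkSession

// Sketch: omit .master(...) so that spark-submit's --master yarn takes effect.
// The warehouse URI below is a placeholder; on EMR a fully qualified URI
// (s3://... or hdfs://host:port/...) avoids the ambiguity of a host-less
// hdfs:/// URI, which is what the "Incomplete HDFS URI, no host" message
// complains about.
val spark = SparkSession
  .builder
  .appName("vierweship_test")
  .config("spark.sql.warehouse.dir", "s3://my-bucket/spark-warehouse") // placeholder
  .getOrCreate()
```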
Error:
17/08/01 13:08:14 ERROR SparkContext: Error initializing SparkContext.
java.io.IOException: Incomplete HDFS URI, no host: hdfs:///var/log/spark/apps
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:143)
If I do not set the warehouse directory property, I get the following error instead.
17/08/01 13:00:02 ERROR SparkContext: Error initializing SparkContext.
java.lang.NullPointerException
at java.io.File.<init>(File.java:277)
at org.apache.spark.deploy.yarn.Client.addDistributedUri$1(Client.scala:438)
at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:476)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$11$$anonfun$apply$8.apply(Client.scala:600)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$11$$anonfun$apply$8.apply(Client.scala:599)
at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
The command I am using:
spark-submit --verbose --class xxx --master yarn --jars="s3-dist-cp.jar:common-0.1.jar" --deploy-mode client --packages "Xxx:xxx:XXx" myjar-0.1.jar
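As an aside on the command itself: per the spark-submit documentation, `--jars` takes a comma-separated list, not a colon-separated one, so `"s3-dist-cp.jar:common-0.1.jar"` is parsed as a single (nonexistent) jar path, which may be related to the `NullPointerException` in `Client.addDistributedUri` above. A hedged rewrite of the same command (jar and package names copied verbatim from the original):

```shell
# --jars must be comma-separated; everything else is unchanged.
spark-submit --verbose \
  --class xxx \
  --master yarn \
  --deploy-mode client \
  --jars "s3-dist-cp.jar,common-0.1.jar" \
  --packages "Xxx:xxx:XXx" \
  myjar-0.1.jar
```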