Driver does not stop in cluster mode

Time: 2018-04-27 21:43:13

Tags: apache-spark hadoop bigdata yarn

I have set up my cluster (1 master / 9 slaves). My problem is that when I submit an application (a word count, via spark-submit with --deploy-mode cluster), the driver never stops, even though the data is very small.

I submit the application like this:

./spark-submit \
--class wordCount \
--master spark://master:6066 --deploy-mode cluster --supervise \
--executor-cores 1 --total-executor-cores 3 --executor-memory 1g \
hdfs://master:9000/user/exemple/word3.jar \
hdfs://master:9000/user/exemple/texte.txt \
hdfs://master:9000/user/exemple/result 2
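
For reference, everything after the jar path is passed as application arguments to the driver's main method. A minimal sketch of how those arguments should arrive (the object name ArgsCheck is hypothetical; the paths are the ones from the command above):

object ArgsCheck {
  def main(args: Array[String]): Unit = {
    // With the submit line above, the driver should receive:
    //   args(0) = hdfs://master:9000/user/exemple/texte.txt  (input path)
    //   args(1) = hdfs://master:9000/user/exemple/result     (output path)
    //   args(2) = "2"                                        (threshold?)
    args.zipWithIndex.foreach { case (a, i) => println(s"args($i) = $a") }
  }
}

Note that the program below reads its threshold from args(1), so it is worth double-checking which argument ends up in which slot.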

Here is my program:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SparkWordCount {
  def main(args: Array[String]) {
  // create Spark context with Spark configuration
  val sc = new SparkContext(new SparkConf().setAppName("Spark Count"))

  // get threshold
  val threshold = args(1).toInt

  // read in text file and split each document into words
  val tokenized = sc.textFile(args(0)).flatMap(_.split(" "))

  // count the occurrence of each word
  val wordCounts = tokenized.map((_, 1)).reduceByKey(_ + _)

  // filter out words with fewer than threshold occurrences
  val filtered = wordCounts.filter(_._2 >= threshold)

  // count characters
  val charCounts = filtered.flatMap(_._1.toCharArray).map((_, 1)).reduceByKey(_ + _)

  System.out.println(charCounts.collect().mkString(", "))
  }
}
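
As a side note, in cluster mode the application is only marked FINISHED once the SparkContext has been shut down and the driver JVM exits. A minimal sketch of the same word count with an explicit sc.stop() (the object name SparkWordCountWithStop is hypothetical, and stopping the context is an assumption about the hang, not a confirmed fix):

import org.apache.spark.{SparkConf, SparkContext}

object SparkWordCountWithStop {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("Spark Count"))
    try {
      val threshold = args(1).toInt // assumes args(1) really is numeric

      // same pipeline as the original program
      val filtered = sc.textFile(args(0))
        .flatMap(_.split(" "))
        .map((_, 1))
        .reduceByKey(_ + _)
        .filter(_._2 >= threshold)

      // count characters of the surviving words
      val charCounts = filtered
        .flatMap(_._1.toCharArray)
        .map((_, 1))
        .reduceByKey(_ + _)

      println(charCounts.collect().mkString(", "))
    } finally {
      sc.stop() // release the context so the driver JVM can exit
    }
  }
}

The try/finally ensures the context is stopped even if an argument fails to parse or a stage throws, which matters under --supervise since a crashed driver would otherwise be restarted.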

Result: Application Status

0 Answers:

No answers yet.