Question

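I have configured my cluster (1 master / 9 slaves). My problem is that when I submit an application (a word count, via spark-submit with --deploy-mode cluster), the driver never stops, even when the input data is very small.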

I submit the application like this:

./spark-submit \
--class wordCount \
--master spark://master:6066 --deploy-mode cluster --supervise \
--executor-cores 1 --total-executor-cores 3 --executor-memory 1g \
hdfs://master:9000/user/exemple/word3.jar \
hdfs://master:9000/user/exemple/texte.txt \
hdfs://master:9000/user/exemple/result 2

Here is my program:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SparkWordCount {
  def main(args: Array[String]) {
    // create Spark context with Spark configuration
    val sc = new SparkContext(new SparkConf().setAppName("Spark Count"))

    // get threshold
    val threshold = args(1).toInt

    // read in text file and split each document into words
    val tokenized = sc.textFile(args(0)).flatMap(_.split(" "))

    // count the occurrence of each word
    val wordCounts = tokenized.map((_, 1)).reduceByKey(_ + _)

    // filter out words with fewer than threshold occurrences
    val filtered = wordCounts.filter(_._2 >= threshold)

    // count characters
    val charCounts = filtered.flatMap(_._1.toCharArray).map((_, 1)).reduceByKey(_ + _)

    System.out.println(charCounts.collect().mkString(", "))
  }
}

Result: Application Status

The driver does not stop in cluster mode.
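
One thing that may be worth checking (a sketch, not a confirmed diagnosis): the program reads the threshold from args(1), while the command above passes the result path in that position and 2 as the third argument. If args(1).toInt throws, the driver exits with an error, and in standalone cluster mode --supervise restarts a driver that exits with a non-zero code, which could look like a driver that never stops. The plain-Scala snippet below needs no Spark; parseArgs and ArgCheck are hypothetical names introduced for illustration, and the argument layout (input path, then threshold) is an assumption taken from how the program indexes args.

```scala
import scala.util.Try

object ArgCheck {
  // Hypothetical helper (not part of the original program): validates the
  // arguments before any Spark work starts.
  // Assumed layout: <input path> <threshold>; further arguments are ignored.
  def parseArgs(args: Array[String]): Either[String, (String, Int)] =
    if (args.length < 2)
      Left(s"expected <input> <threshold>, got ${args.length} argument(s)")
    else
      Try(args(1).toInt).toOption match {
        case Some(threshold) => Right((args(0), threshold))
        case None            => Left(s"threshold must be an integer, got '${args(1)}'")
      }

  def main(cliArgs: Array[String]): Unit = {
    // The three trailing arguments from the spark-submit command above.
    val submitted = Array(
      "hdfs://master:9000/user/exemple/texte.txt",
      "hdfs://master:9000/user/exemple/result",
      "2"
    )
    // As submitted: the second argument is a path, so parsing is rejected.
    println(parseArgs(submitted))
    // Same input path with the threshold in second position: parsing succeeds.
    println(parseArgs(Array(submitted(0), submitted(2))))
  }
}
```

If this is the cause, the driver's stderr log on the worker that ran it should show the exception; submitting once without --supervise would also let the driver fail visibly instead of being relaunched.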

0 answers: