Launching Apache Spark SQL jobs from a multi-threaded driver

Time: 2017-12-16 02:37:03

Tags: java multithreading scala apache-spark apache-spark-2.0

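I want to use Spark to extract data from about 1,500 remote Oracle tables. I would like a multi-threaded application in which each thread picks up one table (or perhaps 10 tables per thread) and launches a Spark job to read its respective tables.

From the official Spark site, https://spark.apache.org/docs/latest/job-scheduling.html, it is clear that this can work...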

  

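... The cluster managers that Spark runs on provide facilities for scheduling across applications. Second, within each Spark application, multiple "jobs" (Spark actions) may be running concurrently if they were submitted by different threads. This is common if your application is serving requests over the network. Spark includes a fair scheduler to schedule resources within each SparkContext.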

However, you may have noticed that this similar SO question, Concurrent job Execution in Spark, has no accepted answer, and the most upvoted answer starts with

  

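This is not in the spirit of Spark

  1. Everyone knows it is not in the "spirit" of Spark
  2. Who cares what the spirit of Spark is? That does not actually mean anything
  3. Has anyone gotten something like this to work before? Did you have to do anything special? I just want some pointers before I waste many work hours prototyping. I would really appreciate any help!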

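3 answers:

Answer 0: (score: 4)

The Spark context is thread-safe, so it can be called from multiple threads in parallel. (I do this in production.)

One thing to watch is the number of threads you run. Limit it, because:
1. Executor memory is shared between all threads, so you might get OOM errors or constantly swap data in and out of the cache.
2. CPU is limited, so having more tasks than cores will not bring any improvement.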
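
To illustrate the point about limiting threads, here is a minimal sketch that uses only the Scala standard library. The names `BoundedSubmit` and `runBounded` are made up for this example; the `job` function stands in for whatever Spark action each driver thread would trigger. It is one possible way to cap concurrency, not the setup used in production above.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object BoundedSubmit {
  /** Runs `job` for every item on a fixed-size pool, so that at most
    * `parallelism` jobs are in flight at once. Results keep the input order. */
  def runBounded[A, B](items: Seq[A], parallelism: Int)(job: A => B): Seq[B] = {
    val pool = Executors.newFixedThreadPool(parallelism)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      // Each Future is queued on the pool; only `parallelism` of them run concurrently.
      val futures = items.map(item => Future(job(item)))
      Await.result(Future.sequence(futures), Duration.Inf)
    } finally {
      pool.shutdown()
    }
  }
}
```

For the 1,500-table case, `items` could be the table names grouped into batches (for example `tables.grouped(10).toSeq`), and `job` could read each table in the batch through a shared `SparkSession` and run an action on it. A `parallelism` close to what the cluster's cores and executor memory can absorb is probably a sensible starting point.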

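Answer 1: (score: 2)

You do not need to submit your jobs from one multi-threaded application (although I see no reason why you could not). Just submit your jobs as separate processes: have a script that submits all of these jobs one at a time and pushes each process to the background, or submit them in YARN cluster mode. Your scheduler (YARN, Mesos, or a Spark standalone cluster) will simply make some of your jobs wait, because, depending on memory and/or CPU availability, it cannot run all of them at the same time.

Note that I only see a benefit in your approach if you actually process each table with multiple partitions, and not with just one, as I have seen many times. Also, because you need to process so many tables, I am not sure how much you would benefit, if at all. Depending on what you do with the table data, it might be simpler to just run multiple single-threaded, non-Spark jobs.

See also @cowbert's note.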
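
As a rough sketch of the separate-process approach, the snippet below builds one `spark-submit` command per table and starts it in the background with `scala.sys.process`. The main class `com.example.TableExtract` and the jar `table-extract.jar` are placeholders invented for this example, and the per-table application is assumed to take the table name as its only argument.

```scala
import scala.sys.process.Process

object SubmitPerTable {
  // Hypothetical names: replace with your own application's main class and jar.
  val mainClass = "com.example.TableExtract"
  val appJar    = "table-extract.jar"

  /** Builds a spark-submit command line for one table, using YARN cluster mode. */
  def submitCommand(table: String): Seq[String] = Seq(
    "spark-submit",
    "--master", "yarn",
    "--deploy-mode", "cluster",
    "--class", mainClass,
    appJar,
    table)

  /** Starts one spark-submit process per table without waiting for any of them. */
  def submitAll(tables: Seq[String]): Seq[Process] =
    tables.map(table => Process(submitCommand(table)).run())
}
```

Each `spark-submit` starts its own JVM on the submitting machine, so with around 1,500 tables you would likely want to submit in smaller waves rather than all at once.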
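
To make the multiple-partitions point concrete: Spark's JDBC source reads a table as a single partition unless it is given a partition column and bounds. The sketch below only builds the option map; the helper name `partitionedOptions` and the connection details in the usage comment are assumptions for illustration, while the option keys are the ones the JDBC data source documents.

```scala
object JdbcPartitioning {
  /** Options for a JDBC read that Spark splits into `numPartitions`
    * range queries over `column`, using `lower` and `upper` to size the ranges. */
  def partitionedOptions(url: String, table: String, column: String,
                         lower: Long, upper: Long, numPartitions: Int): Map[String, String] =
    Map(
      "url"             -> url,
      "dbtable"         -> table,
      "partitionColumn" -> column,
      "lowerBound"      -> lower.toString,
      "upperBound"      -> upper.toString,
      "numPartitions"   -> numPartitions.toString)
}

// Usage with an existing SparkSession named `spark` (all values are placeholders):
//   val opts = JdbcPartitioning.partitionedOptions(
//     "jdbc:oracle:thin:@//db-host:1521/SERVICE", "SCHEMA.EMP", "EMP_ID", 1L, 1000000L, 8)
//   val df = spark.read.format("jdbc").options(opts).load()
```

The bounds only decide how the ranges are split; they do not filter rows, so values outside them still land in the first or last partition. Credentials would go in additional `user` and `password` options.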

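Answer 2: (score: 1)

I agree with @lev. I had been wondering about this for a long time, so I wrote a small piece of code to make sure it works. Note: to control the number of workers used by each driver thread, you need to limit the DataFrame/Dataset with coalesce.

Here is the sample code: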

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object SparkMultiThreadExample extends App {

  val TOTAL_WORKERS = 10
  val NUMBER_OF_WORKERS_PER_DRIVER = 2

  // Local master with TOTAL_WORKERS task slots, shared by all driver threads
  val sparkConf = new SparkConf()
  sparkConf.setMaster(s"local[${TOTAL_WORKERS}]")
  val spark = SparkSession.builder().config(sparkConf).getOrCreate()

  val list1 = (0 until 10).toList
  import spark.implicits._

  // .par runs the outer loop on several driver threads; each one submits its own Spark job
  list1.par.foreach(t => {
    // coalesce caps each job at NUMBER_OF_WORKERS_PER_DRIVER partitions, i.e. concurrent tasks
    spark.createDataset(list1).coalesce(NUMBER_OF_WORKERS_PER_DRIVER).foreach(i => {
      println(s"${Thread.currentThread()}, Driver thread ${t}: This is inside worker ${i} ")
      Thread.sleep(1000)
      println(s"FINISH ${Thread.currentThread()} Driver thread ${t}: This is inside worker ${i} ")
    })
  })
}

Output:

Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 0 
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 0 
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 5 
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 5 
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 5 
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 0 
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 0 
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 5 
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 5 
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 0 
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 0 
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 5 
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 0 
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 5 
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 5 
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 6 
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 1 
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 6 
Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 1 
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 6 
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 0 
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 5 
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 1 
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 5 
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 0 
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 6 
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 6 
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 0 
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 1 
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 1 
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 6 
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 1 
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 6 
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 2 
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 7 
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 1 
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 7 
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 6 
Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 2 
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 7 
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 1 
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 2 
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 6 
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 7 
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 6 
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 7 
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 1 
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 2 
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 1 
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 2 
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 2 
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 7 
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 2 
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 7 
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 7 
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 8 
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 3 
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 8 
Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 3 
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 8 
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 2 
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 3 
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 7 
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 7 
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 8 
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 8 
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 2 
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 2 
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 3 
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 3 
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 8 
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 3 
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 3 
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 8 
Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 4 
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 9 
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 4 
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 8 
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 3 
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 8 
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 8 
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 3 
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 3 
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 9 
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 4 
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 4 
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 5 
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 0 
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 0 
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 5 
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 0 
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 5 
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 0 
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 5 
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 5 
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 0 
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 5 
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 6 
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 0 
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 1 
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 0 
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 1 
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 5 
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 6 
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 0 
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 1 
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 5 
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 6 
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 0 
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 1 
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 5 
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 6 
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 5 
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 6 
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 0 
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 1 
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 6 
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 7 
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 1 
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 2 
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 1 
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 2 
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 6 
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 7 
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 1 
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 2 
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 6 
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 7 
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 1 
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 2 
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 6 
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 7 
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 6 
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 7 
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 1 
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 2 
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 7 
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 8 
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 2 
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 3 
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 2 
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 3 
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 7 
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 8 
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 2 
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 3 
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 7 
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 8 
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 2 
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 3 
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 7 
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 8 
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 7 
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 8 
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 2 
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 3 
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 8 
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 3 
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 3 
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 8 
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 3 
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 8 
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 3 
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 8 
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 8 
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 3 
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 4 
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 9 
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 4