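Passing parameters to a jar when using Spark launcher

Time: 2018-02-06 11:29:55

Tags: apache-spark jar spark-launcher

I am trying to create an executable jar that uses Spark launcher to run another jar containing a data transformation job (that second jar creates the Spark session).

I need to pass Java parameters (some Java arrays) to the jar that the launcher executes.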

import org.apache.spark.launcher.SparkLauncher

object launcher {
  // How do I pass parameters to spark_job_with_spark_session.jar?
  @throws[Exception]
  def main(args: Array[String]): Unit = {
    val handle = new SparkLauncher()
      .setAppResource("spark_job_with_spark_session.jar")
      .setVerbose(true)
      .setMaster("local[*]")
      .setConf(SparkLauncher.DRIVER_MEMORY, "4g")
      .launch()
  }
}

How can I do this?

2 answers:

Answer 0: (score: 2)

  

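"need to pass Java parameters (some Java arrays)"

Launching through SparkLauncher is equivalent to running spark-submit, so you cannot pass Java objects directly. Use application arguments:

addAppArgs(String... args)

to pass the values as strings, and parse them in your application.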
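Since only strings cross the process boundary, an array has to be serialized on the launcher side and rebuilt inside the launched application. The following is a minimal sketch of one way to do that for an int array, assuming a simple comma-separated encoding. The class name ArrayArgsCodec and its encode/decode helpers are hypothetical and not part of the Spark API; only the resulting string would be handed to addAppArgs.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Spark: converts int arrays to and from
// a single comma-separated string argument.
public class ArrayArgsCodec {

    // Launcher side: turn an int array into one string argument, e.g. "1,2,3".
    public static String encode(int[] values) {
        return Arrays.stream(values)
                .mapToObj(Integer::toString)
                .collect(Collectors.joining(","));
    }

    // Application side: turn the string argument back into an int array.
    public static int[] decode(String arg) {
        if (arg.isEmpty()) {
            return new int[0];
        }
        return Arrays.stream(arg.split(","))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    public static void main(String[] args) {
        int[] ids = {1, 2, 3};
        String encoded = encode(ids);
        // On the launcher side, this string would be passed via addAppArgs(encoded).
        System.out.println(encoded);
        // In the launched application, args[0] would hold the same string.
        System.out.println(Arrays.toString(decode(encoded)));
    }
}
```

For nested or non-numeric data, a structured format such as JSON, or a path to a file holding the data, may be a better fit than a hand-rolled delimiter.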

Answer 1: (score: 0)

package com.vng.zing.zudm_ml_feature_store_spark_launcher.app;

import com.vng.zing.zudm_ml_feature_store_spark_launcher.common.TaskListener;
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

/**
 *
 * @author hahattpro
 */
public class ExampleSparkLauncherApp {

    public static void main(String[] args) throws Exception {
        SparkAppHandle handle = new SparkLauncher()
                .setAppResource("/home/cpu11453/workplace/experiment/SparkPlayground/target/scala-2.11/SparkPlayground-assembly-0.1.jar")
                .setMainClass("me.thaithien.playground.ConvertToCsv")
                .setMaster("spark://cpu11453:7077")
                .setConf(SparkLauncher.DRIVER_MEMORY, "3G")
                .addAppArgs("--input" , "/data/download_hdfs/data1/2019_08_13/00/", "--output", "/data/download_hdfs/data1/2019_08_13/00_csv_output/")
                .startApplication(new TaskListener());

        // Log state and info updates reported by the launched application.
        handle.addListener(new SparkAppHandle.Listener() {
            @Override
            public void stateChanged(SparkAppHandle handle) {
                System.out.println(handle.getState() + " new state");
            }

            @Override
            public void infoChanged(SparkAppHandle handle) {
                System.out.println(handle.getAppId() + " app id");
            }
        });

        System.out.println(handle.getState().toString());

        // Poll until the job reaches a final state (finished, failed, killed, or lost).
        while (!handle.getState().isFinal()) {
            Thread.sleep(1000L);
        }
    }
}

This is sample code that works.
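The launched class (me.thaithien.playground.ConvertToCsv in the example) is not shown, so how it reads its arguments is unknown. As a rough sketch of the receiving side, assuming the application accepts "--name value" pairs like the ones passed to addAppArgs above, the main method could collect them into a map. The class FlagArgs and its parse method are hypothetical names used for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical receiving side: parses "--name value" pairs from args.
public class FlagArgs {

    // Returns a map from flag name (without the leading dashes) to its value.
    public static Map<String, String> parse(String[] args) {
        Map<String, String> options = new HashMap<>();
        for (int i = 0; i < args.length; i++) {
            if (!args[i].startsWith("--")) {
                throw new IllegalArgumentException("Unexpected argument: " + args[i]);
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + args[i]);
            }
            options.put(args[i].substring(2), args[i + 1]);
            i++; // skip the value that was just consumed
        }
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> options = parse(args);
        System.out.println("input = " + options.get("input"));
        System.out.println("output = " + options.get("output"));
    }
}
```

In a real job, the parsed values would then feed the code that builds the Spark session and reads the input path.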