I am using the Spark JobServer Java client from this GitHub project:
https://github.com/bluebreezecf/SparkJobServerClient
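I am able to upload the Jar containing the job I want to execute to Spark JobServer. The logs show that it is stored under the /tmp/spark-jobserver directory structure. However, when I try to run the job from the context I created, the Job class is not found, so the Jar does not appear to be loaded for my job request.
EDIT: I later found that the jar uploaded by the Java client is corrupted. That is why Spark JobServer cannot use it. When I manually replace it with a good Jar, JobServer runs fine. So the real problem lies in the client's uploadSparkJobJar() API.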
org.khaleesi.carfield.tools.sparkjobserver.api.SparkJobServerClientException: Spark Job Server http://sparkjobserverhost:8090/ response 404 { "status": "ERROR", "result": "classPath org.kritek.scalability.jobs.Olap1 not found" }
Here is my code:
// POST /contexts/<name> -- Create context with parameters
Map<String, String> params = new HashMap<String, String>();
params.put(ISparkJobServerClientConstants.PARAM_MEM_PER_NODE, "512m");
params.put(ISparkJobServerClientConstants.PARAM_NUM_CPU_CORES, "10");
params.put("dependent-jar-uris", "file:///tmp/spark-jobserver/filedao/data/olap1_job-2016-08-11T04_47_07.802Z.jar");
boolean success = client.createContext(contextName, params);
assertTrue(success);
//dependent-jar-uris=file:///some/path/of/my-foo-lib.jar
// POST /jobs -- Create a new job
params.put(ISparkJobServerClientConstants.PARAM_APP_NAME, appName);
params.put(ISparkJobServerClientConstants.PARAM_CLASS_PATH, "org.kritek.scalability.jobs.Olap1");
SparkJobResult result = null;
String jobId = null;
params.put(ISparkJobServerClientConstants.PARAM_CONTEXT, contextName);
params.put(ISparkJobServerClientConstants.PARAM_SYNC, "true");
result = client.startJob("conf-1=1", params);
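Given the edit above, one way to rule out the client's uploadSparkJobJar() is to send the jar bytes to the server without going through it. The following is only a sketch: it assumes the server accepts the jar as the raw request body of POST /jars/<appName> (the endpoint the Spark JobServer README demonstrates with curl --data-binary), and the class name RawJarUpload, the method uploadJar, and the Content-Type value are illustrative choices, not part of the client library.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class RawJarUpload {

    /**
     * Streams the jar file unchanged as the body of POST {baseUrl}/jars/{appName}
     * and returns the HTTP status code. appName is assumed to be URL-safe.
     */
    static int uploadJar(String baseUrl, String appName, Path jar) throws IOException {
        URL url = URI.create(baseUrl + "/jars/" + appName).toURL();
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            // Assumption: the server does not require a specific content type here.
            conn.setRequestProperty("Content-Type", "application/java-archive");
            // Declare the exact length so the bytes are streamed rather than buffered.
            conn.setFixedLengthStreamingMode(Files.size(jar));
            try (OutputStream out = conn.getOutputStream()) {
                Files.copy(jar, out); // binary copy, no character encoding involved
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println("usage: RawJarUpload <baseUrl> <appName> <jarPath>");
            return;
        }
        int status = uploadJar(args[0], args[1], Paths.get(args[2]));
        System.out.println("HTTP status: " + status);
    }
}
```

If the file stored by the server after this upload matches the local jar byte for byte while the one stored by the client does not, the corruption is happening inside the client rather than in JobServer.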
Answer 0 (score: 0)
See the answers to the question "spark submit java.lang.ClassNotFoundException".
Note: if you are using a Maven project, you can run mvn package assembly:single to include your dependencies in the jar.
spark-submit --class Test --master yarn --deploy-mode cluster --supervise --verbose jarName.jar hdfs:///somePath/Test.txt hdfs:///somePath/Test.out
Also check the class's full package path in your project and pass the fully qualified class name, for example:
--class com.myclass.Test
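Because a "class not found" response can come from either a wrong class name or an unreadable jar (the question's edit points to the latter), it may help to check the jar locally before blaming the class path. The sketch below uses only java.util.jar from the JDK; the class name JarCheck and the method containsClass are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

public class JarCheck {

    /**
     * Returns true only if the file opens as a jar and contains an entry
     * for the given fully qualified class name.
     */
    static boolean containsClass(Path jar, String className) {
        // org.example.Foo is stored in a jar as org/example/Foo.class
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry(entryName) != null;
        } catch (IOException e) {
            // A truncated or otherwise corrupted archive usually fails to open
            // (ZipException is a subclass of IOException).
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: JarCheck <jarPath> <fullyQualifiedClassName>");
            return;
        }
        boolean ok = containsClass(Paths.get(args[0]), args[1]);
        System.out.println(ok ? "class found in jar" : "jar unreadable or class missing");
    }
}
```

Running it against the copy stored under /tmp/spark-jobserver and against the locally built jar, with org.kritek.scalability.jobs.Olap1 as the class name, would show whether the two files differ in this respect.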