Spring Data Flow Yarn - Unable to access jarfile

Time: 2016-11-07 14:14:08

Tags: spring-batch yarn spring-cloud spring-cloud-dataflow spring-cloud-task

I'm trying to run a simple Spring Batch task on Spring Cloud Data Flow for YARN. Unfortunately, when I launch it I get the following error message in the ResourceManager UI:

Application application_1473838120587_5156 failed 1 times due to AM    Container for appattempt_1473838120587_5156_000001 exited with exitCode:  1
For more detailed output, check application tracking page:http://ip-10-249-9-50.gc.stepstone.com:8088/cluster/app/application_1473838120587_5156Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1473838120587_5156_01_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.

Appmaster.stderr provides a bit more detail:

Log Type: Appmaster.stderr
Log Upload Time: Mon Nov 07 12:59:57 +0000 2016
Log Length: 106
Error: Unable to access jarfile spring-cloud-deployer-yarn-tasklauncherappmaster-1.0.0.BUILD-SNAPSHOT.jar

As far as Spring Cloud Data Flow is concerned, this is what I'm running in the dataflow-shell:

app register --type task --name simple_batch_job --uri https://github.com/spring-cloud/spring-cloud-dataflow-samples/raw/master/tasks/simple-batch-job/batch-job-1.0.0.BUILD-SNAPSHOT.jar
task create foo --definition "simple_batch_job"
task launch foo

It's hard to tell why this error occurs. I'm fairly sure the connection from the dataflow-server to YARN works, because some files (servers.yml, the jars with the job and the utilities) do get copied into the standard HDFS localization directory (/dataflow), yet the jar is somehow not accessible.
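
To see exactly which artifacts the server has localized (and under which names), a recursive listing of the base directory helps; this is a minimal check, assuming the default /dataflow base directory used here (the listing in the edit below is this kind of output):

# List every file the dataflow server copied to HDFS, with names and sizes
hdfs dfs -ls -R /dataflow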

My servers.yml configuration:

logging:
  level:
    org.apache.hadoop: DEBUG
    org.springframework.yarn: DEBUG
maven:
  remoteRepositories:
    springRepo:
      url: https://repo.spring.io/libs-snapshot
spring:
  main:
    show_banner: false
  hadoop:
    fsUri: hdfs://HOST:8020
    resourceManagerHost: HOST
    resourceManagerPort: 8032
    resourceManagerSchedulerAddress: HOST:8030
  datasource:
    url: jdbc:h2:tcp://localhost:19092/mem:dataflow
    username: sa
    password:
    driverClassName: org.h2.Driver

I'd be glad to hear any information or Spring YARN tips and tricks to make this work.

PS: As the Hadoop environment I'm using Amazon EMR 5.0.

Edit: recursive listing of the path from HDFS:

drwxrwxrwx   - user hadoop          0 2016-11-07 15:02 /dataflow/apps
drwxrwxrwx   - user hadoop          0 2016-11-07 15:02 /dataflow/apps/stream
drwxrwxrwx   - user hadoop          0 2016-11-07 15:04 /dataflow/apps/stream/app
-rwxrwxrwx   3 user hadoop        121 2016-11-07 15:05 /dataflow/apps/stream/app/application.properties
-rwxrwxrwx   3 user hadoop       1177 2016-11-07 15:04 /dataflow/apps/stream/app/servers.yml
-rwxrwxrwx   3 user hadoop   60202852 2016-11-07 15:04 /dataflow/apps/stream/app/spring-cloud-deployer-yarn-appdeployerappmaster-1.0.0.RELEASE.jar
drwxrwxrwx   - user hadoop          0 2016-11-04 14:22 /dataflow/apps/task
drwxrwxrwx   - user hadoop          0 2016-11-04 14:24 /dataflow/apps/task/app
-rwxrwxrwx   3 user hadoop        121 2016-11-04 14:25 /dataflow/apps/task/app/application.properties
-rwxrwxrwx   3 user hadoop       2101 2016-11-04 14:24 /dataflow/apps/task/app/servers.yml
-rwxrwxrwx   3 user hadoop   60198804 2016-11-04 14:24 /dataflow/apps/task/app/spring-cloud-deployer-yarn-tasklauncherappmaster-1.0.0.RELEASE.jar
drwxrwxrwx   - user hadoop          0 2016-11-04 14:25 /dataflow/artifacts
drwxrwxrwx   - user hadoop          0 2016-11-07 15:06 /dataflow/artifacts/cache
-rwxrwxrwx   3 user hadoop   12323493 2016-11-04 14:25 /dataflow/artifacts/cache/https-c84ea9dc0103a4754aeb9a28bbc7a4f33c835854-batch-job-1.0.0.BUILD-SNAPSHOT.jar
-rwxrwxrwx   3 user hadoop   22139318 2016-11-07 15:07 /dataflow/artifacts/cache/log-sink-rabbit-1.0.0.BUILD-SNAPSHOT.jar
-rwxrwxrwx   3 user hadoop   12590921 2016-11-07 12:59 /dataflow/artifacts/cache/timestamp-task-1.0.0.BUILD-SNAPSHOT.jar

1 Answer:

Answer 0 (score: 0):

Since HDFS has spring-cloud-deployer-yarn-tasklauncherappmaster-1.0.0.RELEASE.jar while the error complains about spring-cloud-deployer-yarn-tasklauncherappmaster-1.0.0.BUILD-SNAPSHOT.jar, there is clearly a mix of wrong versions in play.

Not sure how you ended up with snapshots unless you built the distribution manually?

I'd recommend picking 1.0.2 from http://cloud.spring.io/spring-cloud-dataflow-server-yarn. See "Downloading and Extracting the Distribution" in the reference documentation. Also delete the old /dataflow directory from HDFS.
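
A minimal sketch of that cleanup step, assuming the /dataflow base directory shown above; on the next deployment the server should re-upload app master jars matching its own version:

# Remove the stale localized apps and artifact cache so matching versions get re-uploaded
hdfs dfs -rm -r /dataflow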