org.apache.flink.client.program.ProgramInvocationException:无法检索执行结果

时间:2018-12-13 17:43:18

标签: apache-flink amazon-emr

我正在尝试在AWS EMR集群上启动Flink批处理作业,并得到:

The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result. (JobID: c7362754c49bdae8e9a46748d47bc180)
    at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:260)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:474)
    at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
    at com.stolencamerafinder.flink.BatchJob.main(BatchJob.java:69)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:426)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:804)
    at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:280)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:215)
    at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1044)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1120)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1120)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
    at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:379)
    at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
    at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
    at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:213)
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
    at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
    at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Exception is not retryable.
    at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
    at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
    at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
    at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:899)
    ... 12 more
Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Exception is not retryable.
    ... 10 more
Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.rest.util.RestClientException: [Job submission failed.]
    at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
    at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
    at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
    at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:953)
    at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
    ... 4 more
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Job submission failed.]
    at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:310)
    at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:294)
    at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
    ... 5 more

我不知道根本原因是什么。在哪里可以找到有关出错原因的更多详细信息?

类似地,我找不到通过Flink / EMR查找成功工作日志的方法。有想法吗?

3 个答案:

答案 0 :(得分:1)

要查找更多详细信息,您需要JobManager或ResourceManager的更多信息。如果您像许多开发人员一样使用YARN作为群集资源管理框架,则可以访问ApplciationMatser的日志,这些日志是Flink中JobManager的日志。

答案 1 :(得分:0)

我遇到了同样的问题,原因是我没有启动flink集群。

./bin/start-cluster.sh

答案 2 :(得分:0)

我在本地Mac电脑上以本地模式遇到了此问题。 它在maven项目中运行良好,但是在提交给flink本地集群时会在上面抛出异常。

在重新启动Macbook后已解决