使用zookeeper实现具有高可用性的Flink:作业管理器不会确认提交的作业

时间:2017-03-08 05:45:19

标签: log4j apache-zookeeper apache-flink flink-streaming

我正在尝试在高可用性Zookeeper模式下运行Flink群集。对于HA群集的功能测试,我有5个Job-managers和1个​​Task-manager。在启动zookeeper-quorum和flink cluster之后,我将作业提交给job-manager但是我收到以下错误

log4j:WARN No appenders could be found for logger(org.apache.kafka.clients.consumer.ConsumerConfig).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Submitting job with JobID: ac57484f600814326f28c941244a4c94. Waiting for job completion.
Connected to JobManager at Actor[akka.tcp://flink@192.168.140.53:6123/user/jobmanager#-2018623179]
Exception in thread "main" org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Communication with JobManager failed: Job submission to the JobManager timed out. You may increase 'akka.client.timeout' in case the JobManager needs more time to configure and confirm the job submission.
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:410)
at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:95)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:383)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:375)
at org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.executeRemotely(RemoteStreamEnvironment.java:209)
at org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.execute(RemoteStreamEnvironment.java:173)
at MainAlert.main(MainAlert.java:136)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Communication with JobManager failed: Job submission to the JobManager timed out. You may increase 'akka.client.timeout' in case the JobManager needs more time to configure and confirm the job submission.
at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:137)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:406)
... 6 more
Caused by: org.apache.flink.runtime.client.JobClientActorSubmissionTimeoutException: Job submission to the JobManager timed out. You may increase 'akka.client.timeout' in case the JobManager needs more time to configure and confirm the job submission.
at org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:264)
at org.apache.flink.runtime.akka.FlinkUntypedActor.handleLeaderSessionID(FlinkUntypedActor.java:88)
at org.apache.flink.runtime.akka.FlinkUntypedActor.onReceive(FlinkUntypedActor.java:68)
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

我是否必须明确设置log4j.properties,还是导致此问题的其他原因? (6123是我的jobmanager.rpc.port,还有recovery.jobmanager.port)

0 个答案:

没有答案