I am using Flink v1.4.0.
I am trying to get a job that uses the DataSet API to work in IntelliJ. Note that the same job runs fine if I submit it through the Flink UI. To run the job, I first have to specify, via an environment variable, the amount of data to process. When that amount is relatively small, the job runs fine, but as it grows larger I start getting the following error:
ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...
31107 [main] ERROR com.company.someLib.SomeClass - Error executing pipeline
org.apache.flink.runtime.client.JobExecutionException: Couldn't retrieve the JobExecutionResult from the JobManager.
at org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:300)
at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:387)
at org.apache.flink.runtime.minicluster.FlinkMiniCluster.submitJobAndWait(FlinkMiniCluster.scala:565)
at org.apache.flink.runtime.minicluster.FlinkMiniCluster.submitJobAndWait(FlinkMiniCluster.scala:539)
at org.apache.flink.client.LocalExecutor.executePlan(LocalExecutor.java:193)
at org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:91)
at com.ubs.digital.comms.graph.emailanalyser.EmailAnalyserPipeline.lambda$runPipeline$1(EmailAnalyserPipeline.java:120)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at com.ubs.digital.comms.graph.emailanalyser.EmailAnalyserPipeline.runPipeline(EmailAnalyserPipeline.java:87)
at com.ubs.digital.comms.graph.emailanalyser.EmailAnalyserPipeline.main(EmailAnalyserPipeline.java:65)
Caused by: org.apache.flink.runtime.client.JobClientActorSubmissionTimeoutException: Job submission to the JobManager timed out. You may increase 'akka.client.timeout' in case the JobManager needs more time to configure and confirm the job submission.
I can see that the suggestion is:
You may increase 'akka.client.timeout' in case the JobManager needs more time to configure and confirm the job submission.
But I suspect the problem goes deeper than that. Still, to try it I first need to configure akka.client.timeout. How do I do that in IntelliJ, and how long should the timeout be?
Also, what is actually causing this? Do I need to increase the heap memory or something else? Thanks.
Answer 0 (score: 3)
I was able to figure it out, and it wasn't all that difficult. All I had to do was go to Run > Edit Configurations and, on the Configuration tab, add the following to the Program arguments field:
-Dakka.client.timeout:600s
-Dakka.ask.timeout:600s
However, I should note that this did not completely solve my problem.
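For completeness, here is a minimal sketch of setting the same timeouts programmatically when the job is started from IntelliJ. It assumes the local environment is created in code and that the string keys match the properties above; the class and placeholder pipeline are hypothetical, not taken from the question.

import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;

public class LocalTimeoutSketch {
    public static void main(String[] args) throws Exception {
        // Assumption: raising the timeouts via a custom Configuration passed to
        // the local environment has the same effect as the program arguments above.
        Configuration conf = new Configuration();
        conf.setString("akka.client.timeout", "600 s");
        conf.setString("akka.ask.timeout", "600 s");

        ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironment(conf);

        // Trivial placeholder job, just to show where the real pipeline would go.
        env.fromElements(1, 2, 3)
           .map(i -> i * 2)
           .print();
    }
}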
Answer 1 (score: -1)
You can set this property through the Flink configuration file. See https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/config.html#distributed-coordination-via-akka
So in flink-conf.yaml you would add, for example:
akka.client.timeout: 10min
That said, it looks like the data is being processed in the wrong place. Could you load the data in a constructor rather than inside a map or run function?
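To illustrate that suggestion, here is a hypothetical sketch; the class name, types, and lookup data are assumptions, not taken from the question. The idea is to build small reference data once, up front, and ship it with the function instead of re-reading it inside map() for every element.

import java.util.Map;
import org.apache.flink.api.common.functions.MapFunction;

// Hypothetical enrichment function: the lookup table is built once in the
// driver and passed into the constructor, so it is serialized with the
// function rather than loaded again inside map() for each record.
public class EnrichFn implements MapFunction<String, String> {

    private final Map<String, String> lookup;

    public EnrichFn(Map<String, String> lookup) {
        this.lookup = lookup;
    }

    @Override
    public String map(String key) {
        return lookup.getOrDefault(key, "unknown");
    }
}

Whether this removes the submission timeout depends on how much work the pipeline does while the plan is being built, which the question does not show.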