Apache Flink: What is the difference between setParallelism() and setMaxParallelism()?

Time: 2019-02-06 20:06:26

Tags: apache-flink flink-streaming

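I tried to set the maximum parallelism of a Flink job with the ExecutionConfig.setMaxParallelism() method, but it does not seem to have any effect.

I also modified the standard WordCount example to run a few tests, and setMaxParallelism() appears to have no effect on either a local environment or a standalone cluster.

How does setMaxParallelism() work?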

2 answers:

Answer 0 (score: 0)

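Flink provides two settings:

  • setParallelism(x) sets the parallelism of a job or an operator to x, i.e., the number of parallel tasks that run the operator.
  • setMaxParallelism(y) controls the maximum number of tasks across which keyed state can be distributed, i.e., the maximum effective parallelism of an operator. Keyed state is distributed in units called key groups, and there are y of them. Configuring more than y tasks gains nothing, because only y tasks can ever be assigned keyed state and be used for processing.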
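To make the relationship between the two settings concrete, here is a minimal sketch of the key-group idea in plain Java, with no Flink dependency. The class and method names are hypothetical. The key-to-group step is a simplification (it uses the plain hashCode, whereas Flink additionally scrambles the hash), and the group-to-task formula is modeled on how Flink is understood to assign contiguous key-group ranges to tasks, so treat it as an illustration rather than the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of how keys map to key groups and key groups map to parallel tasks. */
final class KeyGroupSketch {

    /** Key -> key group. Assumption: plain hashCode only; Flink also applies an extra hash step. */
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    /** Key group -> index of the parallel task that owns it. */
    static int taskIndexFor(int keyGroup, int parallelism, int maxParallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    /** Lists the key groups owned by one parallel task. */
    static List<Integer> keyGroupsOfTask(int taskIndex, int parallelism, int maxParallelism) {
        List<Integer> groups = new ArrayList<>();
        for (int g = 0; g < maxParallelism; g++) {
            if (taskIndexFor(g, parallelism, maxParallelism) == taskIndex) {
                groups.add(g);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        int maxParallelism = 4; // y: the number of key groups
        for (int parallelism : new int[] {2, 4, 8}) { // x: the number of parallel tasks
            System.out.println("parallelism = " + parallelism);
            for (int task = 0; task < parallelism; task++) {
                System.out.println("  task " + task + " owns key groups "
                        + keyGroupsOfTask(task, parallelism, maxParallelism));
            }
        }
    }
}
```

With 4 key groups, 2 tasks each own two groups and 4 tasks each own one. In this model, 8 tasks would leave tasks 1, 3, 5 and 7 without any key group, which is why the max parallelism acts as a ceiling on the useful parallelism.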

The documentation explains these concepts in more detail.

Answer 1 (score: 0)

I ran more tests today, this time using streams instead of DataSets, and now I do see an effect from setMaxParallelism().

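    // WORDS and Tokenizer are presumably the ones from the standard WordCount example.
    public static void main(String[] args) throws Exception
    {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Cap the max parallelism at 4; this is the setting that takes effect.
        env.getConfig().setMaxParallelism(4);

        DataStream<String> text = env.fromElements(WORDS);

        // Split lines into (word, 1) pairs, key by the word (field 0), and sum the counts (field 1).
        DataStream<Tuple2<String, Integer>> counts = text.flatMap(new Tokenizer()).keyBy(0).sum(1);

        counts.writeAsCsv("test.dat");

        env.execute("WordCount Example");
    }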

The client reports an interesting error:

Caused by: org.apache.flink.runtime.JobException: Vertex Flat Map's parallelism (8) is higher than the max parallelism (4). Please lower the parallelism or increase the max parallelism.
        at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:188)
        at org.apache.flink.runtime.executiongraph.ExecutionGraph.attachJobGraph(ExecutionGraph.java:830)
        at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:232)
        at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:100)
        at org.apache.flink.runtime.jobmaster.JobMaster.createExecutionGraph(JobMaster.java:1152)
        at org.apache.flink.runtime.jobmaster.JobMaster.createAndRestoreExecutionGraph(JobMaster.java:1132)
        at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:294)
        at org.apache.flink.runtime.jobmaster.JobManagerRunner.<init>(JobManagerRunner.java:157)
        ... 10 more
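The rule behind this message can be sketched in a few lines of plain Java. This is a hypothetical stand-in written for illustration, not Flink's actual code, and the class and method names are made up. The parallelism of 8 in the trace was probably the default parallelism of the machine the test ran on, which is an assumption, since the job never sets it explicitly.

```java
/** Hypothetical stand-in for the check that rejects the job; not Flink's actual code. */
final class ParallelismCheckSketch {

    /** Throws if an operator's parallelism exceeds the configured max parallelism. */
    static void validate(String vertexName, int parallelism, int maxParallelism) {
        if (parallelism > maxParallelism) {
            throw new IllegalStateException(
                    "Vertex " + vertexName + "'s parallelism (" + parallelism
                            + ") is higher than the max parallelism (" + maxParallelism + ").");
        }
    }

    public static void main(String[] args) {
        validate("Flat Map", 4, 4); // accepted: parallelism equals the max parallelism

        try {
            validate("Flat Map", 8, 4); // the situation reported in the trace
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

As the error message itself suggests, the job should be accepted once the parallelism is lowered to at most 4 (for example with env.setParallelism(4)) or the max parallelism is raised to at least 8.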

Thanks