Apache Flink Set Operator Uid vs UidHash

Time: 2017-09-08 08:35:35

Tags: apache-flink flink-streaming

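I am using Apache Flink 1.2.0. According to the production readiness checklist (https://ci.apache.org/projects/flink/flink-docs-release-1.2/ops/production_ready.html), it is recommended to set UIDs on operators to ensure savepoint compatibility.
I could not find a setUid() method on flatMap, but in the docs I found uid() and setUidHash(). They say: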

uid

"Sets an ID for this operator.

The specified ID is used to assign the same operator ID across job submissions (for example when starting a job from a savepoint)."

uidHash

"Sets an user provided hash for this operator. This will be used AS IS the create the JobVertexID.

The user provided hash is an alternative to the generated hashes, that is considered when identifying an operator through the default hash mechanics fails (e.g. because of changes between Flink versions)."

Which one should actually be set on flatMap: uid() or setUidHash()? Or both?

1 Answer:

Answer 0 (score: 1)

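The recommended way in this case is to use the uid() method. setUidHash() should be used only as a workaround to fix jobs that were created with default (generated) uids rather than user-defined ones. The javadoc states: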

  

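"This should be used as a workaround or for troubleshooting. The provided hash needs to be unique per transformation and job. Otherwise, job submission will fail. Furthermore, you cannot assign a user-specified hash to intermediate nodes in an operator chain, and trying to do so will cause your job to fail."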
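
In practice this means chaining something like `.uid("tokenizer")` onto the result of `flatMap(...)` (the ID string is your choice; "tokenizer" is just an example). To illustrate why that helps, here is a toy model in plain Java, with no Flink dependency. Everything in it (`UidSketch`, `Op`, `operatorId`, `savepoint`, `restore`) is hypothetical and only mimics the idea: state in a savepoint is looked up by operator ID, and a generated ID depends on the job's structure, while an explicit uid does not. Flink's real hash generation is more involved; the path-based ID below is only a stand-in.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UidSketch {

    /** Hypothetical stand-in for an operator; this is not a Flink class. */
    static final class Op {
        final String name;
        final String uid; // null means no explicit uid was set

        Op(String name, String uid) {
            this.name = name;
            this.uid = uid;
        }
    }

    /** Returns the ID under which the state of operator `index` is stored. */
    static String operatorId(List<Op> job, int index) {
        Op op = job.get(index);
        if (op.uid != null) {
            // An explicit uid does not depend on the layout of the job.
            return op.uid;
        }
        // Stand-in for a generated ID: derived from all operators up to this one,
        // so inserting an operator upstream changes it.
        StringBuilder path = new StringBuilder("auto");
        for (int i = 0; i <= index; i++) {
            path.append('/').append(job.get(i).name);
        }
        return path.toString();
    }

    /** Takes a toy "savepoint": operator ID mapped to a label standing in for state. */
    static Map<String, String> savepoint(List<Op> job) {
        Map<String, String> states = new HashMap<>();
        for (int i = 0; i < job.size(); i++) {
            states.put(operatorId(job, i), "state-of-" + job.get(i).name);
        }
        return states;
    }

    /** Looks up the state for operator `index` of a (possibly changed) job. */
    static String restore(Map<String, String> savepoint, List<Op> job, int index) {
        return savepoint.get(operatorId(job, index));
    }

    public static void main(String[] args) {
        // Job v1 is source -> flatMap; job v2 adds a filter in between.
        List<Op> v1Auto = Arrays.asList(new Op("source", null), new Op("flatMap", null));
        List<Op> v2Auto = Arrays.asList(
                new Op("source", null), new Op("filter", null), new Op("flatMap", null));

        List<Op> v1Uid = Arrays.asList(new Op("source", "source-id"), new Op("flatMap", "tokenizer"));
        List<Op> v2Uid = Arrays.asList(
                new Op("source", "source-id"), new Op("filter", "filter-id"), new Op("flatMap", "tokenizer"));

        // Index 2 is the flatMap in both v2 jobs.
        System.out.println("without uid: " + restore(savepoint(v1Auto), v2Auto, 2)); // without uid: null
        System.out.println("with uid: " + restore(savepoint(v1Uid), v2Uid, 2));      // with uid: state-of-flatMap
    }
}
```

In the toy model the flatMap's state is only found again after the job changes when it carries an explicit uid, which is the situation the checklist is guarding against. setUidHash() has no counterpart here because, as the javadoc says, it is meant for repairing jobs that were already started without uids.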