Cloud Dataflow RuntimeException related to serializing a Coder

Asked: 2016-01-14 01:56:36

Tags: java google-cloud-dataflow

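I'm running into a strange problem when running a Dataflow pipeline. I have written my own Coder, but swapping in AvroCoder, SerializableCoder, and the others from the examples produces the same problem.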

After trying to launch the pipeline with the Dataflow Service in streaming mode, I get this exception:

Exception in thread "main" java.lang.RuntimeException: Unable to deserialize Coder: ModelCoder. Check that a suitable constructor is defined.  See Coder for details.
  at com.google.cloud.dataflow.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:113)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.ensureCoderSerializable(DirectPipelineRunner.java:901)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.ensurePCollectionEncodable(DirectPipelineRunner.java:861)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.setPCollectionValuesWithMetadata(DirectPipelineRunner.java:789)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.setPCollection(DirectPipelineRunner.java:776)
  at com.google.cloud.dataflow.sdk.io.TextIO.evaluateReadHelper(TextIO.java:786)
  at com.google.cloud.dataflow.sdk.io.TextIO.access$000(TextIO.java:118)
  at com.google.cloud.dataflow.sdk.io.TextIO$Read$Bound$1.evaluate(TextIO.java:327)
  at com.google.cloud.dataflow.sdk.io.TextIO$Read$Bound$1.evaluate(TextIO.java:323)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.visitTransform(DirectPipelineRunner.java:706)
  at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:219)
  at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:215)
  at com.google.cloud.dataflow.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:102)
  at com.google.cloud.dataflow.sdk.Pipeline.traverseTopologically(Pipeline.java:252)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.run(DirectPipelineRunner.java:662)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:374)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:87)
  at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:174)
  at io.momentum.demo.models.pipeline.PlatformPipeline.main(PlatformPipeline.java:96)
Caused by: java.lang.IllegalStateException: Sub-class com.google.cloud.dataflow.sdk.util.CoderUtils$Jackson2Module$Resolver MUST implement `typeFromId(DatabindContext,String)
  at com.fasterxml.jackson.databind.jsontype.impl.TypeIdResolverBase.typeFromId(TypeIdResolverBase.java:77)
  at com.fasterxml.jackson.databind.jsontype.impl.TypeDeserializerBase._findDeserializer(TypeDeserializerBase.java:156)
  at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:106)
  at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:91)
  at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:142)
  at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:42)
  at com.fasterxml.jackson.databind.ObjectMapper._readValue(ObjectMapper.java:3760)
  at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2042)
  at com.fasterxml.jackson.databind.ObjectMapper.treeToValue(ObjectMapper.java:2529)
  at com.google.cloud.dataflow.sdk.util.Serializer.deserialize(Serializer.java:98)
  at com.google.cloud.dataflow.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:110)
  ... 18 more

My Coder implementation just wraps AvroCoder and hooks in some of our own code:

public final class ModelCoder<M extends AppModel> extends AtomicCoder<M> {
  public static <T extends AppModel> ModelCoder<T> of(Class<T> clazz) {
    return new ModelCoder<>(clazz);
  }

  @JsonCreator
  @SuppressWarnings("unchecked")
  public static ModelCoder<?> of(@JsonProperty("kind") String classType) throws ClassNotFoundException {
    Class<?> clazz = Class.forName(classType);
    return of((Class<? extends AppModel>) clazz);
  }

  private String kind;

  public ModelCoder(Class<M> type) {
    this.kind = type.getSimpleName();
  }

  @Override
  public void encode(M value, OutputStream outStream, Context context) throws IOException, CoderException {
    CoderInternals.encode(value, outStream, context, new TypeReference<TypedSerializedModel<M>>() { });
  }

  @Override
  public M decode(InputStream inStream, Context context) throws IOException, CoderException {
    return CoderInternals.decode(inStream, context, new TypeReference<TypedSerializedModel<M>>() { });
  }

  @Override
  public CloudObject asCloudObject() {
    CloudObject co = super.asCloudObject();
    co.set("kind", kind);
    return co;
  }
}

The coder works as expected when encode(..) and decode(..) are called on an AppModel, but this exception occurs regardless.

1 answer:

Answer 0 (score: 4)

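You need a static method annotated with @JsonCreator so that the service can instantiate your coder on the workers. You also should not override asCloudObject(); that method determines how your coder is serialized and sent to the workers, and your code will end up sending only a serialized AvroCoder.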

For an example of a coder that wraps another coder, take a look at NullableCoder.java (https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/NullableCoder.java).
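
As a rough illustration of the wrapping pattern that NullableCoder follows, here is a minimal JDK-only sketch. SketchCoder, StringSketchCoder, NullableSketchCoder and WrappingCoderSketch are hypothetical names invented for this example and are not Dataflow SDK types. A real coder would extend an SDK base class and would also need the @JsonCreator factory mentioned above so that the service can rebuild it on the workers. The sketch only shows the delegation: the outer coder writes a one-byte marker and hands the actual value to the inner coder.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Small driver: round-trips a value through a coder using in-memory streams.
final class WrappingCoderSketch {
  static <T> T roundTrip(SketchCoder<T> coder, T value) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    coder.encode(value, out);
    return coder.decode(new ByteArrayInputStream(out.toByteArray()));
  }

  public static void main(String[] args) throws IOException {
    SketchCoder<String> coder = NullableSketchCoder.of(new StringSketchCoder());
    System.out.println(roundTrip(coder, "hello"));
    System.out.println(roundTrip(coder, null));
  }
}

// ASSUMPTION: hypothetical stand-in for the SDK's Coder type, not the real API.
interface SketchCoder<T> {
  void encode(T value, OutputStream out) throws IOException;

  T decode(InputStream in) throws IOException;
}

// Inner coder: writes a string as a length-prefixed (modified UTF-8) payload.
final class StringSketchCoder implements SketchCoder<String> {
  @Override
  public void encode(String value, OutputStream out) throws IOException {
    new DataOutputStream(out).writeUTF(value);
  }

  @Override
  public String decode(InputStream in) throws IOException {
    return new DataInputStream(in).readUTF();
  }
}

// Outer coder: adds null handling and delegates everything else to the inner coder.
final class NullableSketchCoder<T> implements SketchCoder<T> {
  private static final int ABSENT = 0;
  private static final int PRESENT = 1;

  private final SketchCoder<T> valueCoder;

  private NullableSketchCoder(SketchCoder<T> valueCoder) {
    this.valueCoder = valueCoder;
  }

  // Static factory taking the wrapped coder, so the wrapper never has to guess it.
  static <T> NullableSketchCoder<T> of(SketchCoder<T> valueCoder) {
    return new NullableSketchCoder<>(valueCoder);
  }

  @Override
  public void encode(T value, OutputStream out) throws IOException {
    if (value == null) {
      out.write(ABSENT); // marker only, no payload
      return;
    }
    out.write(PRESENT);
    valueCoder.encode(value, out); // delegate the payload to the wrapped coder
  }

  @Override
  public T decode(InputStream in) throws IOException {
    int marker = in.read();
    if (marker == ABSENT) {
      return null;
    }
    if (marker != PRESENT) {
      throw new IOException("Unexpected marker byte: " + marker);
    }
    return valueCoder.decode(in);
  }
}
```

The point of the design is that the wrapper holds its inner coder as an explicit component instead of hiding it, which is what lets a framework serialize the wrapper and its component together and reconstruct both later.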