Inferring a Coder from the CoderRegistry failed: Unable to provide a Coder for com.google.api.services.bigquery.model.TableRow

Date: 2018-07-19 10:07:33

Tags: google-cloud-platform airflow

I have a Dataflow job that runs fine when I run it locally with the DataflowRunner, but when I try to run it via GCP's Composer/Airflow it gives me this error:

Exception in thread "main" java.lang.IllegalStateException: Unable to return a default Coder for ConvertToYouTubeMetadata/ParDo(convertToTableRow$1)/ParMultiDo(convertToTableRow$1).output [PCollection]. Correct one of the following root causes:
    No Coder has been manually specified;  you may do so using .setCoder().
    Inferring a Coder from the CoderRegistry failed: Unable to provide a Coder for com.google.api.services.bigquery.model.TableRow.
    Building a Coder using a registered CoderProvider failed.
    See suppressed exceptions for detailed failures.
    Using the default output Coder from the producing PTransform failed: PTransform.getOutputCoder called.
    at org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkState(Preconditions.java:444)
    at org.apache.beam.sdk.values.PCollection.getCoder(PCollection.java:259)
    at org.apache.beam.sdk.values.PCollection.finishSpecifying(PCollection.java:107)
    at org.apache.beam.sdk.runners.TransformHierarchy.finishSpecifyingInput(TransformHierarchy.java:190)
    at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:536)
    at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:491)
    at org.apache.beam.sdk.values.PCollection.apply(PCollection.java:299)
    at MainKt.runMetadataPipeline(main.kt:66)
    at MainKt.main(main.kt:34)

What is different about how the job is executed on Composer that would cause it to break when it works fine locally?

All I'm using is

BigQueryIO.writeTableRows()
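
As the error text itself suggests, one workaround is to specify the coder manually with .setCoder(). Below is a minimal Kotlin sketch of what that could look like for this pipeline; the YouTubeMetadata type, its fields, and the helper function are assumptions reconstructed from the stack trace, while TableRowJsonCoder ships with the Beam BigQuery IO module:

    import com.google.api.services.bigquery.model.TableRow
    import org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder
    import org.apache.beam.sdk.transforms.DoFn
    import org.apache.beam.sdk.transforms.ParDo
    import org.apache.beam.sdk.values.PCollection
    import java.io.Serializable

    // Hypothetical stand-in for whatever the real ConvertToYouTubeMetadata
    // step produces; the field names are made up for illustration.
    data class YouTubeMetadata(val videoId: String, val title: String) : Serializable

    class ConvertToTableRowFn : DoFn<YouTubeMetadata, TableRow>() {
        @ProcessElement
        fun processElement(ctx: ProcessContext) {
            val m = ctx.element()
            ctx.output(TableRow().set("video_id", m.videoId).set("title", m.title))
        }
    }

    // Pinning the coder explicitly means Beam never has to infer one for
    // TableRow, regardless of how the JAR was packaged.
    fun toTableRows(metadata: PCollection<YouTubeMetadata>): PCollection<TableRow> =
        metadata
            .apply("convertToTableRow", ParDo.of(ConvertToTableRowFn()))
            .setCoder(TableRowJsonCoder.of())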

1 Answer:

Answer 0 (score: 1)

Building my JAR with ShadowJar solved this for me. I think something went wrong with how the JAR was packaged when Dataflow executed it through the DataFlowJavaOperator. I can't recall exactly what I read on GitHub, but someone mentioned using the Maven Shade plugin to fix this problem, and ShadowJar is the Gradle equivalent.
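
A likely explanation for why this only breaks after repackaging: Beam discovers coder providers at runtime through java.util.ServiceLoader entries under META-INF/services (the BigQuery IO module registers a coder for TableRow this way), and a naively assembled fat JAR can let one dependency's service file overwrite another's. The Shadow plugin's mergeServiceFiles() concatenates them instead. A minimal build.gradle.kts sketch, with illustrative plugin versions:

    import com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar

    plugins {
        kotlin("jvm") version "1.2.51"
        id("com.github.johnrengelman.shadow") version "4.0.2"
    }

    tasks.named<ShadowJar>("shadowJar") {
        // Merge META-INF/services files instead of letting the last
        // dependency on the classpath clobber the others; this preserves
        // Beam's coder registrar entries.
        mergeServiceFiles()
        manifest {
            // Matches the entry point in the stack trace above.
            attributes(mapOf("Main-Class" to "MainKt"))
        }
    }

Running ./gradlew shadowJar then produces a fat JAR (by default suffixed -all.jar under build/libs/), which is the artifact to point the DataFlowJavaOperator at.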