Unable to write nullable integer values to BigQuery using Cloud Dataflow

Asked: 2016-07-23 06:55:17

Tags: google-bigquery google-cloud-dataflow

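I am trying to write to a BigQuery table using Cloud Dataflow. The table has an integer column whose mode is set to nullable. For null values, the write fails with the following error: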

  

Could not convert value to integer. Field: ITM_QT; Value:

But when I change the data type of the same column to String, it accepts null values.

So is there any way to write null values to an integer column using Cloud Dataflow?

The error only goes away if I change the column's data type to String.

1 Answer:

Answer 0 (score: 2)

I'm not sure what you are doing wrong, but the code below works fine and does allow null values to be written for the INTEGER and FLOAT data types in BigQuery:

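import java.util.ArrayList;
import java.util.List;

import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.BigQueryIO;
import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner;
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import com.google.cloud.dataflow.sdk.transforms.ParDo;
import com.google.cloud.dataflow.sdk.values.PCollection;

// Imports assume the Dataflow SDK for Java 1.x, which DirectPipelineRunner and .named() belong to.
// The class name is a placeholder added so the snippet compiles.
public class WriteNullNumbers {
    public static void main(String[] args) {
        DataflowPipelineOptions options = PipelineOptionsFactory.create().as(DataflowPipelineOptions.class);
        options.setRunner(DirectPipelineRunner.class); // run locally
        options.setProject("<project-id>");

        Pipeline pipeline = Pipeline.create(options);

        // Read any existing table, then emit one row with null numeric fields per input row.
        PCollection<TableRow> results = pipeline
                .apply("whatever", BigQueryIO.Read.from("<table-spec>"))
                .apply(ParDo.of(new DoFn<TableRow, TableRow>() {
                    @Override
                    public void processElement(ProcessContext c) throws Exception {
                        System.out.println(c.element());
                        TableRow row = new TableRow();
                        row.set("foo", null); // null FLOAT
                        row.set("bar", null); // null INTEGER
                        c.output(row);
                    }
                }));

        // No mode is set on the fields, so BigQuery treats them as NULLABLE (the default).
        List<TableFieldSchema> fields = new ArrayList<>();
        fields.add(new TableFieldSchema().setName("foo").setType("FLOAT"));
        fields.add(new TableFieldSchema().setName("bar").setType("INTEGER"));
        TableSchema schema = new TableSchema().setFields(fields);

        results.apply(BigQueryIO.Write
                .named("Write")
                .to("<project-id>:<dataset-name>.write_null_numbers_test")
                .withSchema(schema)
                .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE)
                .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED));

        pipeline.run();
    }
}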
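
One possible explanation, which the question does not confirm: the error message shows an empty value after "Value:", which would be consistent with the pipeline writing an empty string rather than a real null into the ITM_QT field. If that is the case, normalizing blank input to null before setting the field might help. The sketch below is plain Java with no Dataflow dependency; the class name, the helper names toNullableLong and toRow, and the use of a Map as a stand-in for TableRow are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class NullableIntegerField {

    // Hypothetical helper: returns null for missing or blank input,
    // otherwise the parsed integer value.
    static Long toNullableLong(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return null;
        }
        return Long.valueOf(raw.trim());
    }

    // Stand-in for building a TableRow: a plain Map is used here so the
    // sketch runs without the BigQuery client library on the classpath.
    static Map<String, Object> toRow(String rawQuantity) {
        Map<String, Object> row = new HashMap<>();
        row.put("ITM_QT", toNullableLong(rawQuantity)); // null instead of ""
        return row;
    }

    public static void main(String[] args) {
        System.out.println(toRow("").get("ITM_QT"));   // prints "null"
        System.out.println(toRow("42").get("ITM_QT")); // prints "42"
    }
}
```

Inside a DoFn, the same idea would amount to calling row.set("ITM_QT", toNullableLong(value)) in place of passing the raw string through, so that a missing quantity reaches BigQuery as null.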
