Question

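According to the documentation here: https://cloud.google.com/bigquery/docs/tables#creating_a_table_when_you_load_data, BigQuery should be able to create a table when data is loaded.

When you load data into BigQuery, you can load data into a new table or partition, you can append data to an existing table or partition, or you can overwrite a table or partition. You do not need to create an empty table before loading data into it. You can create the new table and load your data at the same time.

However, when I try to stream data into BigQuery from Java, I get an error saying the table does not exist.

Here is an example of an insert statement that works, but only after I have created the table manually: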

InsertAllResponse response = bigQuery
        .insertAll(
                InsertAllRequest
                        .newBuilder(tableId)
                        .addRow(rowContent)
                        .build()
        );

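I can create the schema in Java and then create the table, but then I have to keep checking whether the table has already been created before I can stream to it. generateBigQuerySchema is a method I wrote that defines the schema. The code below fails if the table already exists, so I have to check whether it exists before creating it.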

InsertAllResponse response = bigQuery
        .create(requestLog.generateBigQuerySchema(tableId))
        .getBigQuery()
        .insertAll(
                InsertAllRequest
                        .newBuilder(tableId)
                        .addRow(rowContent)
                        .build()
        );
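
The check-then-create step described above can be kept off the hot path by remembering which tables have already been confirmed. The following is only a minimal sketch under stated assumptions, not the library's API: TableClient, EnsuringInserter and InMemoryTableClient are hypothetical names introduced for illustration. With the google-cloud-bigquery client, tableExists would presumably wrap a check such as bigQuery.getTable(tableId) != null, and createTable would wrap the bigQuery.create(...) call shown above; that mapping is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the few client operations this sketch needs. */
interface TableClient {
    boolean tableExists(String tableId);
    void createTable(String tableId);
    void insertRow(String tableId, Map<String, Object> row);
}

/** Streams rows, creating the destination table the first time it is seen. */
class EnsuringInserter {
    private final TableClient client;
    // Tables already confirmed to exist, so the lookup runs once per table.
    private final Set<String> knownTables = ConcurrentHashMap.newKeySet();

    EnsuringInserter(TableClient client) {
        this.client = client;
    }

    void insert(String tableId, Map<String, Object> row) {
        if (!knownTables.contains(tableId)) {
            // Two threads may both reach this point; a real client should
            // tolerate an "already exists" error from the second create.
            if (!client.tableExists(tableId)) {
                client.createTable(tableId);
            }
            knownTables.add(tableId);
        }
        client.insertRow(tableId, row);
    }
}

/** In-memory fake used only to exercise the sketch. */
class InMemoryTableClient implements TableClient {
    final Map<String, List<Map<String, Object>>> tables = new LinkedHashMap<>();
    int existenceChecks = 0;

    public boolean tableExists(String tableId) {
        existenceChecks++;
        return tables.containsKey(tableId);
    }

    public void createTable(String tableId) {
        tables.put(tableId, new ArrayList<>());
    }

    public void insertRow(String tableId, Map<String, Object> row) {
        List<Map<String, Object>> rows = tables.get(tableId);
        if (rows == null) {
            throw new IllegalStateException("Table not found: " + tableId);
        }
        rows.add(row);
    }
}

class EnsureTableDemo {
    public static void main(String[] args) {
        InMemoryTableClient fake = new InMemoryTableClient();
        EnsuringInserter inserter = new EnsuringInserter(fake);
        inserter.insert("request_log", Collections.<String, Object>singletonMap("status", 200));
        inserter.insert("request_log", Collections.<String, Object>singletonMap("status", 404));
        System.out.println(fake.tables.get("request_log").size() + " rows, "
                + fake.existenceChecks + " existence check(s)");
    }
}
```

The cache only lives as long as the process, so the existence check is repeated once per table after every restart; it also does not notice a table that is deleted while the process is running.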

Answer 1

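I think you are mixing up two different resource types from the API Reference, namely Jobs and Tabledata.

Jobs does loading, whereas the insertAll method in Tabledata doesn't:

Streams data into BigQuery one record at a time without needing to run a load job

I can see how the Google documentation could be misread this way, because Introduction to Loading Data into BigQuery mentions streaming inserts (insertAll), as shown below: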

You can load data:

... by using streaming inserts ...
to insert individual records

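The streaming inserts link there redirects to Streaming Data into BigQuery, which describes streaming rather than loading:

Instead of using a job to load data into BigQuery, you can choose to stream your data into BigQuery one record at a time by using the tabledata().insertAll() method.

One last point about streaming inserts (insertAll):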

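Ensure that you have write access to the dataset that contains your destination table. The table must exist before you begin writing data to it unless you are using template tables. For more information about template tables, see Creating tables automatically using template tables.

If you still want to load rather than stream, and create the table at the same time, use Jobs with a load job (or another job type, if needed).

Sample code from my question: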

Insert insert = bigquery.jobs().insert(projectId,
                   new Job().setConfiguration(
                            new JobConfiguration().setLoad(
                                   new JobConfigurationLoad()
                                                .setSourceFormat("NEWLINE_DELIMITED_JSON")
                                                .setDestinationTable(
                                                        new TableReference()
                                                                .setProjectId(projectId)
                                                                .setDatasetId(dataSetId)
                                                                .setTableId(tableId)
                                                )
                                                .setCreateDisposition("CREATE_IF_NEEDED")
                                                .setWriteDisposition(writeDisposition)
                                                .setSourceUris(Collections.singletonList(sourceUri))
                                                .setAutodetect(true)
                                )
                        ));

Job myInsertJob = insert.execute();
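
Note that jobs().insert(...).execute() only submits the load job; the job itself runs asynchronously, so the table may not exist yet when the call returns. A rough sketch of the polling loop follows. JobPoller and the stateSource parameter are hypothetical names introduced here; with the client used above, the supplier would presumably wrap something like bigquery.jobs().get(projectId, jobId).execute().getStatus().getState(), which reports PENDING, RUNNING or DONE.

```java
import java.util.function.Supplier;

class JobPoller {
    /**
     * Polls the given state source until it reports "DONE" or the attempts
     * run out. Returns true if the job reached DONE, false on timeout or
     * if the waiting thread was interrupted.
     */
    static boolean waitUntilDone(Supplier<String> stateSource, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if ("DONE".equals(stateSource.get())) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

DONE only means the job has finished, not that it succeeded, so the job status should still be checked for an error result (getStatus().getErrorResult() in this client) before relying on the table.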

Creating a table on data load from Java in BigQuery
