Question

I have two tables with the same schema, tab1 and tab1_partitioned, where the latter is partitioned by day.

I am trying to insert data into the partitioned table with the following command:

bq query --allow_large_results --replace --noflatten_results --destination_table 'advertiser.development_partitioned$20160101' 'select * from advertiser.development where ymd = 20160101';

but I get the following error:

BigQuery error in query operation: Error processing job 'total-handler-133811:bqjob_r78379ac2513cb515_000001553afb7196_1': Provided Schema does not match Table

Both have exactly the same schema and I really don't understand why I am getting that error. Can someone shed some light on my issue?

In fact, I would prefer it if BigQuery supported the dynamic partition insert that Hive supports, but several days of searching suggest that this is not possible :-/

Answer 1

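The behavior you are seeing is due to how we treat the write disposition when a partition decorator is used with a partitioned table.

You should be able to append to the partition using the WRITE_APPEND disposition to get the query to go through.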

bq query --allow_large_results --append_table --noflatten_results --destination_table 'advertiser.development_partitioned$20160101' 'select * from advertiser.development where ymd = 20160101';

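Using it with --replace has some complications, but we are currently looking into improved schema support for table partitions.

Please let me know if this doesn't work for you. Thanks!

To answer the other part of your question, about dynamic partitioning: we plan to support richer flavors of partitioning, and we believe they will handle most use cases.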
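
Until something like dynamic partitioning is available, one possible workaround is to run the append query once per day. The following is a minimal sketch, not a BigQuery feature: it only builds the command lines, reusing the flags, table names, and integer ymd filter from the question. The helper names append_command and daily_commands are hypothetical.

```python
import shlex
from datetime import date, timedelta

# Table names taken from the question; adjust for your own project.
SOURCE = "advertiser.development"
DEST = "advertiser.development_partitioned"


def append_command(day: date) -> list[str]:
    """Build the bq argv that appends one day's rows to that day's partition."""
    ymd = day.strftime("%Y%m%d")  # e.g. 20160101, used as decorator and filter
    return [
        "bq", "query",
        "--allow_large_results",
        "--append_table",
        "--noflatten_results",
        "--destination_table", f"{DEST}${ymd}",
        f"select * from {SOURCE} where ymd = {ymd}",
    ]


def daily_commands(start: date, end: date) -> list[list[str]]:
    """One append command per calendar day, both endpoints included."""
    days = (end - start).days + 1
    return [append_command(start + timedelta(days=i)) for i in range(days)]


if __name__ == "__main__":
    # Print shell-quoted commands for review; nothing is executed here.
    for cmd in daily_commands(date(2016, 1, 1), date(2016, 1, 3)):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check the decorators before running anything. Since every run appends, running the same day twice would duplicate that day's rows.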

Answer 2

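FYI, it wasn't always this way, but there is now a way to copy data from a non-partitioned table into a partitioned table in BigQuery using just DML in the BigQuery UI. For example, if your original table has a date string in the format YYYY-MM-DD, you can run the following to move the data into the partitioned table...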

create table my_dataset.my_table (sesh STRING, prod STRING) partition by DATE(_PARTITIONTIME);

insert into my_dataset.my_table (_PARTITIONTIME, sesh, prod) select CAST(PARSE_DATE('%Y-%m-%d',  mydatestr) as TIMESTAMP), sesh, prod from my_dataset.my_orig_table;
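
To make the mapping in that insert concrete, here is a small local illustration in Python of what the CAST(PARSE_DATE(...) as TIMESTAMP) expression computes for each row. It assumes the cast yields midnight UTC of the parsed date (BigQuery's default time zone). The function names partition_time and partition_id are hypothetical and are not part of BigQuery.

```python
from datetime import datetime, timezone


def partition_time(mydatestr: str) -> datetime:
    """Parse a YYYY-MM-DD string into midnight UTC of that date."""
    parsed = datetime.strptime(mydatestr, "%Y-%m-%d")
    return parsed.replace(tzinfo=timezone.utc)


def partition_id(mydatestr: str) -> str:
    """Daily partition the row would land in, in YYYYMMDD form."""
    return partition_time(mydatestr).strftime("%Y%m%d")


if __name__ == "__main__":
    print(partition_time("2016-01-01").isoformat())
    print(partition_id("2016-01-01"))
```

Each row's date string therefore decides its own partition, which is the per-row routing the question was asking for under the name dynamic partitioning.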

BigQuery insert into a partitioned table from an existing table
