Question

I have a BigQuery table that is partitioned by ingestion time. I want the data to be partitioned each time I load it into this table.

I am using Python; here is the code snippet:

(success_records
 | 'Extracting row from tagged row {}'.format(url) >> beam.Map(lambda row: row['row'])
 | 'Write to BigQuery table for {}'.format(url) >> beam.io.WriteToBigQuery(
            table=data_ingestion.get_table(tmp=TEST, run_date=data_ingestion.run_date),
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE,
            )
)

After this operation, the table's partitioning information is lost. The BigQuery UI no longer shows any partition metadata.

I am also using Apache Beam. What I have tried:

1. Passing the table name with a $YYYYMMDD suffix, but Beam fails with this error:

Invalid table ID \"review_raw$20190111\". Table IDs must be alphanumeric (plus underscores) and must be at most 1024 characters long. Also, Table decorators cannot be used.

2. Passing the table name without $YYYYMMDD, which ends up removing the partition information.
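To make the first attempt concrete, here is a minimal sketch of how the decorated table name was formed. The helper name `with_partition_decorator` is hypothetical (it is not part of Beam or of my pipeline); it only shows the `table$YYYYMMDD` format that the error message complains about.

```python
from datetime import date


def with_partition_decorator(table_id: str, run_date: date) -> str:
    """Append a partition decorator ($YYYYMMDD) to a table ID.

    Hypothetical helper for illustration only; not a Beam or BigQuery API.
    """
    return "{}${}".format(table_id, run_date.strftime("%Y%m%d"))


# Rebuilds the table ID that appears in the error message above.
decorated = with_partition_decorator("review_raw", date(2019, 1, 11))
print(decorated)  # review_raw$20190111
```

Passing a string of this form as the `table` argument is what triggered the "Invalid table ID" error in my case.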

How can I write to a BigQuery table while preserving its partition metadata?

0 answers: