Uploading from GCS to BigQuery with a custom partition

Asked: 2019-08-20 07:07:36

Tags: google-bigquery google-cloud-composer

I am trying to upload data from a GCS bucket to BigQuery using GoogleCloudStorageToBigQueryOperator, as shown below, but the partition column (state) is populated with null. Any ideas?

gcs_to_bigquery = GoogleCloudStorageToBigQueryOperator(
    task_id='gcs_to_bigquery',
    bucket=gcs_bucket,
    source_objects=['parquet/sample_data/*'],
    destination_project_dataset_table='projectxxxxx.{}.sample_data'.format(schema),
    source_format='PARQUET',
    schema_fields=[{"name": "id", "type": "INTEGER", "mode": "NULLABLE"},
                   {"name": "col1", "type": "STRING", "mode": "NULLABLE"},
                   {"name": "col2", "type": "STRING", "mode": "NULLABLE"},
                   {"name": "col3", "type": "STRING", "mode": "NULLABLE"},
                   {"name": "col4", "type": "STRING", "mode": "NULLABLE"},
                   {"name": "col5", "type": "STRING", "mode": "NULLABLE"},
                   {"name": "state", "type": "STRING", "mode": "NULLABLE"}],
    create_disposition='CREATE_IF_NEEDED',
    write_disposition='WRITE_TRUNCATE',
    dag=dag
)

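If the state value only exists in the GCS object paths (a Hive-style layout such as parquet/sample_data/state=CA/part-00000.parquet) rather than inside the Parquet files themselves, a plain load job has nothing to read for that column and it will land as NULL. Below is a minimal sketch of one way to tell a load job about such a layout, using the google-cloud-bigquery client directly rather than the Airflow operator; the bucket name, URI prefix, and destination table are placeholders, the Hive-style layout is an assumption about the data, and a reasonably recent google-cloud-bigquery release is assumed.

from google.cloud import bigquery

# Assumed layout: gs://my-gcs-bucket/parquet/sample_data/state=<value>/*.parquet
client = bigquery.Client()

# Describe how partition keys are encoded in the object paths.
hive_options = bigquery.external_config.HivePartitioningOptions()
hive_options.mode = "AUTO"  # infer key names and types from the path
hive_options.source_uri_prefix = "gs://my-gcs-bucket/parquet/sample_data/"  # placeholder bucket

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.PARQUET,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)
job_config.hive_partitioning = hive_options

load_job = client.load_table_from_uri(
    "gs://my-gcs-bucket/parquet/sample_data/*",  # placeholder source URIs
    "projectxxxxx.my_dataset.sample_data",       # placeholder destination table
    job_config=job_config,
)
load_job.result()  # wait for the load to finish

With mode set to "AUTO", the keys found after the URI prefix (here, state) are added as columns and filled from the directory names instead of coming back NULL. Note also that BigQuery's native table partitioning only accepts DATE/TIMESTAMP/DATETIME or integer-range columns, so a STRING column like state cannot be the table's partition column; clustering on it is the usual alternative.
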
0 Answers:

There are no answers yet.