Airflow - Invalid value in BigQuery schema fields

Time: 2020-10-16 07:26:02

Tags: python google-bigquery airflow google-cloud-composer

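I have a GCP Cloud Composer environment that loads data from GCS into BQ. I pass the source schema through the schema_fields option. The schema is held in a variable, which I get from an XCom pull. See the output of print(schema) below: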

[{"mode": "NULLABLE", "name": "id", "type": "INTEGER"}, {"mode": "NULLABLE", "name": "c1", "type": "DATE"}, {"mode": "NULLABLE", "name": "c2", "type": "TIME"}, {"mode": "NULLABLE", "name": "c3", "type": "DATETIME"}, {"mode": "NULLABLE", "name": "c4", "type": "TIMESTAMP"}]

In the BQ operator, I use schema_fields=schema.

But when I run the DAG, it throws an error:

ERROR - <HttpError 400 when requesting https://bigquery.googleapis.com/bigquery/v2/projects/xxx-xxx/jobs?
alt=json returned "Invalid value at 'job.configuration.load.schema.fields' (type.googleapis.com/google.cloud.bigquery.v2.TableFieldSchema), 
"[{"mode": "NULLABLE", "name": "id", "type": "INTEGER"}, {"mode": "NULLABLE", "name": "c1", "type": "DATE"}, {"mode": "NULLABLE", "name": "c2", "type": "TIME"}, {"mode": "NULLABLE", "name": "c3", "type": "DATETIME"}, {"mode": "NULLABLE", "name": "c4", "type": "TIMESTAMP"}]"">

However, when I save this schema as a file in GCS and use schema_object instead, it works. Doing the same thing through the variable does not work.

2 Answers:

Answer 0: (score: 2)

The schema_fields parameter accepts a valid list. It is not templated yet.

I used the example fields below and they work. Please check that your schema follows a similar format and see whether it works for you.

schema_fields=[
                 {'name': 'Col1', 'type': 'STRING', 'mode': 'NULLABLE'},
                 {'name': 'Col2', 'type': 'STRING', 'mode': 'NULLABLE'},
             ]
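To illustrate why the format matters, here is a minimal sketch (not taken from the thread's DAG) comparing a real list of field dicts with its JSON string form. The variable names `fields` and `as_string` are hypothetical. The printed string form uses double quotes, like the print(schema) output in the question, which suggests that the value there may be a string rather than a list.

```python
import json

# A real Python list of field dicts, in the format shown in the answer above.
fields = [
    {'name': 'Col1', 'type': 'STRING', 'mode': 'NULLABLE'},
    {'name': 'Col2', 'type': 'STRING', 'mode': 'NULLABLE'},
]

# The same schema serialized to a JSON string (hypothetical stand-in for a
# value that has passed through string serialization).
as_string = json.dumps(fields)

print(type(fields).__name__)     # list
print(type(as_string).__name__)  # str

# Printing a list shows Python reprs with single quotes; printing the JSON
# string shows double quotes.
print(fields)
print(as_string)
```

Checking type(schema) before passing it to the operator is a quick way to tell the two cases apart.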

Answer 1: (score: 1)

I was able to solve this by using json.loads:

schema = json.loads(<result of your xcom_pull call>)  # needs "import json"
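A minimal sketch of this fix, assuming the value pulled from XCom is a JSON string as the quoted schema in the error message suggests. The variable `pulled` and the helper `parse_schema` are hypothetical names used for illustration; in a real DAG, `pulled` would be whatever your xcom_pull call returns.

```python
import json

# Hypothetical stand-in for the XCom value: a JSON string, not a list.
pulled = (
    '[{"mode": "NULLABLE", "name": "id", "type": "INTEGER"}, '
    '{"mode": "NULLABLE", "name": "c1", "type": "DATE"}]'
)


def parse_schema(value):
    """Return a list of field dicts from a JSON string or an existing list."""
    if isinstance(value, str):
        value = json.loads(value)
    if not isinstance(value, list) or not all(isinstance(f, dict) for f in value):
        raise ValueError("schema must be a list of field dicts")
    return value


schema = parse_schema(pulled)
print(type(schema).__name__, [f["name"] for f in schema])  # list ['id', 'c1']
```

The resulting `schema` is a list of dicts and can then be passed as schema_fields=schema. Accepting both strings and lists keeps the helper safe to call even if the upstream task already returns a list.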