Creating a BigQuery table with a Google Cloud Deployment Manager YAML file

Time: 2017-03-24 01:45:12

Tags: google-bigquery google-cloud-platform

I am trying to create a BigQuery table with Deployment Manager using the following YAML file:

imports:
- path: schema.txt

resources:
- name: test
  type: bigquery.v2.table
  properties:
    datasetId: test_dt
    tableReference:
       datasetId: test_dt
       projectId: test_dev
       tableId: test
    schema:
       fields: {{ imports["schema.txt"] }}

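However, when I supply the table schema through the .txt file, I get a parsing error. If I write the schema definition inline instead of importing the .txt file, the script runs successfully. This way of importing a text file is the one described in the Google Cloud documentation. Can anyone help me with this?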

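3 Answers:

Answer 0: (score: 0)

I think the way Deployment Manager formats the contents of the txt file may be incorrect. A good way to debug this is to collect HTTP request traces for both cases (inline schema and imported file) and compare the two requests.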
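If the text substitution really is the culprit, one way to sidestep it is to parse the file yourself in a Python template instead of splicing it into YAML. This is only a minimal sketch under assumptions the question does not confirm: that schema.txt holds a JSON array of field definitions, and that the template is given `datasetId` and `tableId` properties (names chosen here for illustration). `context.imports`, `context.properties` and `context.env` are the objects Deployment Manager passes to Python templates.

```python
import json


def generate_config(context):
    """Return a bigquery.v2.table resource whose schema is read from schema.txt."""
    # Assumption: schema.txt contains a JSON array of BigQuery field
    # definitions, e.g. [{"name": "id", "type": "STRING"}]. The question
    # does not show the file, so adjust the parsing to its real format.
    fields = json.loads(context.imports['schema.txt'])

    # Assumption: the config passes datasetId and tableId as properties.
    dataset_id = context.properties['datasetId']
    table_id = context.properties['tableId']

    table = {
        'name': context.env['name'],
        'type': 'bigquery.v2.table',
        'properties': {
            'datasetId': dataset_id,
            'tableReference': {
                'projectId': context.env['project'],
                'datasetId': dataset_id,
                'tableId': table_id,
            },
            # The parsed list is passed as structured data, so no YAML
            # quoting or indentation issues can arise from the file text.
            'schema': {'fields': fields},
        },
    }
    return {'resources': [table]}
```

The config would then import both this template and schema.txt, and use the template file name as the resource `type`. Because the schema goes through `json.loads`, a malformed file fails with a clear JSON error rather than a YAML parsing error.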

Answer 1: (score: 0)

Here is a YAML config that works for nested or repeated fields when creating BigQuery tables with Deployment Manager.

# Example of the BigQuery (dataset and table) template usage.
#
# Replace `my_account@email.com` with your account email.

imports:
  - path: templates/bigquery/bigquery_dataset.py
    name: bigquery_dataset.py
  - path: templates/bigquery/bigquery_table.py
    name: bigquery_table.py

resources:
  - name: dataset_name_here
    type: bigquery_dataset.py
    properties:
      name: dataset_name_here
      location: US
      access:
        - role: OWNER
          userByEmail: my_account@email.com

  - name: table_name_here
    type: bigquery_table.py
    properties:
      name: table_name_here
      datasetId: $(ref.dataset_name_here.datasetId)
      timePartitioning:
        type: DAY
      schema:
        - name: column1 
          type: STRUCT
          fields:
            - name: column2
              type: string

        - name: test1
          type: RECORD
          mode: REPEATED

          fields:
            - name: test2
              type: string

Answer 2: (score: 0)

A YAML example for creating a view in BigQuery with Deployment Manager

Note: this YAML also shows how to create a partition (_PARTITIONTIME) on the table (hello_table)

# Example of the BigQuery (dataset and table) template usage.
# Replace `my_account@email.com` with your account email.

imports:
  - path: templates/bigquery/bigquery_dataset.py
    name: bigquery_dataset.py
  - path: templates/bigquery/bigquery_table.py
    name: bigquery_table.py

resources:
  - name: dataset_name
    type: bigquery_dataset.py
    properties:
      name: dataset_name
      location: US
      access:
        - role: OWNER
          userByEmail: my_account@email.com

  - name: hello
    type: bigquery_table.py
    properties:
      name: hello_table
      datasetId: $(ref.dataset_name.datasetId)
      timePartitioning:
        type: DAY

      schema:
      - name: partner_id
        type: STRING
  - name: view_step
    type: bigquery_table.py
    properties:
      name: hello_view
      datasetId: $(ref.dataset_name.datasetId)
      view:
        query: select partner_id from `project_name.dataset_name.hello_table`
        useLegacySql: False
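
If you would rather not depend on the `bigquery_table.py` template, roughly the same table and view can be expressed with the base `bigquery.v2.table` type from the question. The following Python template is only a sketch: it assumes the dataset already exists, that its ID arrives in a `datasetId` property (a name chosen here for illustration), and it uses `metadata.dependsOn` so the view is created after the table it queries.

```python
def generate_config(context):
    """Return a day-partitioned table and a view that selects from it."""
    project = context.env['project']
    # Assumption: the config passes the existing dataset's ID as a property.
    dataset_id = context.properties['datasetId']

    def table_ref(table_id):
        # Build the tableReference block required by bigquery.v2.table.
        return {'projectId': project, 'datasetId': dataset_id, 'tableId': table_id}

    table = {
        'name': 'hello',
        'type': 'bigquery.v2.table',
        'properties': {
            'datasetId': dataset_id,
            'tableReference': table_ref('hello_table'),
            # Ingestion-time partitioning by day (exposes _PARTITIONTIME).
            'timePartitioning': {'type': 'DAY'},
            'schema': {'fields': [{'name': 'partner_id', 'type': 'STRING'}]},
        },
    }

    view = {
        'name': 'view_step',
        'type': 'bigquery.v2.table',
        'properties': {
            'datasetId': dataset_id,
            'tableReference': table_ref('hello_view'),
            'view': {
                'query': 'select partner_id from `{}.{}.hello_table`'.format(
                    project, dataset_id),
                'useLegacySql': False,
            },
        },
        # Create the view only after the table resource exists.
        'metadata': {'dependsOn': ['hello']},
    }

    return {'resources': [table, view]}
```

Building the query string from `project` and `dataset_id` avoids hard-coding `project_name.dataset_name`, so the view keeps pointing at the right table if either value changes.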