How do I get a table schema from a JSON file with parse_table_schema_from_json?

Time: 2018-12-14 18:09:04

Tags: python json google-bigquery

I am trying to get a table schema using parse_table_schema_from_json, imported with: from apache_beam.io.gcp.bigquery import parse_table_schema_from_json

Here is my code:

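import json
from apache_beam.io.gcp.bigquery import parse_table_schema_from_json

def getSchema(pathToJSON):
    # Load the schema file and re-serialize it to a JSON string
    with open(pathToJSON) as data_file:
        schema_data = json.dumps(json.load(data_file))
    table_schema = parse_table_schema_from_json(schema_data)
    # print(table_schema)
    return table_schema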

This is the error I get:

Traceback (most recent call last):
  File "test_get_schema.py", line 16, in <module>
    getSchema("vauto_table_schema.json")
  File "test_get_schema.py", line 13, in getSchema
    table_schema = parse_table_schema_from_json(schema_data)
  File "/home/usr/.local/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery.py", line 269, in parse_table_schema_from_json
    fields = [_parse_schema_field(f) for f in json_schema['fields']]
TypeError: list indices must be integers, not str

My JSON file looks like this:

[
  {
    "name": "StockNumber",
    "type": "INTEGER",
    "mode": "NULLABLE"
  },
  {
    "name": "Product",
    "type": "STRING",
    "mode": "NULLABLE"
  }
]

What am I missing?

1 answer:

Answer 0 (score: 1)

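You need a dictionary, not a list. As the traceback shows, parse_table_schema_from_json looks up json_schema['fields'], so the top level of the JSON must be an object with a "fields" key; indexing a list with a string raises the TypeError you are seeing. Modify your function and schema file as shown below, then try again.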

def getSchema(pathToJSON):
    # Read the file passed in (not a hard-coded name) and close it afterwards
    with open(pathToJSON) as data_file:
        schema_data = json.dumps(json.load(data_file))
    table_schema = parse_table_schema_from_json(schema_data)
    # print(table_schema)
    return table_schema

Your schema file should be:

{
  "fields": [
    {
      "type": "INTEGER",
      "name": "StockNumber",
      "mode": "NULLABLE"
    },
    {
      "type": "STRING",
      "name": "Product",
      "mode": "NULLABLE"
    }
  ]
}
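
If you would rather not edit the schema file by hand, one option is to wrap the bare list in code before handing it to the parser. The following is only a rough sketch: normalize_schema_json and load_schema_json are hypothetical helper names, not part of apache_beam, and it assumes the file holds either a bare list of field definitions or an object that already has a "fields" key.

```python
import json


def normalize_schema_json(raw_text):
    """Return a JSON string shaped like {"fields": [...]}.

    Hypothetical helper (not part of apache_beam). Accepts either a bare
    list of field definitions or an object that already has a "fields" key.
    """
    data = json.loads(raw_text)
    if isinstance(data, list):
        # Bare list: wrap it so a later ['fields'] lookup succeeds.
        data = {"fields": data}
    elif not (isinstance(data, dict) and "fields" in data):
        raise ValueError(
            'expected a list of fields or an object with a "fields" key'
        )
    return json.dumps(data)


def load_schema_json(path_to_json):
    """Read a schema file and return the normalized JSON string."""
    with open(path_to_json) as schema_file:
        return normalize_schema_json(schema_file.read())
```

The string returned by load_schema_json("vauto_table_schema.json") should then be usable as the argument to parse_table_schema_from_json, since its top level is an object with a "fields" key, which is what the traceback shows the parser indexing into.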