I am trying to use parse_table_schema_from_json
from apache_beam.io.gcp.bigquery import parse_table_schema_from_json
to get the table schema,
as described here.
Here is my code:
import json

def getSchema(pathToJSON):
    with open(pathToJSON) as data_file:
        # Load the file, then re-serialize it to the JSON string the parser expects.
        schema_data = json.dumps(json.load(data_file))
    table_schema = parse_table_schema_from_json(schema_data)
    # print(table_schema)
    return table_schema
This is the error I get:
Traceback (most recent call last):
  File "test_get_schema.py", line 16, in <module>
    getSchema("vauto_table_schema.json")
  File "test_get_schema.py", line 13, in getSchema
    table_schema = parse_table_schema_from_json(schema_data)
  File "/home/usr/.local/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery.py", line 269, in parse_table_schema_from_json
    fields = [_parse_schema_field(f) for f in json_schema['fields']]
TypeError: list indices must be integers, not str
My JSON file looks like this:
[
  {
    "name": "StockNumber",
    "type": "INTEGER",
    "mode": "NULLABLE"
  },
  {
    "name": "Product",
    "type": "STRING",
    "mode": "NULLABLE"
  }
]
What am I missing?
Answer 0 (score: 1)
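You need a dictionary, not a list. As the traceback shows, parse_table_schema_from_json reads json_schema['fields'], so the top level of the JSON must be an object with a "fields" key. Your file has a bare list at the top level, and indexing a list with the string 'fields' raises the TypeError. Modify your function and schema file as shown below, then try again.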
def getSchema(pathToJSON):
    # Read the schema from the path passed in; the file is closed on exit.
    with open(pathToJSON) as data_file:
        schema_data = json.dumps(json.load(data_file))
    table_schema = parse_table_schema_from_json(schema_data)
    # print(table_schema)
    return table_schema
Your schema file should be:
{
  "fields": [
    {
      "type": "INTEGER",
      "name": "StockNumber",
      "mode": "NULLABLE"
    },
    {
      "type": "STRING",
      "name": "Product",
      "mode": "NULLABLE"
    }
  ]
}
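
If you would rather keep the schema file as a bare list, one option is to wrap it in code before parsing. The following is only a sketch: normalize_schema_json is a hypothetical helper name, not part of apache_beam, and it assumes the only requirement is the top-level "fields" key seen in the traceback.

```python
import json


def normalize_schema_json(raw_json):
    """Return a JSON string whose top level is an object with a "fields" key.

    Hypothetical helper (not part of apache_beam). It accepts either a bare
    list of field definitions or an object that already has "fields".
    """
    parsed = json.loads(raw_json)
    if isinstance(parsed, list):
        # Bare list of fields, as in the question's file: wrap it.
        parsed = {"fields": parsed}
    elif not (isinstance(parsed, dict) and "fields" in parsed):
        raise ValueError('expected a list of fields or an object with a "fields" key')
    return json.dumps(parsed)


# Example input mirroring the first field of the question's schema file.
bare_list = '[{"name": "StockNumber", "type": "INTEGER", "mode": "NULLABLE"}]'
wrapped = json.loads(normalize_schema_json(bare_list))
print(wrapped["fields"][0]["name"])
```

The returned string could then be passed to parse_table_schema_from_json in place of schema_data, so both file layouts would work with the same getSchema function.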