How do I define a JSON schema for an array?

Asked: 2018-06-11 20:44:22

Tags: arrays json google-bigquery

I have this JSON file:

    {  
        "tcCensusTractBlockFacesStatefp":"01",
        "tcCensusTractBlockFacesCountyfp":"001",
        "tcCensusTractBlockFacesTractce":"020200",
        "tcCensusTractBlockFacesBlockce10":"2022",
        "tcCensusTractBlockFacesGeoid":"010010202002022",
        "cellIDs":[  
        9839958675010879488,
        9839958675082706944,
        9839958677655912448,
        9839958677556297728,
        9839958676975910912,
        9839958677063991296,
        9839958677105934336,
        9839958679922409472,
        9839958679922933760,
        9839958679975886848,
        9839958679979032576,
        9839958679461036032,
        9839958679450550272,
        9839958678956670976,
        9839958678926262272,
        9839958667678187520,
        9839958667562844160,
        9839958675010879488
        ]
    }

How do I define a schema so I can import it into BigQuery? Like this?

    bq mk --table $DATASET:$TABLE tl_2017_schema.json
    bq load --source_format=NEWLINE_DELIMITED_JSON $DATASET:$TABLE $WNAME

Here is what I tried:

tl_2017_schema.json

...
---
     {
        "name": "cellIDs",
        "type": "RECORD",
        "mode": "REPEATED",
        "fields": [
            {
                "name": "cellID",
                "type": "INT64",
                "mode": "NULLABLE"
            }
        ]
    }
...

But this part, which is responsible for the ARRAY, does not match the structure of the array in my JSON file:

    "cellIDs":[
        9839958675010879488, 9839958675082706944, 9839958677655912448,
        9839958677556297728, 9839958676975910912, 9839958677063991296,
        9839958677105934336, 9839958679922409472, 9839958679922933760,
        9839958679975886848, 9839958679979032576, 9839958679461036032,
        9839958679450550272, 9839958678956670976, 9839958678926262272,
        9839958667678187520, 9839958667562844160, 9839958675010879488
    ]

How should I do this?

1 answer:

Answer 0 (score: 1)

It may be easier to run a CREATE TABLE statement:

    CREATE TABLE dataset.tablename
    (
      ...
      cellIDs ARRAY<INT64>
    )
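
One thing that may be worth checking before loading: INT64 is a signed 64-bit type, so its maximum is 2^63 - 1 (9223372036854775807), and the sample cellIDs in the question look larger than that. The sketch below is only a rough check in Python, not part of any BigQuery tooling. The helper names `fits_int64`, `to_signed64`, and `to_unsigned64` are made up for illustration, and reinterpreting the values as signed is just one possible workaround (storing them as STRING or NUMERIC would be another), assuming the IDs are unsigned 64-bit values.

```python
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)

# Two of the cell IDs from the question's JSON file.
cell_ids = [9839958675010879488, 9839958667562844160]

def fits_int64(value):
    """Return True if the value is inside the signed 64-bit range."""
    return INT64_MIN <= value <= INT64_MAX

def to_signed64(value):
    """Reinterpret an unsigned 64-bit value as signed (two's complement)."""
    return value - 2**64 if value > INT64_MAX else value

def to_unsigned64(value):
    """Inverse of to_signed64: recover the original unsigned value."""
    return value + 2**64 if value < 0 else value

# Values that would not fit in an INT64 column as written.
overflowing = [v for v in cell_ids if not fits_int64(v)]
# The same values reinterpreted so that they do fit.
signed = [to_signed64(v) for v in cell_ids]

print(overflowing)
print(signed)
```

If the signed form is stored, `to_unsigned64` recovers the original IDs when reading them back.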

If you do want to specify the schema as JSON, the fix for your example is to make cellIDs a REPEATED field of type INTEGER:

    {
        "name": "cellIDs",
        "type": "INTEGER",
        "mode": "REPEATED"
    }
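
To see why this matches the file while the RECORD version does not: a REPEATED RECORD with a `cellID` subfield describes an array of objects such as `[{"cellID": 1}, {"cellID": 2}]`, whereas the file contains a flat array of integers. As a rough sketch (plain Python, standard library only), the snippet below derives a complete schema from one sample record and prints the record as a single line, since the NEWLINE_DELIMITED_JSON format expects one JSON object per line rather than a pretty-printed one. The `build_schema` helper is hypothetical, and treating every non-array field as a NULLABLE STRING is an assumption based on the sample values in the question.

```python
import json

# Sample record from the question, trimmed to two cell IDs for brevity.
record = {
    "tcCensusTractBlockFacesStatefp": "01",
    "tcCensusTractBlockFacesCountyfp": "001",
    "tcCensusTractBlockFacesTractce": "020200",
    "tcCensusTractBlockFacesBlockce10": "2022",
    "tcCensusTractBlockFacesGeoid": "010010202002022",
    "cellIDs": [9839958675010879488, 9839958675082706944],
}

def build_schema(rec):
    """Build a BigQuery-style JSON schema from one record (assumed representative)."""
    schema = []
    for name, value in rec.items():
        if isinstance(value, list):
            # A flat JSON array of integers maps to a REPEATED scalar, not a RECORD.
            schema.append({"name": name, "type": "INTEGER", "mode": "REPEATED"})
        else:
            # Assumption: all scalar fields in this file are strings.
            schema.append({"name": name, "type": "STRING", "mode": "NULLABLE"})
    return schema

schema = build_schema(record)

# Contents for a schema file such as tl_2017_schema.json.
schema_text = json.dumps(schema, indent=4)
# One record per line, as newline-delimited JSON expects.
ndjson_line = json.dumps(record, separators=(",", ":"))

print(schema_text)
print(ndjson_line)
```

The string fields are kept as STRING here because values like "01" and "020200" have leading zeros that an integer type would drop.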