Schema to load JSON data into Google BigQuery

Time: 2014-07-08 04:02:17

Tags: json google-bigquery

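I have a question about a project we are working on...

I am trying to load this JSON into Google BigQuery, but I cannot get the fields of the votes object out of the JSON input. I have tried both the "record" and "string" types in the schema.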

{
    "votes": {
        "funny": 10,
        "useful": 10,
        "cool": 10
    },
    "user_id": "OlMjqqzWZUv2-62CSqKq_A",
    "review_id": "LMy8UOKOeh0b9qrz-s1fQA",
    "stars": 4,
    "date": "2008-07-02",
    "text": "This is what this 4-star bar is all about.",
    "type": "review",
    "business_id": "81IjU5L-t-QQwsE38C63hQ"
}

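I also cannot populate the table from the categories and neighborhoods arrays in the JSON below. What should my schema be for these inputs? The documentation is not much help in this case, or maybe I have not looked in the right place.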

{
    "business_id": "Iu-oeVzv8ZgP18NIB0UMqg",
    "full_address": "3320 S Hill St\nSouth East LA\nLos Angeles, CA 90007",
    "schools": [
        "University of Southern California"
    ],
    "open": true,
    "categories": [
        "Medical Centers",
        "Health and Medical"
    ],
    "neighborhoods": [
        "South East LA"
    ]
}

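I can get the regular fields, but that's about it... Any help is appreciated!

1 answer:

Answer 0: (score: 7)

For business, it looks like you want schools to be a repeated field. Your schema should be: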

"schema": {
    "fields": [
        {
            "name": "business_id",
            "type": "string"
        },
        {
            "name": "full_address",
            "type": "string"
        },
        {
            "name": "schools",
            "type": "string",
            "mode": "repeated"
        },
        {
            "name": "open",
            "type": "boolean"
        }
    ]
}
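
The question also asks about categories and neighborhoods, which the schema above does not list. Assuming they follow the same pattern as schools (arrays of strings), a minimal sketch of the full field list might look like the following. The helper `check_row` is hypothetical: it only checks value shapes locally and is not part of any BigQuery API.

```python
import json

# Assumption: categories and neighborhoods are treated like schools,
# i.e. repeated strings. Field names come from the sample row in the question.
BUSINESS_FIELDS = [
    {"name": "business_id", "type": "string"},
    {"name": "full_address", "type": "string"},
    {"name": "schools", "type": "string", "mode": "repeated"},
    {"name": "open", "type": "boolean"},
    {"name": "categories", "type": "string", "mode": "repeated"},
    {"name": "neighborhoods", "type": "string", "mode": "repeated"},
]

# Only the two scalar types used by this schema are mapped here.
PY_TYPES = {"string": str, "boolean": bool}


def check_row(row, fields):
    """Hypothetical local check: list the keys whose value does not fit the schema."""
    known = {f["name"] for f in fields}
    # Keys that the schema does not declare at all.
    bad = [name for name in row if name not in known]
    for f in fields:
        if f["name"] not in row:
            continue
        value = row[f["name"]]
        expected = PY_TYPES[f["type"]]
        if f.get("mode") == "repeated":
            # A repeated field must be a JSON array of the declared type.
            ok = isinstance(value, list) and all(isinstance(v, expected) for v in value)
        else:
            ok = isinstance(value, expected)
        if not ok:
            bad.append(f["name"])
    return bad


row = {
    "business_id": "Iu-oeVzv8ZgP18NIB0UMqg",
    "full_address": "3320 S Hill St\nSouth East LA\nLos Angeles, CA 90007",
    "schools": ["University of Southern California"],
    "open": True,
    "categories": ["Medical Centers", "Health and Medical"],
    "neighborhoods": ["South East LA"],
}

print(json.dumps({"schema": {"fields": BUSINESS_FIELDS}}, indent=4))
print(check_row(row, BUSINESS_FIELDS))
```

An empty list from `check_row` means every key in the row is declared and has the expected shape under these assumptions.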

For votes, it looks like you want a record. Your schema should be:

"schema": {
    "fields": [
        {
            "name": "name",
            "type": "string"
        },
        {
            "name": "votes",
            "type": "record",
            "fields": [
                {
                    "name": "funny",
                    "type": "integer"
                },
                {
                    "name": "useful",
                    "type": "integer"
                },
                {
                    "name": "cool",
                    "type": "integer"
                }
            ]
        }
    ]
}
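
One thing worth checking alongside the schema: BigQuery's JSON loader expects newline-delimited JSON, one complete object per line, so a pretty-printed record like the one in the question has to be collapsed first. Here is a minimal sketch using only the standard library; the function name `to_ndjson` is made up for illustration.

```python
import json


def to_ndjson(records):
    """Serialize each record as one compact JSON object per line."""
    return "\n".join(json.dumps(r, separators=(",", ":")) for r in records) + "\n"


# Sample review copied from the question.
review = {
    "votes": {"funny": 10, "useful": 10, "cool": 10},
    "user_id": "OlMjqqzWZUv2-62CSqKq_A",
    "review_id": "LMy8UOKOeh0b9qrz-s1fQA",
    "stars": 4,
    "date": "2008-07-02",
    "text": "This is what this 4-star bar is all about.",
    "type": "review",
    "business_id": "81IjU5L-t-QQwsE38C63hQ",
}

line = to_ndjson([review])
print(line, end="")
```

The votes object stays nested in the output line; it is the record type in the schema that lets it be queried as votes.funny, votes.useful and votes.cool. With the bq command-line tool, such a file is typically loaded with `--source_format=NEWLINE_DELIMITED_JSON`.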
