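My goal is to convert a JSON file into a format that can be uploaded from Cloud Storage into BigQuery using Python (as described here).
I tried converting it with the newlineJSON package, but I get the following error.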
JSONDecodeError: Expecting value or ']': line 2 column 1 (char 5)
Does anyone have a solution for this?
Here is the sample JSON:
[{
"key01": "value01",
"key02": "value02",
...
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
...
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
...
"keyN": "valueN"
}
]
Here is the existing Python script:
import newlinejson as nlj  # assumed import; the original snippet omits it

with nlj.open(url_samplejson, json_lib="simplejson") as src_:
    with nlj.open(url_convertedjson, "w") as dst_:
        for line_ in src_:
            dst_.write(line_)
Answer 0 (score: 7)
The answer that uses jq is really useful, but if you still want to do this in Python (as the question suggests), you can use the built-in json module.
import json
from io import StringIO
in_json = StringIO("""[{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
}
]""")
result = [json.dumps(record) for record in json.load(in_json)] # the only significant line to convert the JSON to the desired format
print('\n'.join(result))
{"key01": "value01", "key02": "value02", "keyN": "valueN"}
{"key01": "value01", "key02": "value02", "keyN": "valueN"}
{"key01": "value01", "key02": "value02", "keyN": "valueN"}
*I use StringIO and print here only to make the example easier to test locally.
Alternatively, you can use the Python jq binding and combine it with the other answer.
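For real files rather than an in-memory string, the same json-module approach might look like the sketch below. The function name `convert_json_array_to_ndjson` and both path arguments are hypothetical, and the sketch assumes the whole array fits in memory. It also illustrates why a line-by-line reader chokes on the original file: a newline-delimited reader expects one complete JSON value per line, which a pretty-printed array does not provide.

```python
import json


def convert_json_array_to_ndjson(src_path, dst_path):
    """Read a JSON array from src_path and write one record per line to dst_path.

    Hypothetical helper for illustration; returns the number of records written.
    """
    with open(src_path, encoding="utf-8") as src:
        records = json.load(src)  # parses the whole array into memory

    with open(dst_path, "w", encoding="utf-8") as dst:
        for record in records:
            # json.dumps (without indent) escapes newlines inside strings,
            # so every record occupies exactly one line.
            dst.write(json.dumps(record) + "\n")

    return len(records)
```

Calling `convert_json_array_to_ndjson("a.json", "a.ndjson")` (placeholder file names) would then produce a file with one JSON object per line.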
Answer 1 (score: 4)
If you are open to using a tool other than Python, you can use jq:
$ cat a.json
[{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
}
]
$ cat a.json | jq -c '.[]'
{"key01":"value01","key02":"value02","keyN":"valueN"}
{"key01":"value01","key02":"value02","keyN":"valueN"}
{"key01":"value01","key02":"value02","keyN":"valueN"}
The filter '.[]' iterates over the array, and -c prints each JSON object on a single line.
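If you want the same compact, no-spaces layout from Python's json module, one option is the `separators` argument of `json.dumps`. This is only a sketch: the helper name `to_compact_lines` is made up for illustration, and the input is assumed to be an already parsed list of records.

```python
import json


def to_compact_lines(records):
    """Serialize each record on its own line without spaces after ',' or ':'."""
    # separators=(",", ":") removes the spaces json.dumps inserts by default
    return [json.dumps(record, separators=(",", ":")) for record in records]


records = json.loads('[{"key01": "value01", "keyN": "valueN"}]')
print("\n".join(to_compact_lines(records)))
# prints: {"key01":"value01","keyN":"valueN"}
```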