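I need to convert a single-line JSON file to newline-delimited format in order to load the data into BigQuery. The files are large (from 2 to 50 GB), so the command
cat C:\file.json | jq -c '.[]' > C:\file_ND.json
fails because it runs out of memory.
I also tried:
jq -c '.[]' C:\file.json > C:\file_ND.json
but it produces an empty output file.
Any ideas on how to do this with --stream in jq? I can't work out where to specify the input and output files or how to describe the format.
Sample file structure: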
[{"action":"U","cd":"2018-06-04T16:54:53.000+02:00","md":"2019-06-01T04:44:22.000+02:00","o":{"_id":3298153,"_type":"acc","parent":{"_id":3298153,"_type":"b","pb":0,"dp":0,"lb":0},"s":"X","sChangeDate":"2018-06-04T16:54:53.000+02:00","owner":8008711577,"aTypeID":1302,"pEx":false,"cAcc":false,"trnSID":3341650,"eo":false}},{"action":"U","cd":"2018-06-04T16:57:47.000+02:00","md":"2019-06-13T14:48:45.000+02:00","o":{"_id":3298372,"_type":"acc","parent":{"_id":3298372,"_type":"ab","pb":0,"dp":0,"lb":0},"s":"X","sChangeDate":"2018-06-04T16:57:47.000+02:00","owner":8008711796,"aTypeID":1302,"pEx":false,"cAcc":false,"trnSID":3342088,"eo":false}},{"action":"U","cd":"2018-07-13T00:53:30.000+02:00","md":"2019-06-11T18:49:03.000+02:00","o":{"_id":3667579,"_type":"acc","parent":{"_id":3667579,"_type":"ab","pb":0,"dp":0,"lb":0},"s":"X","sChangeDate":"2018-07-13T00:53:30.000+02:00","owner":8009080658,"aTypeID":1302,"pEx":false,"cAcc":false,"trnSID":4077943,"eo":false}},{"action":"U","cd":"2018-07-13T12:55:55.000+02:00","md":"2019-06-17T05:42:38.000+02:00","o":{"_id":3672013,"_type":"acc","parent":{"_id":3672013,"_type":"ab","pb":0,"dp":0,"lb":0},"s":"X","sChangeDate":"2018-07-13T12:55:55.000+02:00","owner":8009085060,"aTypeID":1302,"pEx":false,"cAcc":false,"trnSID":4086704,"eo":false}},
... ,
... ,
... ,
]
What should the jq --stream command look like?
jq -c --stream '.[] ????????????'
Edit
After upgrading jq from 1.5 to 1.6, the problem seems to be solved. The command I used:
jq -nc --stream 'fromstream(1|truncate_stream(inputs))' C:\input.json > C:\output.json
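To see what that command does before pointing it at a multi-gigabyte file, it can be tried on a tiny array first. The following is only a sketch: it assumes jq 1.6 is on the PATH and a POSIX-style shell (paths and quoting may need adapting on Windows), and sample.json and sample_ND.json are hypothetical file names invented for this illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical two-element array standing in for the real single-line input.
printf '%s' '[{"action":"U","o":{"_id":1}},{"action":"U","o":{"_id":2}}]' > sample.json

# --stream makes jq emit [path, leaf] events instead of loading the whole array.
# 1|truncate_stream(...) strips the leading array index from every event path,
# and fromstream(...) rebuilds each top-level element once its closing event arrives.
# -n keeps jq from reading the first event itself, so `inputs` sees all of them.
# -c prints each rebuilt element on a single line.
jq -nc --stream 'fromstream(1|truncate_stream(inputs))' sample.json > sample_ND.json

cat sample_ND.json
```

With this sample, the output file should contain one compact object per line, {"action":"U","o":{"_id":1}} followed by {"action":"U","o":{"_id":2}}. The input file is passed as an ordinary argument after the filter, and the output is taken from stdout via redirection, exactly as in the non-streaming commands.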