(Closed) JSON to CSV conversion for a large database

Time: 2016-01-30 05:14:33

Tags: json csv

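I have a .txt file containing more than a million JSON entities, generated by a Python program, and the entities do not all have the same keys. Here is a small sample: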

{
    "category": "Athlete", 
    "website": "example.com", 
    "talking_about_count": 560, 
    "description": "xxx", 
    "id": "123"
}
{
    "category": "Community", 
    "talking_about_count": 0, 
    "name": "The Second Civil War",
    "likes": 26, 
    "id": "234", 
    "is_published": true
}

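Although each JSON entity has different attributes, they share some attributes in common. The resulting .csv file should contain the columns category, website, talking_about_count, description, id, name, likes and is_published, like this: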

"category","website","talking_about_count","name","likes","description","id","is_published"
"Athlete","example.com","560","","","xxx","123",""
"Community","","0","The Second Civil War","26","","234","True"

https://json-csv.com/ does this nicely, but it cannot handle datasets with more than 1000 entities.

I want to create a CSV from a .txt file containing a million JSON entities, and I would like to know whether there is a better way to approach this.

1 answer:

Answer 0: (score: 1)

Here is a solution using jq.

If the file filter.jq contains

  (reduce (.[]|keys_unsorted[]) as $k ({};.[$k]="")) as $o   # object with all keys
| ($o  | keys_unsorted), (.[] | $o * . | [.[]])              # generate header and data
| @csv                                                       # convert to csv

and data.json contains the sample data, then the command

jq -M -s -r -f filter.jq data.json

will produce the output

"category","website","talking_about_count","description","id","name","likes","is_published"
"Athlete","example.com",560,"xxx","123","","",""
"Community","",0,"","234","The Second Civil War",26,true
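
Note that the -s (slurp) option reads the whole input into a single array, so memory use grows with the size of the file. If that becomes a problem with a million entities, the same idea (collect every key first, then emit one row per entity) can be done in two streaming passes. Below is a minimal Python sketch, not a tested production tool. The names iter_json_objects and json_stream_to_csv are made up for illustration, and it assumes the file is UTF-8 and consists of flat JSON objects written one after another, as in the sample.

```python
import csv
import json


def iter_json_objects(stream, chunk_size=65536):
    """Yield each top-level JSON value from a stream of concatenated documents."""
    decoder = json.JSONDecoder()
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        buffer += chunk
        while True:
            buffer = buffer.lstrip()
            if not buffer:
                break
            try:
                obj, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                if not chunk:
                    raise  # end of file reached with malformed or truncated data
                break  # the value is incomplete so far; read another chunk
            yield obj
            buffer = buffer[end:]
        if not chunk:
            break


def json_stream_to_csv(src_path, dst_path):
    """Convert concatenated JSON objects to CSV in two passes over the file."""
    # Pass 1: collect the union of keys, in the order they are first seen.
    columns = {}
    with open(src_path, encoding="utf-8") as src:
        for obj in iter_json_objects(src):
            for key in obj:
                columns.setdefault(key, None)

    # Pass 2: write one row per object; keys an object lacks become "".
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=list(columns),
                                restval="", quoting=csv.QUOTE_ALL)
        writer.writeheader()
        writer.writerows(iter_json_objects(src))
    return list(columns)
```

Calling json_stream_to_csv("data.json", "out.csv") on the sample data should give the same column order as the jq output above. Because csv.QUOTE_ALL is used, every field is quoted and the boolean is written as "True", which matches the format shown in the question rather than jq's unquoted numbers and true. Nested objects or arrays would be written using their Python string form, so they would need extra handling.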