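Parsing stacked JSON with missing commas and saving each object in a separate file

Time: 2018-12-03 05:37:06

Tags: python json python-3.x

I am reading a file (test.json) that contains JSON objects that are not separated by commas: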

{
   "ID": "349878",
   "Name": user1
   "object_name": [
        "Vessel",
        "Sherds"]
}
{
   "ID": "349879",
   "Name": user2
}
{
   "ID": "349880",
   "Name": user3
}

I would like to store each object in a separate file, with its ID as the file name.

Example: the file 349878.json should contain

{
   "ID": "349878",
   "Name": user1
   "object_name": [
        "Vessel",
        "Sherds"]
}

4 answers:

Answer 0: (score: 1)

Assuming your JSON data has been validated and looks like this:

[
    {
       "ID": "349878",
       "Name": "user1",
       "name": [
            "Vessel",
            "Sherds"]
    },
    {
       "ID": "349879",
       "Name": "user2"
    },
    {
       "ID": "349880",
       "Name": "user3"
    }
]

You can validate it with JSON Formatter and Validator.
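Validation can also be done locally. The following is a minimal sketch, not part of the original answer; `is_valid_json` is a hypothetical helper name, and it relies only on `json.loads()` raising `json.JSONDecodeError` for malformed input.

```python
import json


def is_valid_json(text):
    """Return (True, None) if text parses as JSON, else (False, a short reason)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno, colno and msg are attributes of JSONDecodeError
        return False, "line {}, column {}: {}".format(err.lineno, err.colno, err.msg)
    return True, None


# A quoted value parses; a bare value such as user1 does not.
ok, _ = is_valid_json('[{"ID": "349878", "Name": "user1"}]')
bad, reason = is_valid_json('{"ID": "349878", "Name": user1}')
```

Here `ok` is true, while `bad` is false and `reason` points at the unquoted value.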

You can then take each JSON object from the list returned by json.loads() and write it to its own file with json.dump():

from json import loads
from json import dump

with open("test.json") as json_file:
    data = loads(json_file.read())

    for obj in data:
        with open(obj["ID"] + ".json", mode="w") as out_file:
            dump(obj, out_file, indent=4)

This produces the following JSON files:

349878.json

{
    "ID": "349878",
    "Name": "user1",
    "name": [
        "Vessel",
        "Sherds"
    ]
}

349879.json

{
    "ID": "349879",
    "Name": "user2"
}

349880.json

{
    "ID": "349880",
    "Name": "user3"
}

Answer 1: (score: 0)

I'm not sure why your JSON is invalid, but since you listed "NOT comma-separated" as a requirement, I hope this sorts out your problem.

import re

regex = r"\{(.*?)\}"

test_str = ("{\n"
            '"ID": "349878",\n'
            '"Name": user1\n'
            '"object_name": [\n'
            '"Vessel",\n'
            '"Sherds"]\n'
            "}\n"
            "{\n"
            '"ID": "349879",\n'
            '"Name": user2\n'
            "}\n"
            "{\n\n"
            '"ID": "349880",\n'
            '"Name": user3\n'
            "}")

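# DOTALL lets "." span newlines, so each non-greedy match covers one {...} block
# (this assumes the objects contain no nested braces).
matches = re.finditer(regex, test_str, re.DOTALL)

for match in matches:
    # Find the ID by pattern instead of by fixed character offsets,
    # which also keeps the surrounding quotes out of the file name.
    id_match = re.search(r'"ID":\s*"([^"]+)"', match.group(1))
    if id_match is None:
        continue  # skip blocks that have no ID
    with open("{}.txt".format(id_match.group(1)), "w") as fout:
        fout.write(match.group(0))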
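The non-greedy pattern `\{(.*?)\}` stops at the first closing brace, so it would cut an object short if one ever contained a nested `{...}`. A rough sketch of a brace-counting alternative follows. It is not from the original answer: `split_top_level_objects` is a hypothetical name, the `"meta"` field in the sample is made up to show nesting, and the sketch assumes braces never occur inside string values.

```python
def split_top_level_objects(text):
    """Return the raw text of each top-level {...} block by tracking brace depth."""
    blocks = []
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == "{":
            if depth == 0:
                start = i  # a new top-level object begins here
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                # back at the top level: the object is complete
                blocks.append(text[start:i + 1])
    return blocks


# "meta" is a made-up nested field, used only to demonstrate nesting.
raw = ('{\n"ID": "349878",\n"Name": user1\n"meta": {"kind": "Vessel"}\n}\n'
       '{\n"ID": "349879",\n"Name": user2\n}')
blocks = split_top_level_objects(raw)
```

Because the text is only sliced and never parsed, this works on the question's input even though the individual objects are not valid JSON.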

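Answer 2: (score: 0)

You could use str.split() together with slicing to find the ID and create the files. If you do not strip the whitespace, you will need different indices.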

with open('test.json', 'r') as file:
    # Get text without whitespace or newlines
    text = file.read().replace(' ', '').replace('\n', '')

# Split by '{', discard first entry (will be empty)
objects = text.split('{')[1:]

for obj in objects:
    # Add the split delimiter back
    obj = '{' + obj
    # Get the ID by slicing between the "ID" and "Name" keys
    file_id = obj[obj.find('"ID"') + 6:
                  obj.find('"Name"') - 2]
    # Add the file extension
    file_name = file_id + '.json'

    # 'x' mode creates the file and raises FileExistsError if it already exists
    with open(file_name, 'x') as out_file:
        out_file.write(obj)
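Slicing by fixed offsets only works once the whitespace has been stripped. As a hedged alternative, the ID can be located with a regular expression that tolerates any spacing around the colon. `ID_PATTERN` and `extract_id` are hypothetical names introduced for this sketch.

```python
import re

# Matches "ID" : "<value>" with optional whitespace around the colon.
ID_PATTERN = re.compile(r'"ID"\s*:\s*"([^"]*)"')


def extract_id(block):
    """Return the ID string found in one object's raw text, or None if absent."""
    match = ID_PATTERN.search(block)
    return match.group(1) if match else None


spaced = extract_id('{\n   "ID": "349878",\n   "Name": user1\n}')
compact = extract_id('{"ID":"349879","Name":user2}')
missing = extract_id('{"Name": user3}')
```

With this, the original formatting of each object can be kept when it is written out, since nothing has to be removed before the ID is found.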

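Answer 3: (score: 0)

If your JSON is valid, make sure you are loading it correctly. As posted, your JSON does not validate, so before implementing any solution, make sure the contents of your JSON file are valid.

Assuming your file loads correctly, you can then process it as follows.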

var str='{"ID": "349878","Name": "user1","object_name":["Vessel","Sherds"]}{"ID": "349879","Name": "user2"}{"ID": "349880","Name": "user3"}'
var indices= [];
var secondIndices=[];
var newString='';
for (var i = 0; i < str.length; i++) {
    if (str[i] === "{") indices.push(i);
    if (str[i] === "}") secondIndices.push(i);
}
for (var i = 0; i < indices.length; i++) {
    newString += str.substring(indices[i], secondIndices[i] + 1) + ",";
}
newString="["+newString.substring(0,newString.lastIndexOf(","))+"]";
var JSONObj=JSON.parse(newString);
console.log(JSONObj);
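Since the question is tagged python, the same idea can be sketched in Python without inserting commas by hand. The standard library's `json.JSONDecoder.raw_decode()` parses one value and reports where it stopped, so it can be called repeatedly over back-to-back objects. As in the JavaScript above, this assumes each object is itself valid JSON (string values quoted, commas between members). `iter_stacked_json` and `write_objects` are hypothetical helper names.

```python
import json
import os


def iter_stacked_json(text):
    """Yield each top-level JSON value from a string of back-to-back documents."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it by hand
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def write_objects(objects, out_dir="."):
    """Write each object to <out_dir>/<ID>.json (assumes every object has an "ID")."""
    for obj in objects:
        path = os.path.join(out_dir, obj["ID"] + ".json")
        with open(path, "w") as out_file:
            json.dump(obj, out_file, indent=4)


sample = ('{"ID": "349878", "Name": "user1", "object_name": ["Vessel", "Sherds"]}\n'
          '{"ID": "349879", "Name": "user2"}\n'
          '{"ID": "349880", "Name": "user3"}')
objects = list(iter_stacked_json(sample))
```

Calling `write_objects(objects)` would then create 349878.json, 349879.json and 349880.json in the current directory.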