Question

I have been trying to parse data that is supposedly in JSON format, but the objects are not separated by commas. Here is a sample of my data:

{
  "areaId": "Tracking001",
  "areaName": "Learning Theater Indoor",
  "color": "#99FFFF"
}
{
  "areaId": "Tracking001",
  "areaName": "Learning Theater Indoor",
  "color": "#33CC00"
}

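There are thousands of them, so separating them by hand is not feasible. So here is my question: in order to parse this, do I have to separate the objects with commas, wrap them under one overall key, and make everything else its value? I am a beginner at data analysis, especially with JSON-formatted data, so any tips would be much appreciated.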

Answer 1

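The raw_decode(s) method of json.JSONDecoder sounds like what you need. To quote its docstring:

raw_decode(s): Decode a JSON document from s (a str beginning with a JSON document) and return a 2-tuple of the Python representation and the index in s where the document ended. This can be used to decode a JSON document from a string that may have extraneous data at the end.

Example usage: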

import json

s = """{
  "areaId": "Tracking001",
  "areaName": "Learning Theater Indoor",
  "color": "#99FFFF"
}
{
  "areaId": "Tracking001",
  "areaName": "Learning Theater Indoor",
  "color": "#33CC00"
}"""
decoder = json.JSONDecoder()
v0, i = decoder.raw_decode(s)
v1, _ = decoder.raw_decode(s[i+1:]) # i+1 needed to skip line break

Now v0 and v1 hold the parsed JSON values.
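Note that the i+1 above assumes exactly one separator character between the two objects. As a minimal sketch (the sample string and the variable name end are made up for illustration), raw_decode also accepts a start index as its second argument, so you can skip any run of whitespace and decode in place without slicing:

```python
import json

# Two objects separated by an arbitrary mix of whitespace (illustrative data).
s = '{"color": "#99FFFF"}  \r\n\n\t{"color": "#33CC00"}'

decoder = json.JSONDecoder()
v0, end = decoder.raw_decode(s)

# raw_decode does not skip leading whitespace itself, so advance manually.
while end < len(s) and s[end].isspace():
    end += 1

# The second argument is the index at which decoding starts.
v1, _ = decoder.raw_decode(s, end)
print(v0, v1)
```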

If you have thousands of values, you will probably want to use a loop:

import json

with open("some_file.txt", "r") as f:
    # raw_decode does not skip leading whitespace, so strip it up front
    content = f.read().strip()
parsed_values = []
decoder = json.JSONDecoder()
while content:
    value, new_start = decoder.raw_decode(content)
    content = content[new_start:].strip()
    # You can handle the value directly in this loop:
    print("Parsed:", value)
    # Or you can store it in a container and use it later:
    parsed_values.append(value)

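On my machine, this code takes about 0.03 seconds for 1000+ JSON values. However, it becomes inefficient for larger files, because it always reads the complete file into memory and copies the remaining content on every slice.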
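For larger files, one possible approach is to read the file in chunks and decode values as soon as they are complete. The following is only a sketch under some assumptions: iter_json_values and chunk_size are hypothetical names, and every top-level value is assumed to be an object or array (a bare number split across two chunks could be decoded prematurely). Malformed data is only reported once the end of the input is reached.

```python
import io
import json


def iter_json_values(stream, chunk_size=65536):
    """Yield JSON values from a text stream of whitespace-separated documents.

    Assumes top-level values are objects or arrays.
    """
    decoder = json.JSONDecoder()
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        at_eof = not chunk
        buffer += chunk
        pos = 0
        while True:
            # raw_decode does not skip leading whitespace, so do it here.
            while pos < len(buffer) and buffer[pos].isspace():
                pos += 1
            if pos == len(buffer):
                break
            try:
                value, pos = decoder.raw_decode(buffer, pos)
            except json.JSONDecodeError:
                if at_eof:
                    raise  # nothing left to read, so the data is malformed
                break  # value is probably incomplete; read another chunk
            yield value
        # Keep only the part of the buffer that has not been decoded yet.
        buffer = buffer[pos:]
        if at_eof:
            return


# Demo with an in-memory stream and a tiny chunk size to force partial reads.
sample = io.StringIO('{"color": "#99FFFF"}\n{"color": "#33CC00"}\n')
for value in iter_json_values(sample, chunk_size=7):
    print("Parsed:", value)
```

With a real file you would pass the open file object instead, for example iter_json_values(f) inside a with open(...) block.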
