Question

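I want to parse a log from pilight. Instead of a single JSON string, it contains multiple JSON entries, most of them datetime entries. So when I use json.load(f) I get the error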

json.decoder.JSONDecodeError: Extra data: line 12 column 1 (char 171)

Below is a sample file. How would I parse a file like this?

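What I would like to do is suppress the complete JSON entries that have "protocol": "datetime", but keep the first of them that follows a complete non-datetime JSON entry.

That way I would get a new, trimmed-down log file containing each "message" entry plus one "datetime" part. In the example, that file would contain only the first 2 JSON entries.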

Answer 1

See How to extract multiple JSON objects from one file? The simplest approach is to add [ and ] at the beginning and end of the file respectively, and to add a , between each pair of JSON objects.

Once it is loaded and you have a list of "json" objects, you can filter them like this:

filtered_jsons = [single_json for single_json in all_jsons if single_json.get('protocol') != "datetime"]
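Note that the filter above drops every datetime entry, whereas the question asks to keep the first datetime entry after each non-datetime one. A minimal sketch of that variant follows; the helper name keep_first_datetime and the shortened sample entries are made up for illustration, and it assumes that datetime entries at the very start of the log (with no preceding message) should be dropped as well.

```python
def keep_first_datetime(entries):
    """Drop datetime entries, except the first one after each non-datetime entry."""
    kept = []
    # Assumption: datetime entries before the first real message are dropped too.
    previous_was_datetime = True
    for entry in entries:
        is_datetime = entry.get("protocol") == "datetime"
        if not is_datetime or not previous_was_datetime:
            kept.append(entry)
        previous_was_datetime = is_datetime
    return kept


# Shortened stand-ins for the three entries in the sample log.
all_jsons = [
    {"protocol": "arctech_screen_old", "message": {"state": "down"}},
    {"protocol": "datetime", "message": {"second": 30}},
    {"protocol": "datetime", "message": {"second": 31}},
]
filtered_jsons = keep_first_datetime(all_jsons)
```

With the sample log this keeps only the first two entries, which matches what the question describes.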

Answer 2

The file content is not valid JSON. You need to separate the objects with , and then put everything into a list [{...}, {...}, ... ].

Here is some sample code:

# assuming we have loaded your file to a str variable 

file_content = '''
{
    "message": {
        "id": 31,
        "unit": 15,
        "state": "down"
    },
    "origin": "receiver",
    "protocol": "arctech_screen_old",
    "uuid": "0000-b8-27-eb-e85eff",
    "repeats": 1
}
{
    "origin": "receiver",
    "protocol": "datetime",
    "message": {
        "longitude": 9.000000,
        "latitude": 44.633000,
        "year": 2020,
        "month": 6,
        "day": 5,
        "weekday": 6,
        "hour": 12,
        "minute": 41,
        "second": 30,
        "dst": 1
    },
    "uuid": "0000-b8-27-eb-e85eff"
}
{
    "origin": "receiver",
    "protocol": "datetime",
    "message": {
        "longitude": 9.000000,
        "latitude": 44.633000,
        "year": 2020,
        "month": 6,
        "day": 5,
        "weekday": 6,
        "hour": 12,
        "minute": 41,
        "second": 31,
        "dst": 1
    },
    "uuid": "0000-b8-27-eb-e85eff"
}
'''

Converting it to valid JSON looks like this:

import json
import re

# Insert a comma between adjacent objects, then wrap everything in [ ] to form a JSON array
proper_json_string = '[\n' + re.sub(r'}\n{', r'},\n{', file_content) + '\n]'
data = json.loads(proper_json_string)
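The regex above relies on every object ending with } directly followed by a newline and {, so different whitespace between entries would leave the commas out. As an alternative that is not part of the original answer, here is a sketch that decodes the objects one at a time with json.JSONDecoder.raw_decode from the standard library; the helper name iter_json_objects and the short sample string are assumptions for illustration.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value found back to back in text."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode returns the decoded value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


sample = '{"protocol": "a"}\n{"protocol": "datetime"}\n\n{"protocol": "datetime"}\n'
data = list(iter_json_objects(sample))
```

Because no text is rewritten before parsing, this should also cope with blank lines or varying indentation between entries.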

Parse and grep a log file with JSON-format entries
