Question

I'm trying to process logs from Symphony with Pandas, but I'm having trouble with malformed JSON that I can't parse. An example of the log:

'{id:46025,
work_assignment:43313=>43313,
declaration:<p><strong>Bijkomende interventie.</strong></p>\r\n\r\n<p>H&nbsp;</p>\r\n\r\n<p><strong><em>Vaststellingen.</em></strong></p>\r\n\r\n<p><strong><em>CV. </em></strong>De.</p>=><p><strong>Bijkomende interventie.</strong></p>\r\n\r\n<p>He&nbsp;</p>\r\n\r\n<p><strong><em>Vaststellingen.</em></strong></p>\r\n\r\n<p><strong><em>CV. </em></strong>De.</p>,conclusions:<p>H&nbsp;</p>=><p>H&nbsp;</p>}'

What is the best way to handle this? For each part (id / work_assignment / declaration / etc.), I want to retrieve the old and new values (separated by "=>").

Answer 1

Use the following code:

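def clean(my_log):
    # Strip whitespace and the wrapping quotes, then remove the unneeded { }.
    # str.replace returns a new string, so the result must be assigned back.
    my_log = my_log.strip().strip("'").replace("{", "").replace("}", "")
    my_items = my_log.split(",")              # Split at the comma to get the pairs
    my_dict = {}

    for i in my_items:
        key, value = i.split(":", 1)          # Split at the first colon only, so colons inside the value survive
        my_dict[key.strip()] = value          # Add to the dictionary (key stripped of stray newlines/spaces)
    return my_dict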

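The function returns a Python dictionary, which you can convert to JSON with a serializer if needed, or use directly. Note that it splits on every comma, so it only works as long as the values themselves contain no commas.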
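If a value may contain commas, or if you also want the old and new values separated, something along these lines might be more robust. This is only a sketch under a few assumptions that are not stated in the question: keys are plain identifiers (letters, digits, underscores) that appear right after the opening brace or right after a comma, and the first "=>" in a value is the old/new separator. The names `KEY_RE` and `parse_changes` are made up for illustration.

```python
import re

# Assumption: a key is a run of word characters that starts the record
# or follows a comma (optionally with whitespace), and ends with ":".
KEY_RE = re.compile(r"(?:^|,)\s*(\w+):")


def parse_changes(raw_log):
    """Return {field: (old, new)}; new is None when the value has no "=>"."""
    # Drop surrounding whitespace, the wrapping quotes, and the outer braces.
    body = raw_log.strip().strip("'").strip()
    if body.startswith("{") and body.endswith("}"):
        body = body[1:-1]

    matches = list(KEY_RE.finditer(body))
    changes = {}
    for current, following in zip(matches, matches[1:] + [None]):
        # A value runs from the end of its key to the start of the next key.
        end = following.start() if following is not None else len(body)
        value = body[current.end():end]
        # Assumption: the first "=>" separates the old value from the new one.
        old, sep, new = value.partition("=>")
        changes[current.group(1)] = (old, new if sep else None)
    return changes


sample = (
    "'{id:46025,\n"
    "work_assignment:43313=>43313,\n"
    "declaration:<p>H&nbsp;</p>=><p>He&nbsp;</p>,"
    "conclusions:<p>H&nbsp;</p>=><p>H&nbsp;</p>}'"
)
for field, (old, new) in parse_changes(sample).items():
    print(field, "|", old, "|", new)
```

Because each field maps to an (old, new) tuple, the result should be straightforward to turn into rows for a DataFrame. The heuristic will misfire if the HTML text itself contains something like ", word:", so it is worth checking against real log lines.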

Hope I helped :D
