Question

As noted in this StackOverflow question, JSON does not allow trailing commas. For example, this

{
    "key1": "value1",
    "key2": "value2"
}

is fine, but this

{
    "key1": "value1",
    "key2": "value2",
}

is invalid syntax.

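For the reasons mentioned in this other StackOverflow question, trailing commas are legal (and perhaps encouraged?) in Python code. I am working with both Python and JSON, so I would like to be consistent across both types of files. Is there a way to have json.loads ignore trailing commas?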
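
To make the behaviour concrete, here is a minimal check using only the standard library. The helper name parses is hypothetical and exists only for this illustration; it reports whether json.loads accepts a given string.

```python
import json

valid = '{"key1": "value1", "key2": "value2"}'
trailing = '{"key1": "value1", "key2": "value2",}'


def parses(text):
    """Return True if json.loads accepts the text, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses(valid))     # the strict form is accepted
print(parses(trailing))  # the trailing comma is rejected
```

The standard parser offers no switch to relax this, which is why the answers below either preprocess the text or use a different parser.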

Answer 1

Strip the commas before passing the value in.

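import re

def clean_json(string):
    # Drop a comma that is followed only by optional whitespace and a closing } or ].
    # Caveat: this also rewrites matching text inside string values.
    string = re.sub(r",\s*}", "}", string)
    string = re.sub(r",\s*\]", "]", string)

    return string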
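
A plain regex cannot tell a comma inside a string value from a structural one. As a rough sketch of one way around that (not part of the answer above), the hypothetical function strip_trailing_commas below scans the text character by character, tracks whether it is inside a double-quoted string, and drops a comma only when it directly precedes a closing brace or bracket outside any string.

```python
import json


def strip_trailing_commas(text):
    """Remove commas that directly precede } or ], leaving string contents alone."""
    out = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False      # this character was escaped; nothing special
            elif ch == "\\":
                escaped = True       # next character is escaped
            elif ch == '"':
                in_string = False    # closing quote
            continue
        if ch == '"':
            in_string = True
        elif ch in "}]":
            # Look back past whitespace already emitted; drop a comma if present.
            i = len(out) - 1
            while i >= 0 and out[i] in " \t\r\n":
                i -= 1
            if i >= 0 and out[i] == ",":
                del out[i]
        out.append(ch)
    return "".join(out)


data = json.loads(strip_trailing_commas('{"a": "x,}", "b": [1, 2,],}'))
print(data)
```

The string value "x,}" survives untouched because the scanner never inspects commas while it is inside quotes.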

Answer 2

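You can wrap Python's json parser with jsoncomment.

JSON Comment allows parsing JSON files or strings with:

Single- and multi-line comments
Multi-line data strings
Trailing commas in objects and arrays, after the last item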

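Usage example:

import json
from jsoncomment import JsonComment

# Wrap the standard json module; the wrapper preprocesses the text
# and then delegates to it.
parser = JsonComment(json)

with open(filename) as data_file:  # filename: path to your JSON file
    data = parser.load(data_file)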

Answer 3

In Python, dicts and lists may contain trailing commas, so we should be able to use ast.literal_eval:

import ast, json

text = '{"key1": "value1", "key2": "value2",}'

python_obj = ast.literal_eval(text)
# python_obj is {'key1': 'value1', 'key2': 'value2'}

json_str = json.dumps(python_obj)
# json_str is '{"key1": "value1", "key2": "value2"}'

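However, JSON is not exactly Python, so there are some edge cases. For example, values such as null, true, and false do not exist in Python. We can replace them with valid Python equivalents before running the eval: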

import ast, json

def clean_json(text):
    # Map the JSON literals onto their Python equivalents, then evaluate.
    text = text.replace('null', 'None').replace('true', 'True').replace('false', 'False')
    return json.dumps(ast.literal_eval(text))

Unfortunately, this corrupts any string that contains the words null, true, or false.

{"sentence": "show your true colors"}

becomes

{"sentence": "show your True colors"}
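
One possible way to avoid that corruption, offered as a sketch rather than a complete solution, is to match whole string literals and bare keywords with a single regex and rewrite only the keywords. The name clean_json_keep_strings and the helper tables are hypothetical. Note the remaining assumption: JSON and Python string escapes are not identical, so unusual escape sequences may still be read differently by ast.literal_eval.

```python
import ast
import json
import re

_LITERALS = {"null": "None", "true": "True", "false": "False"}

# Either a complete double-quoted string, or one of the bare JSON keywords.
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|\b(?:null|true|false)\b')


def clean_json_keep_strings(text):
    def swap(match):
        token = match.group(0)
        # Quoted strings are not keys of _LITERALS, so they pass through unchanged.
        return _LITERALS.get(token, token)

    return json.dumps(ast.literal_eval(_TOKEN.sub(swap, text)))


result = clean_json_keep_strings(
    '{"sentence": "show your true colors", "flag": true, "nothing": null,}'
)
print(result)
```

Because the string alternative is tried first at each position, the word true inside the sentence is consumed as part of the string and never reaches the keyword branch.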

Answer 4

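Pulling together ideas from some of the other answers, especially the idea of using literal_eval from @Porkbutts's answer, I came up with a crazy solution to this problem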

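def json_cleaner_loader(path):
    # Predefine the JSON constants, then evaluate the file's contents as Python.
    namespace = {"null": None, "true": True, "false": False}
    with open(path) as fh:
        exec("d={}".format(fh.read()), namespace)
    return namespace["d"]

This works by defining the missing constants as their Pythonic values before evaluating the JSON structure as Python code. The structure can then be read back from the namespace dictionary passed to exec.

Passing an explicit dictionary rather than relying on locals() keeps this working on both Python 2.7 and Python 3.x, including 3.13 and later, where exec no longer updates the locals of the calling function.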

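Note that this executes whatever is in the file you pass in, which can do anything the Python interpreter can do, so it should only ever be used on inputs that are known to be safe (i.e., do not let web clients supply the content), and probably not in any production environment.
It will probably also fail if given a very large amount of content.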

Answer 5

Use rapidjson

import rapidjson
with open("file.json") as f:  # rapidjson.load expects a file-like object, not a path
    data = rapidjson.load(f, parse_mode=rapidjson.PM_COMMENTS | rapidjson.PM_TRAILING_COMMAS)

Answer 6

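Fast-forward to 2021, and we now have https://pypi.org/project/json5/

Quoting from the link:

A Python implementation of the JSON5 data format.

JSON5 extends the JSON data interchange format to make it slightly more usable as a configuration language:

JavaScript-style comments (both single and multi-line) are legal.
Object keys may be unquoted if they are legal ECMAScript identifiers.
Objects and arrays may end with trailing commas.
Strings can be single-quoted, and multi-line string literals are allowed.

Usage is consistent with Python's built-in json module: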

>>> import json5
>>> json5.loads('{"key1": "{my special value,}",}')
{'key1': '{my special value,}'}

It does come with a warning:

Known issues

Did I mention that it is slow?

It is fast enough for loading startup configuration and the like.

Answer 7

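If I don't have the option of using any external modules, my typical approach is to sanitize the input first (i.e., remove the trailing commas and comments) and then use the built-in JSON parser.

The following example uses three regular expressions to strip single-line comments, multi-line comments, and then trailing commas from the JSON input string, and then passes the result to the built-in json.loads method.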

#!/usr/bin/env python

import json, re, sys

unfiltered_json_string = '''
{
    "name": "Grayson",
    "age": 45,
    "car": "A3",
    "flag": false,
    "default": true,
    "entries": [ // "This is the beginning of the comment with some quotes" """""
        "red", // This is another comment. " "" """ """"
        null, /* This is a multi line comment //
"Here's a quote on another line."
*/
        false,
        true,
    ],
    "object": {
        "key3": null,
        "key2": "This is a string with some comment characters // /* */ // /////.",
        "key1": false,
    },
}
'''

RE_SINGLE_LINE_COMMENT = re.compile(r'("(?:(?=(\\?))\2.)*?")|(?:\/{2,}.*)')
RE_MULTI_LINE_COMMENT = re.compile(r'("(?:(?=(\\?))\2.)*?")|(?:\/\*(?:(?!\*\/).)+\*\/)', flags=re.M|re.DOTALL)
RE_TRAILING_COMMA = re.compile(r',(?=\s*?[\}\]])')

if sys.version_info < (3, 5):
    # For Python versions before 3.5, use the patched copy of re.sub.
    # Based on https://gist.github.com/gromgull/3922244
    def patched_re_sub(pattern, repl, string, count=0, flags=0):
        def _repl(m):
            class _match():
                def __init__(self, m):
                    self.m=m
                    self.string=m.string
                def group(self, n):
                    return m.group(n) or ''
            return re._expand(pattern, _match(m), repl)
        return re.sub(pattern, _repl, string, count=count, flags=flags)
    filtered_json_string = patched_re_sub(RE_SINGLE_LINE_COMMENT, r'\1', unfiltered_json_string)
    filtered_json_string = patched_re_sub(RE_MULTI_LINE_COMMENT, r'\1', filtered_json_string)
else:
    filtered_json_string = RE_SINGLE_LINE_COMMENT.sub(r'\1', unfiltered_json_string)
    filtered_json_string = RE_MULTI_LINE_COMMENT.sub(r'\1', filtered_json_string)
filtered_json_string = RE_TRAILING_COMMA.sub('', filtered_json_string)

json_data = json.loads(filtered_json_string)
print(json.dumps(json_data, indent=4, sort_keys=True))
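
For Python 3 only, the same sanitize-then-parse idea can be packaged as a function. The sketch below is a simplified variant, not the script above: the name loads_relaxed and its patterns are assumptions of mine. Each pass matches whole strings first so they are kept verbatim, which also protects commas inside string values during the trailing-comma pass.

```python
import json
import re

_STRING = r'"(?:\\.|[^"\\])*"'

# Pass 1: a complete string, a // comment, or a /* ... */ comment.
_COMMENT = re.compile(_STRING + r'|//[^\n]*|/\*.*?\*/', re.DOTALL)
# Pass 2: a complete string, or a comma directly before } or ].
_COMMA = re.compile(_STRING + r'|,(?=\s*[}\]])')


def _keep_strings(replacement):
    """Build a callback that keeps strings and replaces everything else."""
    def repl(match):
        token = match.group(0)
        return token if token.startswith('"') else replacement
    return repl


def loads_relaxed(text):
    text = _COMMENT.sub(_keep_strings(" "), text)  # comments become a space
    text = _COMMA.sub(_keep_strings(""), text)     # trailing commas vanish
    return json.loads(text)


sample = '''{
    "entries": [ // comment with "quotes"
        "red", /* multi
        line */
        null,
    ],
    "key": "not a // comment, }",
}'''
print(loads_relaxed(sample))
```

Comments are removed before commas so that a comma followed by a comment and then a closing bracket is still recognised as trailing.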

Can json.loads ignore trailing commas?
