Replacing text in JSON with a regular expression

Time: 2018-03-04 09:37:38

Tags: python regex regular-language

My JSON string contains an unexpected quotation mark, which causes json.loads(jstr) to fail.

json_str = '''{"id":"9","ctime":"2018-02-13","content":"abcd: "efg.","hots":"103b","date_sms":"2017-11-22"}'''

So I want to use a regular expression to match and remove the stray quote inside the "content" value. Based on another solution, I tried the following:

import re
json_str = '''{"id":"9","ctime":"2018-02-13","content":"abcd: "efg.","hots":"103b","date_sms":"2017-11-22"}'''
pa = re.compile(r'(:\s+"[^"]*)"(?=[^"]*",)')
pa.findall(json_str)

[out]: []

Is there a way to repair the string?

1 answer:

Answer 0: (score: 0)

A possible solution, which is what I ended up using:

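import json
import re

filename = "data.txt"  # placeholder path: a file with one JSON object per line

# Capture the "content" value up to the `","` that starts the next key.
pa = re.compile(r'"content":\s?"(.*?)","\w')

whole = []
with open(filename) as fin:
    for eachline in fin:
        for s in pa.findall(eachline):
            # Drop the stray quotes inside the value, then patch the line.
            s_fix = s.replace('"', '')
            eachline = eachline.replace(s, s_fix)

        data = json.loads(eachline)
        whole.append(data)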
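
The same idea can be tried directly on the string from the question, without a file. The following is only a minimal sketch: `strip_inner_quotes` and its `key` parameter are hypothetical names introduced here, and it assumes the damaged value is immediately followed by `","` and another key, as in the sample. It would not help if "content" were the last key, or if the value itself legitimately contained the sequence `","` followed by a word character.

```python
import json
import re

# Hypothetical helper, not part of the original answer. Assumes the damaged
# value is followed by `","` and the next key, as in the question's sample.
def strip_inner_quotes(line, key="content"):
    pattern = re.compile(r'("%s":\s?")(.*?)(","\w)' % re.escape(key))
    # Keep the prefix and the suffix, remove every quote inside the value.
    return pattern.sub(
        lambda m: m.group(1) + m.group(2).replace('"', '') + m.group(3),
        line,
    )

json_str = '''{"id":"9","ctime":"2018-02-13","content":"abcd: "efg.","hots":"103b","date_sms":"2017-11-22"}'''
data = json.loads(strip_inner_quotes(json_str))
print(data["content"])
```

Using re.sub with a function as the replacement edits only the matched value, so an identical substring elsewhere on the line is left alone, which a plain str.replace on the whole line would not guarantee.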