In JSON

Time: 2019-01-25 00:28:42

Tags: python json regex

I have JSON that looks like this:

{
  "course1": [
    {
      "courseName": "test",
      "section": "123",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course2": [
    {
      "courseName": "test",
      "section": "456",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course2": [
    {
      "courseName": "test",
      "section": "789",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course2": [
    {
      "courseName": "test",
      "section": "1011",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course3": [
    {
      "courseName": "test",
      "section": "1213",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course3": [
    {
      "courseName": "test",
      "section": "1415",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ]
}

I want to merge any blocks/objects/lists (I don't know what they are called) that have the same key. Like this:

{
  "course1": [
    {
      "courseName": "test",
      "section": "123",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course2": [
    {
      "courseName": "test",
      "section": "456",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    },
    {
      "courseName": "test",
      "section": "789",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    },
    {
      "courseName": "test",
      "section": "1011",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course3": [
    {
      "courseName": "test",
      "section": "1213",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    },
    {
      "courseName": "test",
      "section": "1415",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ]
}

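How can I do this with a regular expression in Python? Or with any regex query?

Also, I tried using json.dumps() and working from there, but for some reason, when I use it with JSON that contains Arabic characters, it freaks out and messes up the whole thing. So unfortunately I'm stuck with regex.

Thanks for your help :)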

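1 Answer:

Answer 0: (score: 2)

The stdlib json module provides a hook that lets you decode objects with duplicate keys. This simple "extend" hook should work for your example data: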

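import json

def myhook(pairs):
    # Called with the (key, value) pairs of every decoded object, duplicates included.
    d = {}
    for k, v in pairs:
        if k not in d:
            d[k] = v
        else:
            # Duplicate key: extend the list already stored with the new list.
            d[k] += v
    return d

# bad_json is the JSON text (a str) that contains the duplicate keys.
mydata = json.loads(bad_json, object_pairs_hook=myhook)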
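
To see the hook in action, here is a minimal, self-contained sketch. The `bad_json` string is a shortened, made-up sample in the shape of your data (with one Arabic course name added for illustration), and `merge_duplicates` is the same idea as the hook above under a hypothetical name, with a type check added. Passing `ensure_ascii=False` to `json.dumps` keeps non-ASCII text readable instead of writing `\uXXXX` escapes, which may be what made your Arabic output look broken; that part is a guess about your setup.

```python
import json

def merge_duplicates(pairs):
    """object_pairs_hook: concatenate list values that share a key."""
    merged = {}
    for key, value in pairs:
        if key in merged and isinstance(merged[key], list) and isinstance(value, list):
            merged[key] = merged[key] + value  # duplicate key: join the two lists
        else:
            merged[key] = value  # first occurrence (for non-list values the last one wins)
    return merged

# Hypothetical sample, shortened from the structure in the question.
bad_json = """
{
  "course1": [{"courseName": "test", "section": "123"}],
  "course2": [{"courseName": "اختبار", "section": "456"}],
  "course2": [{"courseName": "test", "section": "789"}]
}
"""

# Without a hook, the stdlib decoder keeps only the last "course2" value.
plain = json.loads(bad_json)

# With the hook, both "course2" lists end up in one list.
mydata = json.loads(bad_json, object_pairs_hook=merge_duplicates)

# ensure_ascii=False writes Arabic characters as-is rather than as \uXXXX escapes.
output = json.dumps(mydata, ensure_ascii=False, indent=2)
print(output)
```

When writing the result to a file, opening it with `encoding="utf-8"` avoids depending on the platform's default encoding.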

Although nothing in the JSON specification forbids duplicate keys, they should be avoided in the first place:

  

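1.1. Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].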

...

  
      
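  1. Objects

    An object structure is represented as a pair of curly brackets surrounding zero or more name/value pairs (or members). A name is a string. A single colon comes after each name, separating the name from the value. A single comma separates a value from a following name. The names within an object SHOULD be unique.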

  2.