From Python

Time: 2017-12-08 02:03:35

Tags: javascript python json list

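I recently did some web scraping and finally extracted the data I wanted. However, it has no organization at all, because it is just a plain list in Python. It contains the meeting type (LE / DI / SE), days, times, the professor's name (edited for privacy), and a few other values.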

["LE", "A00", "MWF", "10:00a-10:50a", "MLK", "AUD", "Smith, John", "976539", "DI", "A01", "F", "5:00p-5:50p", "MLK", "AUD", "Smith, John", "FULL Waitlist(25)", "216", "FI", "03/17/2018", "S", "8:00a-10:59a", "TBA", "TBA", "LE", "B00", "MWF", "1:00p-1:50p", "WLH", "2005", "Smith, John", "927471", "DI", "B01", "F", "6:00p-6:50p", "MLK", "AUD", "Smith, John", "FULL Waitlist(32)", "200", "FI", "03/17/2018", "S", "8:00a-10:59a", "TBA", "TBA"]

As you can see, this is an ugly list. My goal is to end up with something like this:

{
  "MATH101": {
    "LE": {

      sectionCode: 'A00',
      days: 'MWF',
      times: '10:00a-10:50a',
      building: 'MLK',
      room: 'AUD',
      instructor: 'Smith, John',

      "DI": {
        sectionCode: 'A01',
        days: 'F',
        times: '5:00p-5:50p',
        building: 'MLK',
        room: 'AUD',
        instructor: 'Smith, John',
        availableSeats: 'FULL Waitlist(25)',
        capacity: '216'
      }
    },

    "LE": {

      sectionCode: 'B00',
      days: 'MWF',
      times: '1:00p-1:50p',
      building: 'WLH',
      room: '2005',
      instructor: 'Smith, John',

      "DI": {
        sectionCode: 'B01',
        days: 'F',
        times: '6:00p-6:50p',
        building: 'MLK',
        room: 'AUD',
        instructor: 'Smith, John',
        availableSeats: 'FULL Waitlist(32)',
        capacity: '200'
      }
    }
  }
}

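Each lecture section has one discussion, so this course has two lecture times and two discussions. I don't know whether there is a better schema for storing this data, but this is the only one I could come up with.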
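One thing worth checking about the layout above: it uses the key "LE" twice inside the same object. The sketch below shows what Python's standard json module does with that, and then tries a list-based layout instead. The "lectures" and "discussions" key names are assumptions made for this illustration, not part of the original schema.

```python
import json

# Two "LE" keys in one JSON object: Python's json module keeps only the last one.
collapsed = json.loads(
    '{"LE": {"sectionCode": "A00"}, "LE": {"sectionCode": "B00"}}'
)
print(collapsed)

# Possible alternative (assumed key names): a list of lectures per course,
# each lecture carrying its own list of discussions.
course = {
    "MATH101": {
        "lectures": [
            {"sectionCode": "A00", "discussions": [{"sectionCode": "A01"}]},
            {"sectionCode": "B00", "discussions": [{"sectionCode": "B01"}]},
        ]
    }
}

# Lists survive a round trip through JSON unchanged, so both lectures are kept.
round_tripped = json.loads(json.dumps(course))
print(len(round_tripped["MATH101"]["lectures"]))
```

Using lists also covers courses that do not have exactly two lectures or exactly one discussion per lecture.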

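I want to iterate over the whole list and start saving values after each "LE" or "DI", in the organized way I laid out above, but I don't know how to create the JSON file when the list does not contain the keys.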
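The iteration described here could be sketched roughly as follows. This is a minimal sketch, not a definitive solution: it assumes the meeting-type codes ("LE", "DI", "SE", "FI") never occur as ordinary values, and that the values after each code always come in the order seen in the sample list. The field names for "FI", the "meetings" and "extra" keys, and the helper names split_records, name_fields and group_by_lecture are all made up for this example.

```python
import json

# Assumption: these codes never appear as ordinary values (e.g. as a building name).
MEETING_TYPES = {"LE", "DI", "SE", "FI"}

# Assumed field order per meeting type, inferred from the sample list only.
# Types without an entry here (such as "SE") keep all their values under "extra".
FIELDS = {
    "LE": ["sectionCode", "days", "times", "building", "room", "instructor"],
    "DI": ["sectionCode", "days", "times", "building", "room", "instructor",
           "availableSeats", "capacity"],
    "FI": ["date", "days", "times", "building", "room"],
}


def split_records(flat):
    """Cut the flat list into one record per meeting-type marker."""
    records = []
    for value in flat:
        if value in MEETING_TYPES:
            records.append({"type": value, "values": []})
        elif records:  # ignore anything before the first marker
            records[-1]["values"].append(value)
    return records


def name_fields(record):
    """Pair positional values with field names; keep unnamed leftovers in 'extra'."""
    names = FIELDS.get(record["type"], [])
    named = {"type": record["type"]}
    named.update(zip(names, record["values"]))
    leftover = record["values"][len(names):]
    if leftover:
        named["extra"] = leftover
    return named


def group_by_lecture(flat):
    """Attach every non-lecture meeting to the lecture that precedes it."""
    lectures = []
    for record in map(name_fields, split_records(flat)):
        if record["type"] == "LE":
            record["meetings"] = []
            lectures.append(record)
        elif lectures:
            lectures[-1]["meetings"].append(record)
    return lectures


scraped = [
    "LE", "A00", "MWF", "10:00a-10:50a", "MLK", "AUD", "Smith, John", "976539",
    "DI", "A01", "F", "5:00p-5:50p", "MLK", "AUD", "Smith, John",
    "FULL Waitlist(25)", "216",
    "FI", "03/17/2018", "S", "8:00a-10:59a", "TBA", "TBA",
    "LE", "B00", "MWF", "1:00p-1:50p", "WLH", "2005", "Smith, John", "927471",
    "DI", "B01", "F", "6:00p-6:50p", "MLK", "AUD", "Smith, John",
    "FULL Waitlist(32)", "200",
    "FI", "03/17/2018", "S", "8:00a-10:59a", "TBA", "TBA",
]

lectures = group_by_lecture(scraped)
print(json.dumps({"MATH101": lectures}, indent=2))
```

The trailing number after each lecture's instructor (for example "976539") has no name in the target layout, so the sketch keeps it under "extra" rather than guessing what it means.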

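I'm very new to all of this and haven't found a solution yet. I tried converting the list to a dictionary, but that didn't meet my needs. A lot of the data is also repeated, such as the professor's name, but there won't always be just one professor, nor always two lectures/discussions. I plan to do this for every available course, so this is getting more and more complicated...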
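For many courses, one option is a single dictionary keyed by course code, where each value is a list of lectures of any length, written out with json.dump. The sketch below assumes the lectures have already been parsed into dictionaries. The file name courses.json, the "MATH102" entry, and the helper names save_courses and load_courses are hypothetical and only serve the illustration.

```python
import json
import os
import tempfile


def save_courses(courses, path):
    """Write the whole course dictionary to one JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(courses, handle, indent=2, ensure_ascii=False)


def load_courses(path):
    """Read the course dictionary back from the JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Each course maps to a list, so a course may have one lecture, two, or none.
courses = {
    "MATH101": [
        {"sectionCode": "A00", "instructor": "Smith, John",
         "discussions": [{"sectionCode": "A01", "capacity": "216"}]},
        {"sectionCode": "B00", "instructor": "Smith, John",
         "discussions": [{"sectionCode": "B01", "capacity": "200"}]},
    ],
    "MATH102": [],  # hypothetical course with nothing scraped yet
}

# A temporary directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "courses.json")
    save_courses(courses, path)
    restored = load_courses(path)

print(sorted(restored))
```

Repeating the instructor on every entry costs a little space but keeps each record readable on its own, which matters when lectures and discussions can have different instructors.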

Hopefully someone can help, thanks!

0 answers:

No answers