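How to generate nested JSON data from a CSV file using Python

Time: 2019-11-20 07:46:31

Tags: python json firebase firebase-realtime-database

I tried the online Jsonify It tool, which is supposed to create nested JSON from my data, but I couldn't get it to work. I also tried Python code from other posts, but that didn't work either. If you know an easier way than using Python, that would be great.

Here is my .CSV data: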

ID,Name,Date,Subject,Start,Finish
0,Ladybridge High School,01/11/2019,Maths,05:28,0
0,Ladybridge High School,02/11/2019,Maths,05:30,06:45
0,Ladybridge High School,01/11/2019,Economics,11:58,12:40
0,Ladybridge High School,02/11/2019,Economics,11:58,12:40
1,Loreto Sixth Form,01/11/2019,Maths,05:28,06:45
1,Loreto Sixth Form,02/11/2019,Maths,05:30,06:45
1,Loreto Sixth Form,01/11/2019,Economics,11:58,12:40
1,Loreto Sixth Form,02/11/2019,Economics,11:58,12:40

Here is the nested JSON structure I want:

{
  "Timetable" : [ {
    "Date" : {
      "01-11-2019" : {
        "Maths" : {
          "Start" : "05:28",
          "Finish" : "06:45"
        },
        "Economics" : {
          "Start" : "11:58",
          "Finish" : "12:40"
        }
      },
      "02-11-2019" : {
        "Maths" : {
          "Start" : "05:30",
          "Finish" : "06:45"
        },
        "Economics" : {
          "Start" : "11:58",
          "Finish" : "12:40"
        }
      }
    },
    "Name" : "Ladybridge High School"
  }, {
    "Date" : {
      "01-11-2019" : {
        "Maths" : {
          "Start" : "05:28",
          "Finish" : "06:45"
        },
        "Economics" : {
          "Start" : "11:58",
          "Finish" : "12:40"
        }
      },
      "02-11-2019" : {
        "Maths" : {
          "Start" : "05:30",
          "Finish" : "06:45"
        },
        "Economics" : {
          "Start" : "11:58",
          "Finish" : "12:40"
        }
      }
    },
    "Name" : "Loreto Sixth Form"
  } ]
}

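2 answers:

Answer 0 (score: 2)

Something like this?

[Edit]

I refactored it to handle arbitrary top-level keys for each entry in the timetable. I also made it build a dict first and then convert it to a list, so that it runs in O(N) time even when the input is very large.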

import csv

timetable = {}
with open('data.csv') as f:
    csv_data = list(csv.DictReader(f, skipinitialspace=True))

for row in csv_data:
    # Create the entry for this ID the first time it is seen
    entry = timetable.setdefault(row["ID"], {"ID": row["ID"], "Date": {}})
    for k in row.keys():
        # Date has to be handled as a special case
        if k == "Date":
            # setdefault keeps the subjects already stored under this date;
            # assigning a fresh {} here would discard them on every row
            entry["Date"].setdefault(row["Date"], {})[row["Subject"]] = {
                "Start": row["Start"],
                "Finish": row["Finish"]
            }
        # Ignore these keys because they are only for 'Date'
        elif k in ("Start", "Finish", "Subject"):
            continue
        # Use everything else
        else:
            entry[k] = row[k]

# Convert the ID-keyed dict into the list expected under "Timetable"
timetable = {"Timetable": list(timetable.values())}
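The code above builds a Python dict but stops short of producing JSON text, and its date keys still contain "/", while the structure in the question uses "-" (Firebase Realtime Database keys cannot contain "/"). A minimal sketch of that last step follows, assuming a `timetable` dict of the shape built above. The helper name `to_target_shape` and the sample values are hypothetical and only there for illustration.

```python
import json

# Hypothetical sample in the shape the code above produces (values are illustrative).
timetable = {
    "Timetable": [
        {
            "ID": "0",
            "Name": "Ladybridge High School",
            "Date": {
                "01/11/2019": {"Economics": {"Start": "11:58", "Finish": "12:40"}},
                "02/11/2019": {"Maths": {"Start": "05:30", "Finish": "06:45"}},
            },
        }
    ]
}


def to_target_shape(data):
    """Return a copy with 'ID' dropped and '/' in date keys replaced by '-'."""
    entries = []
    for entry in data["Timetable"]:
        dates = {d.replace("/", "-"): subjects for d, subjects in entry["Date"].items()}
        entries.append({"Date": dates, "Name": entry["Name"]})
    return {"Timetable": entries}


result = to_target_shape(timetable)
# json.dumps turns the dict into JSON text; indent=2 is only for readability.
json_text = json.dumps(result, indent=2)
print(json_text)
```

To write a file instead of printing, `json.dump(result, f, indent=2)` with an open file handle `f` does the same job.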

Answer 1 (score: 0)

An improvement on the answer above, for nesting ID before Name and Date

import csv

timetable = {"Timetable": []}
with open("C:/Users/kspv914/Downloads/data.csv") as f:
    csv_data = list(csv.DictReader(f, skipinitialspace=True))

# Collect the distinct names in first-seen order (a plain set would lose the order)
names = list(dict.fromkeys(row["Name"] for row in csv_data))
for name in names:
    timetable["Timetable"].append({"Name": name, "Date": {}})

for row in csv_data:
    for entry in timetable["Timetable"]:
        if entry["Name"] == row["Name"]:
            # setdefault keeps the subjects already stored under this date
            entry["Date"].setdefault(row["Date"], {})[row["Subject"]] = {
                "Start": row["Start"],
                "Finish": row["Finish"],
            }
print(timetable)
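For every CSV row, this answer scans the whole "Timetable" list to find the matching name. One possible alternative keeps a dict keyed by Name, so each row is placed with a single lookup, which is the same idea Answer 0 describes for ID. The function name `group_by_name` and the inline `SAMPLE` text are assumptions made so the example runs without a file. The rows are copied from the question.

```python
import csv
import io

# Inline sample so the sketch runs without a file; rows copied from the question.
SAMPLE = """ID,Name,Date,Subject,Start,Finish
0,Ladybridge High School,02/11/2019,Maths,05:30,06:45
0,Ladybridge High School,02/11/2019,Economics,11:58,12:40
1,Loreto Sixth Form,01/11/2019,Maths,05:28,06:45
"""


def group_by_name(rows):
    """Group rows by Name using a dict lookup instead of scanning a list."""
    by_name = {}  # Name -> entry; dicts keep insertion order in Python 3.7+
    for row in rows:
        entry = by_name.setdefault(row["Name"], {"Name": row["Name"], "Date": {}})
        # Nested setdefault: create the date dict once, then add the subject to it
        entry["Date"].setdefault(row["Date"], {})[row["Subject"]] = {
            "Start": row["Start"],
            "Finish": row["Finish"],
        }
    return {"Timetable": list(by_name.values())}


timetable = group_by_name(csv.DictReader(io.StringIO(SAMPLE), skipinitialspace=True))
print(timetable)
```

With a real file, the `io.StringIO(SAMPLE)` argument would be replaced by the open file handle.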