Formatting JSON output

Time: 2016-04-21 22:09:00

Tags: python arrays json dictionary output

I have a JSON file with key-value pair data. It looks like this:

{
    "professors": [
        {
            "first_name": "Richard", 
            "last_name": "Saykally", 
            "helpfullness": "3.3", 
            "url": "http://www.ratemyprofessors.com/ShowRatings.jsp?tid=111119", 
            "reviews": [
                {
                    "attendance": "N/A", 
                    "class": "CHEM 1A", 
                    "textbook_use": "It's a must have", 
                    "review_text": "Tests were incredibly difficult (averages in the 40s) and lectures were essentially useless. I attended both lectures every day and still was unable to grasp most concepts on the midterms. Scope out a good GSI to get help and ride the curve."
                }, 
                {
                    "attendance": "N/A", 
                    "class": "CHEMISTRY1A", 
                    "textbook_use": "Essential to passing", 
                    "review_text": "Saykally really isn't as bad as everyone made him out to  be. If you go to his lectures he spends about half the time blowing things up, but if you actually read the texts before his lectures and pay attention to what he's writing/saying, you'd do okay. He posts practice tests that were representative of actual tests and curves the class nicely!"
                }]
        },
    {
        "first_name": "Laura", 
        "last_name": "Stoker", 
        "helpfullness": "4.1", 
        "url": "http://www.ratemyprofessors.com/ShowRatings.jsp?tid=536606", 
        "reviews": [
            {
                "attendance": "N/A", 
                "class": "PS3", 
                "textbook_use": "You need it sometimes", 
                "review_text": "Stoker is by far the best professor.  If you put in the effort, take good notes, and ask questions, you will be fine in the class. As far as her lecture, she does go a bit fast, but her lecture is in the form of an outline. As long as you take good notes, you will have everything you need for exams. She is funny and super nice if you speak with her"
            }, 
            {
                "attendance": "Mandatory", 
                "class": "164A", 
                "textbook_use": "Barely cracked it open", 
                "review_text": "AMAZING professor.  She has a good way of keeping lectures interesting.  Yes, she can be a little everywhere and really quick with her lecture, but the GSI's are useful to make sure you understand the material.  Oh, and did I mention she's hilarious!"
            }]
    }]
}

I'm trying to do several things. First, I want to find the most-mentioned 'class' value under reviews, along with the class name and the number of times it was mentioned. Then I'd like to write the output in the format below. The professors array should hold just the name of each professor who teaches that class; for instance, for CHEM 1A and CHEMISTRY1A it's Richard Saykally.

{
    "courses": [
        {
            "course_name": "<class name>",
            "course_mentioned_times": "<number of times the class was mentioned>",
            "professors": [
                # each professor who teaches this class, taken from the JSON file shown above
                {
                    "first_name": "<professor first name>",
                    "last_name": "<professor last name>"
                }
            ]
        }
    ]
}

I'd also like the output sorted by mention count, from maximum to minimum. So far, all I've been able to figure out is:

if __name__ == "__main__":
        open_json = open('result.json')
        load_as_json = json.load(open_json)['professors']
        outer_arr = []
        outer_dict = {}
        for items in load_as_json:

            output_dictionary = {}
            all_classes = items['reviews']
            for classes in all_classes:
                arr_info = []
                output_dictionary['class'] = classes['class']
                output_dictionary['first_name'] = items['first_name']
                output_dictionary['last_name'] = items['last_name']
                #output_dictionary['department'] = items['department']
                output_dictionary['reviews'] = classes['review_text']
                with open('output_info.json','wb') as outfile:
                    json.dump(output_dictionary,outfile,indent=4)

1 answer:

Answer 0 (score: 0)

I think this program does what you want:

import json


with open('result.json') as open_json:
    load_as_json = json.load(open_json)

# Map each class name to its output record, built up review by review.
courses = {}
for professor in load_as_json['professors']:
    for review in professor['reviews']:
        # Create the record for this class the first time it is seen.
        course = courses.setdefault(review['class'], {})
        course.setdefault('course_name', review['class'])
        course.setdefault('course_mentioned_times', 0)
        course['course_mentioned_times'] += 1
        course.setdefault('professors', [])
        prof_name = {
            'first_name': professor['first_name'],
            'last_name': professor['last_name'],
        }
        # List each professor only once per class.
        if prof_name not in course['professors']:
            course['professors'].append(prof_name)

# Sort the records from most-mentioned to least-mentioned.
courses = {
    'courses': sorted(courses.values(),
                      key=lambda x: x['course_mentioned_times'],
                      reverse=True)
}
with open('output_info.json', 'w') as outfile:
    json.dump(courses, outfile, indent=4)

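Result, using the sample input from the question (every class is mentioned once there, so the order of the tied entries and of the keys can vary with the Python version):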

{
    "courses": [
        {
            "professors": [ 
                {
                    "first_name": "Laura",
                    "last_name": "Stoker"
                }
            ], 
            "course_name": "PS3", 
            "course_mentioned_times": 1
        }, 
        {
            "professors": [
                {
                    "first_name": "Laura", 
                    "last_name": "Stoker"
                }
            ],
            "course_name": "164A", 
            "course_mentioned_times": 1
        },
        {
            "professors": [
                {
                    "first_name": "Richard", 
                    "last_name": "Saykally"
                }
            ], 
            "course_name": "CHEM 1A", 
            "course_mentioned_times": 1
        }, 
        {
            "professors": [
                {
                    "first_name": "Richard", 
                    "last_name": "Saykally"
                }
            ], 
            "course_name": "CHEMISTRY1A", 
            "course_mentioned_times": 1
        }
    ]
}
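
The question also asks for the single most-mentioned class. A minimal sketch of one way to get it with collections.Counter, assuming the same layout as result.json. The inline data dict is a trimmed stand-in for the file (with one extra PS3 review added purely so that there is a clear winner), and count_class_mentions is a hypothetical helper name, not part of the program above:

```python
from collections import Counter

# Trimmed stand-in for result.json (assumption: same layout as the question's
# file; review text is omitted and one extra PS3 review is added for the demo).
data = {
    "professors": [
        {
            "first_name": "Richard",
            "last_name": "Saykally",
            "reviews": [{"class": "CHEM 1A"}, {"class": "CHEMISTRY1A"}],
        },
        {
            "first_name": "Laura",
            "last_name": "Stoker",
            "reviews": [{"class": "PS3"}, {"class": "164A"}, {"class": "PS3"}],
        },
    ]
}


def count_class_mentions(professors):
    """Count how many reviews mention each class name (hypothetical helper)."""
    return Counter(
        review["class"]
        for professor in professors
        for review in professor["reviews"]
    )


mentions = count_class_mentions(data["professors"])
# most_common(1) returns a one-element list holding the (class, count)
# pair with the highest count.
top_class, top_count = mentions.most_common(1)[0]
print(top_class, top_count)
```

Class names are compared as exact strings here, so "CHEM 1A" and "CHEMISTRY1A" are counted as two different classes, just as in the program above.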