Parsing complex, changing JSON data in Python, several levels deep

Asked: 2014-02-15 19:49:18

Tags: python json list parsing dictionary

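I'm trying to parse JSON data that is somewhat complex and whose structure changes from one iteration to the next.

The JSON is parsed inside a loop, so every pass of the loop sees a different record. Right now I'm only interested in the education data.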

JSON data:

The first record looks like this:

{u'gender': u'female', u'id': u'15394'}

The next one might be:

{
u'gender': u'male', u'birthday': u'12/10/1983', u'location': {u'id': '12', u'name': u'Mexico City, Mexico'}, u'hometown': {u'id': u'19', u'name': u'Mexico City, Mexico'}, 

u'education': [
{
u'school': {u'id': u'22', u'name': u'Institut Saint Dominique de Rome'}, 
u'type': u'High School', 
u'year': {u'id': u'33', u'name': u'2002'}
}, 
{
u'school': {u'id': u'44', u'name': u'Instituto Cumbres'}, 
u'type': u'High School', 
u'year': {u'id': u'55', u'name': u'1999'}
}, 
{
u'school': {u'id': u'66', u'name': u'Chantemerle International School'},    
u'type': u'High School', 
u'year': {u'id': u'77', u'name': u'1998'}
}, 
{
u'school': {u'id': u'88', u'name': u'Columbia University'}, 
u'type': u'College', 
u'concentration': 
[{u'id': u'91', u'name': u'Economics'}, 
{u'id': u'92', u'name': u'Film Studies'}]
}
], 
u'id': u'100384'}

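I'm trying to return all the values for school name, school ID and school type, so I basically want [education][school][id], [education][school][name] and [education][type] on one line. However, each person has a different number of schools, different types of schools, or no schools at all. I want to return each school with its name, id and type on a new line, inside my existing loop.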

IDEAL OUTPUT

1   34  Boston Latin School High School
1   26  Harvard University  College
1   22  University of Michigan  Graduate School

The 1 here is a friend_id, which I append to the list as the first item on each pass of the loop.

I tried:

friend_data = response.read()
friend_json = json.loads(friend_data)

#This below is inside a loop pulling data for each friend:

try:
    for school_id in friend_json['education']:
        school_id = school_id['school']['id']
        friendedu.append(school_id)
    for school_name in friend_json['education']:
        school_name = school_name['school']['name']
        friendedu.append(school_name)
    for school_type in friend_json['education']:
        school_type = school_type['type']
        friendedu.append(school_type)
except:
    school_id = "NULL"

print friendedu
writer.writerow(friendedu)

Current output:

[u'22', u'44', u'66', u'88', u'Institut Saint Dominique de Rome', u'Instituto Cumbres', u'Chantemerle International School', u'Columbia University', u'High School', u'High School', u'High School', u'College']

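This output is just a flat list of the values that were pulled out; what I want is the output organized as shown above. I think another for loop may be needed, because for one person I want each school on its own line. Right now the friendedu list puts all of a person's education information into a single row. I want each education entry on a new line, and then to carry on writing rows for the next person.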

3 answers:

Answer 0: (score: 1)

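import csv
import json
import requests

STUDENT_URL = "https://example.com/students.json"  # placeholder URL, not from the question
OUTPUT = "schools.csv"                             # placeholder output path

def student_schools(student, default=None):
    """Yield one (id, name, type) row per entry in student["education"]."""
    for entry in student.get("education", []):
        school = entry.get("school", {})
        # "id" and "name" live under "school"; "type" sits on the entry itself
        yield (school.get("id", default),
               school.get("name", default),
               entry.get("type", default))

def main():
    res = requests.get(STUDENT_URL).text
    students = json.loads(res)

    # Python 3; on Python 2 open the file with "wb" instead
    with open(OUTPUT, "w", newline="") as outf:
        outcsv = csv.writer(outf)
        for student in students["results"]:    # or whatever the root label is
            outcsv.writerows(student_schools(student))

if __name__ == "__main__":
    main()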
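
To try the row-per-school idea without a network call, here is a minimal sketch that runs on a record shaped like the one in the question. The names `school_rows` and the trimmed two-school sample are assumptions made for illustration, and the friend_id is passed in as the first column to match the ideal output. It is written for Python 3 and writes to an in-memory buffer rather than a file.

```python
import csv
import io

# Trimmed sample shaped like the question's second record (assumed, for illustration).
friend_json = {
    "id": "100384",
    "education": [
        {"school": {"id": "22", "name": "Institut Saint Dominique de Rome"},
         "type": "High School"},
        {"school": {"id": "88", "name": "Columbia University"},
         "type": "College"},
    ],
}

def school_rows(friend_id, record):
    """Yield [friend_id, school id, school name, type] for each education entry."""
    for entry in record.get("education", []):
        school = entry.get("school", {})
        yield [friend_id, school.get("id"), school.get("name"), entry.get("type")]

buf = io.StringIO()
writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
writer.writerows(school_rows(1, friend_json))            # one row per school
writer.writerows(school_rows(2, {"gender": "female"}))   # no education: no rows written
print(buf.getvalue(), end="")
```

Because the generator yields nothing when 'education' is absent, a friend with no schools simply contributes no rows, and the loop over friends needs no try/except.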

Answer 1: (score: 1)

You certainly don't need more loops.

One will do:

friendedu = []
for edu in friend_json['education']:
    # 'id' and 'name' are nested under 'school'; 'type' sits next to it
    friendedu.append("{id} {name} {type}".format(
        id=edu['school']['id'],
        name=edu['school']['name'],
        type=edu['type']))

Or as a list comprehension:

friendedu = ["{id} {name} {type}".format(
    id=edu['school']['id'],
    name=edu['school']['name'],
    type=edu['type']) for edu in friend_json['education']]
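
Both versions index friend_json['education'] directly, so a record without that key (like the first sample in the question) raises a KeyError. A hedged sketch of one way around that, using dict.get with defaults; the helper name `format_schools` is hypothetical, and the "NULL" placeholder is borrowed from the except branch in the question.

```python
def format_schools(record, missing="NULL"):
    """Return one 'id name type' string per school; an empty list if there are none."""
    lines = []
    for entry in record.get("education", []):
        school = entry.get("school", {})
        lines.append("{id} {name} {type}".format(
            id=school.get("id", missing),
            name=school.get("name", missing),
            type=entry.get("type", missing)))
    return lines

no_schools = {"gender": "female", "id": "15394"}
one_school = {"education": [{"school": {"id": "88", "name": "Columbia University"}}]}

print(format_schools(no_schools))   # []
print(format_schools(one_school))   # ['88 Columbia University NULL']
```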

Answer 2: (score: 1)

How about:

friend_data = response.read()
friend_json = json.loads(friend_data)

if 'education' in friend_json:
    for edu in friend_json['education']:
        friendedu = []  # start a fresh row for each school
        try:
            friendedu.append(edu['school']['id'])
            friendedu.append(edu['school']['name'])
            friendedu.append(edu['type'])  # 'type' sits next to 'school', not inside it
        except KeyError:
            friendedu.append('School ID, NAME, or type not found')
        print(" ".join(friendedu))
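
With the try/except above, one missing key turns the row into an error message. If a partial row is preferable, one option is a small helper that walks nested keys to any depth and falls back to a default. This is only a sketch for Python 3; `dig` is a hypothetical name, not a standard-library function.

```python
def dig(data, *keys, default=None):
    """Follow keys through nested dicts; return default if any step is missing."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return default
        data = data[key]
    return data

entry = {"school": {"id": "88", "name": "Columbia University"}}  # no "type" key

row = [dig(entry, "school", "id", default="NULL"),
       dig(entry, "school", "name", default="NULL"),
       dig(entry, "type", default="NULL")]
print(" ".join(row))   # 88 Columbia University NULL
```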