object_hook in the json module doesn't seem to work the way I'd expect

Time: 2017-04-07 19:57:40

Tags: python json

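I can't work out how the object_hook feature of json.loads() works. I found a similar question here, "object_hook does not address the full json", and tried to follow it as I understood it, but it still doesn't work for me. I've gathered that the object_hook function is somehow called recursively, but I don't understand how to use it to build a complex object hierarchy from a json string. Consider the following json string, classes, and object_hook function: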

import json
from pprint import pprint

jstr = '{"person":{ "name": "John Doe", "age": "46", \
           "address": {"street": "4 Yawkey Way", "city": "Boston", "state": "MA"} } }'

class Address:
    def __init__(self, street=None, city=None, state=None):
        self.street = street
        self.city = city
        self.state = state

class Person:
    def __init__(self, name=None, age=None, address=None):
        self.name = name
        self.age = int(age)
        self.address = Address(**address)

def as_person(jdict):
    if u'person' in jdict:
        print('person found')
        person = jdict[u'person']
        return Person(name=person[u'name'], age=person[u'age'], 
                      address=person[u'address'])
    else:
        return('person not found')
        return jdict

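(I define the classes with keyword args to provide default values, so that the json doesn't need to contain every element and I can still be sure the attributes exist on the class instance. Eventually I'll also attach methods to the classes, but I need to populate the instances from json data.)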

If I run:

>>> p = as_person(json.loads(jstr))

I get what I expect, namely:

person found

and p becomes a Person object, i.e.:

>>> pprint(p.__dict__)
{'address': <__main__.Address instance at 0x0615F3C8>,
 'age': 46,
 'name': u'John Doe'}
>>> pprint(p.address.__dict__)
{'city': u'Boston', 'state': u'MA', 'street': u'4 Yawkey Way'}

But if I instead try to use:

>>> p = json.loads(jstr, object_hook=as_person)

I get:

person found
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "C:\Program Files (x86)\Python27\lib\json\__init__.py", line 339, in loads
    return cls(encoding=encoding, **kw).decode(s)
  File "C:\Program Files (x86)\Python27\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Program Files (x86)\Python27\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "<interactive input>", line 5, in as_person
TypeError: string indices must be integers, not unicode

I don't know why this happens, and I suspect there is some subtlety in how the object_hook mechanism works that I'm missing.

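Trying to apply the idea from the question mentioned above, namely that object_hook evaluates each nested dictionary from the bottom up (and replaces it during the traversal?), I also tried: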

def as_person2(jdict):
    if u'person' in jdict:
        print('person found')
        person = jdict[u'person']
        return Person2(name=person[u'name'], age=person[u'age'], address=person[u'address'])
    elif u'address' in jdict:
        print('address found')
        return Address(jdict[u'address'])
    else:
        return('person not found')
        return jdict

>>> json.loads(jstr, object_hook=as_person2)
address found
person found
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "C:\Program Files (x86)\Python27\lib\json\__init__.py", line 339, in loads
    return cls(encoding=encoding, **kw).decode(s)
  File "C:\Program Files (x86)\Python27\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Program Files (x86)\Python27\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "<interactive input>", line 5, in as_person2
AttributeError: Address instance has no attribute '__getitem__'

So clearly the correct form of the object_hook function is eluding me.

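Can someone explain in detail how the object_hook mechanism works and how the resulting object tree is built recursively from the bottom up, explain why my code doesn't work as expected, and either fix my example or provide one that uses an object_hook function to build a complex class, given that you only get a single object_hook function?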

2 answers:

Answer 0 (score: 2)

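Through experimentation I have answered my own question. This may not be the best solution, and I welcome further analysis or a better approach, but it sheds light on how the object_hook process works, so it may be useful to others facing the same problem.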

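The key observation is that at every level of the json tree traversal, the object_hook mechanism expects you to return a dictionary. So if you want to turn sub-dictionaries into class instances, you have to replace the values in the input dictionary of the current object_hook call with objects, rather than just returning the object instances.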

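The solution below allows a bottom-up approach to building the object hierarchy. I've inserted print statements to show how the hook is called as each sub-part of the json string is processed, which I found illuminating and helpful in arriving at a working function.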

import json
from pprint import pprint

jstr = '{"person":{ "name": "John Doe", "age": "46", \
         "address": {"street": "4 Yawkey Way", "city": "Boston", "state": "MA"} } }'

class Address:
    def __init__(self, street=None, city=None, state=None):
        self.street=street
        self.city=city
        self.state = state
    def __repr__(self):
        return('Address(street={self.street!r}, city={self.city!r}, '
               'state={self.state!r})'.format(self=self))

class Person:
    def __init__(self, name=None, age=None, address=None):
        self.name = name
        self.age = int(age)
        self.address=address
    def __repr__(self):
        return('Person(name={self.name!r}, age={self.age!r},\n'
               '       address={self.address!r})'.format(self=self))

def as_person4(jdict):
    if 'person' in jdict:
        print('person in jdict; (before substitution):')
        pprint(jdict)
        jdict['person'] = Person(**jdict['person'])
        print('after substitution:')
        pprint(jdict)
        print
        return jdict
    elif 'address' in jdict:
        print('address in jdict; (before substitution):'),
        pprint(jdict)
        jdict['address'] = Address(**jdict['address'])
        print('after substitution:')
        pprint(jdict)
        print
        return jdict
    else:
        print('jdict:')
        pprint(jdict)
        print
        return jdict

>>> p =json.loads(jstr, object_hook=as_person4)
jdict:
{u'city': u'Boston', u'state': u'MA', u'street': u'4 Yawkey Way'}

address in jdict; (before substitution):
{u'address': {u'city': u'Boston', u'state': u'MA', u'street': u'4 Yawkey Way'},
u'age': u'46', u'name': u'John Doe'}
after substitution:
{u'address': Address(street=u'4 Yawkey Way', city=u'Boston', state=u'MA'),
u'age': u'46', u'name': u'John Doe'}

person in jdict; (before substitution):
{u'person': {u'address': Address(street=u'4 Yawkey Way', city=u'Boston', state=u'MA'),
         u'age': u'46', u'name': u'John Doe'}}
after substitution:
{u'person': Person(name=u'John Doe', age=46,
   address=Address(street=u'4 Yawkey Way', city=u'Boston', state=u'MA'))}

>>> p
{u'person': Person(name=u'John Doe', age=46,
   address=Address(street=u'4 Yawkey Way', city=u'Boston', state=u'MA'))}
>>> 

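Note that what is returned is still a dictionary whose key is 'person' and whose value is the Person object (rather than just the Person object itself), but this solution does provide an extensible, bottom-up way of constructing objects.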
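
As a side note, the json documentation describes the return value of object_hook as being used in place of the decoded dict, so a hook can also return something that is not a dictionary, as long as any enclosing hook call is prepared to receive it. A minimal sketch of this, written for Python 3 (the name `trace_hook` and the shortened json string are assumptions made for illustration, not part of the code above), records the order in which the hook is called and returns a tuple instead of a dict:

```python
import json

calls = []  # keys of each dict the hook receives, in call order

def trace_hook(d):
    # Hypothetical helper: log the keys, then return a non-dict value.
    calls.append(sorted(d))
    return ("seen", tuple(sorted(d)))

doc = '{"person": {"name": "John Doe", "address": {"city": "Boston"}}}'
result = json.loads(doc, object_hook=trace_hook)

print(calls)   # innermost dict first, outermost dict last
print(result)  # whatever the outermost hook call returned
```

By the time the outer calls run, the nested values have already been replaced by whatever the inner calls returned, which is why a hook that returns a plain string for unrecognised dicts breaks the levels above it.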

Answer 1 (score: 0)

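I agree that it's unintuitive, but you can simply ignore the dictionary that is passed in when it isn't the kind of object you're interested in, by returning it unchanged. That means this is probably the simplest way:

(As you can see, you also don't need all those u string prefixes.)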

import json

jstr = '{"person": { "name": "John Doe", "age": "46", \
           "address": {"street": "4 Yawkey Way", "city": "Boston", "state": "MA"} } }'

class Address:
    def __init__(self, street=None, city=None, state=None):
        self.street = street
        self.city = city
        self.state = state

    def __repr__(self):  # optional - added so print displays something useful
        return('Address(street={self.street!r}, city={self.city!r}, '
               'state={self.state!r})'.format(self=self))

class Person:
    def __init__(self, name=None, age=None, address=None):
        self.name = name
        self.age = int(age)
        self.address = address

    def __repr__(self):  # optional - added so print displays something useful
        return('Person(name={self.name!r}, age={self.age!r},\n'
               '       address={self.address!r})'.format(self=self))

def as_person3(jdict):
    if 'person' not in jdict:
        return jdict
    else:
        person = jdict['person']
        address = Address(**person['address'])
        return Person(name=person['name'], age=person['age'], address=address)

p = json.loads(jstr, object_hook=as_person3)
print(p)

Output:

Person(name=u'John Doe', age=46,
       address=Address(street=u'4 Yawkey Way', city=u'Boston', state=u'MA'))
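
If the nested objects should be built at their own level rather than inside the 'person' branch, one possible variation is to recognise each kind of dictionary by its keys. The following is a sketch for Python 3 under stated assumptions: the `ADDRESS_KEYS` set and the `build_objects` name are hypothetical, and matching on key names is only a heuristic that suits this particular json layout.

```python
import json

class Address:
    def __init__(self, street=None, city=None, state=None):
        self.street, self.city, self.state = street, city, state

class Person:
    def __init__(self, name=None, age=None, address=None):
        self.name = name
        self.age = int(age) if age is not None else None
        self.address = address

ADDRESS_KEYS = {"street", "city", "state"}  # assumed address fields

def build_objects(d):
    # Called innermost dict first, so the address is handled before the person.
    if d and set(d) <= ADDRESS_KEYS:
        return Address(**d)
    if "person" in d:
        # d["person"] is still a dict here, but its "address" is already an Address.
        return Person(**d["person"])
    return d  # anything unrecognised is passed through unchanged

jstr = ('{"person": {"name": "John Doe", "age": "46", '
        '"address": {"street": "4 Yawkey Way", "city": "Boston", "state": "MA"}}}')

p = json.loads(jstr, object_hook=build_objects)
print(p.name, p.age, p.address.city)
```

The pass-through `return d` in the last line of the hook is what keeps the intermediate dictionary intact until the outermost call turns it into a Person.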