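I can't work out how the object_hook feature of json.loads() works. I found a similar question here, "object_hook does not address the full json", and tried to follow it as I understood it, but it still doesn't work for me. I've gathered that the object_hook function is somehow called recursively, but I don't understand how to use it to build a complex object hierarchy from a JSON string. Consider the following JSON string, classes, and object_hook function: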
import json
from pprint import pprint

jstr = '{"person":{ "name": "John Doe", "age": "46", \
         "address": {"street": "4 Yawkey Way", "city": "Boston", "state": "MA"} } }'

class Address:
    def __init__(self, street=None, city=None, state=None):
        self.street = street
        self.city = city
        self.state = state

class Person:
    def __init__(self, name=None, age=None, address=None):
        self.name = name
        self.age = int(age)
        self.address = Address(**address)

def as_person(jdict):
    if u'person' in jdict:
        print('person found')
        person = jdict[u'person']
        return Person(name=person[u'name'], age=person[u'age'],
                      address=person[u'address'])
    else:
        return('person not found')
        return jdict
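(I define the classes with keyword arguments so that defaults are provided: the JSON doesn't need to contain every element, and I can still be sure the attributes exist on the class instances. Eventually I'll also attach methods to the classes, but the instances need to be populated from JSON data.)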
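As a minimal sketch of the keyword-default idea described above (the `partial` dict is a made-up example, not data from the question), unpacking an incomplete dict leaves the missing attributes at `None`. Note one caveat under the same assumption: `int(None)` raises `TypeError`, so a `Person` built without an `"age"` key would fail in `__init__` as the class is written.

```python
class Address(object):
    def __init__(self, street=None, city=None, state=None):
        self.street = street
        self.city = city
        self.state = state

partial = {"city": "Boston"}   # hypothetical JSON object with keys missing
addr = Address(**partial)      # absent keys fall back to the None defaults
print(addr.street, addr.city, addr.state)

# Person.__init__ calls int(age); with the default age=None that call fails.
try:
    int(None)
    age_ok = True
except TypeError:
    age_ok = False
print(age_ok)
```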
If I run:
>>> p = as_person(json.loads(jstr))
I get what I expect, namely:
person found
and p becomes a Person object, namely:
>>> pprint(p.__dict__)
{'address': <__main__.Address instance at 0x0615F3C8>,
 'age': 46,
 'name': u'John Doe'}
>>> pprint(p.address.__dict__)
{'city': u'Boston', 'state': u'MA', 'street': u'4 Yawkey Way'}
However, if I instead try:
>>> p = json.loads(jstr, object_hook=as_person)
I get:
person found
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "C:\Program Files (x86)\Python27\lib\json\__init__.py", line 339, in loads
    return cls(encoding=encoding, **kw).decode(s)
  File "C:\Program Files (x86)\Python27\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Program Files (x86)\Python27\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "<interactive input>", line 5, in as_person
TypeError: string indices must be integers, not unicode
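I have no idea why this happens, and I suspect there is some subtlety about how the object_hook mechanism works that I'm missing.
Trying to incorporate the idea from the question mentioned above, namely that object_hook evaluates each nested dictionary from the bottom up (and replaces it during the traversal?), I also tried: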
def as_person2(jdict):
    if u'person' in jdict:
        print('person found')
        person = jdict[u'person']
        return Person2(name=person[u'name'], age=person[u'age'], address=person[u'address'])
    elif u'address' in jdict:
        print('address found')
        return Address(jdict[u'address'])
    else:
        return('person not found')
        return jdict
>>> json.loads(jstr, object_hook=as_person2)
address found
person found
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "C:\Program Files (x86)\Python27\lib\json\__init__.py", line 339, in loads
    return cls(encoding=encoding, **kw).decode(s)
  File "C:\Program Files (x86)\Python27\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Program Files (x86)\Python27\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "<interactive input>", line 5, in as_person2
AttributeError: Address instance has no attribute '__getitem__'
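So, clearly, the proper form of the object_hook function is escaping me.
Can someone explain in detail how the object_hook mechanism works, how the resulting object tree is constructed recursively from the bottom up, and why my code doesn't work as expected, and either fix my example or provide one that uses an object_hook function to build a complex class, given that you only get a single object_hook function?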
Answer 0 (score: 2)
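Through experimentation, I have answered my own question. This may not be the best solution, and I welcome further analysis or a better approach, but it sheds light on how the object_hook process works, so it may be useful to others facing the same problem.
The key observation from my experiments is that, at every level of the JSON tree traversal, the object_hook mechanism appears to expect you to return a dictionary. So if you want to turn sub-dictionaries into class instances, you have to replace the values in the input dictionary of the current object_hook call with objects, rather than just returning the object instance.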
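To see the traversal order directly, here is a minimal tracing sketch (the names `tracing_hook`, `calls`, and `doc` are illustrative, not from the original post). It records the keys of every dict passed to the hook and deliberately returns a plain string for the address dict. The `json` documentation says the hook's return value is used in place of the decoded dict, and this sketch suggests that value does not have to be a dict itself.

```python
import json

calls = []  # keys of every dict handed to the hook, in call order

def tracing_hook(d):
    calls.append(sorted(d))
    # Return a non-dict for the address to show that the parent dict
    # receives whatever the hook returned for its child.
    if "street" in d:
        return "ADDR:" + d["city"]
    return d

doc = ('{"person": {"name": "John Doe", '
       '"address": {"street": "4 Yawkey Way", "city": "Boston"}}}')
result = json.loads(doc, object_hook=tracing_hook)

print(calls)   # innermost dict first, outermost last
print(result)  # the address slot now holds the string from the hook
```

The innermost object is decoded first, so by the time the hook sees a parent dict, its children have already been replaced by the hook's earlier return values.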
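The solution below allows a bottom-up approach to building the object hierarchy. I have inserted print statements to show how the hook is called as the sub-parts of the JSON string are processed, which I found enlightening and helpful in building a working function.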
import json
from pprint import pprint

jstr = '{"person":{ "name": "John Doe", "age": "46", \
         "address": {"street": "4 Yawkey Way", "city": "Boston", "state": "MA"} } }'

class Address:
    def __init__(self, street=None, city=None, state=None):
        self.street = street
        self.city = city
        self.state = state
    def __repr__(self):
        return('Address(street={self.street!r}, city={self.city!r}, '
               'state={self.state!r})'.format(self=self))

class Person:
    def __init__(self, name=None, age=None, address=None):
        self.name = name
        self.age = int(age)
        self.address = address
    def __repr__(self):
        return('Person(name={self.name!r}, age={self.age!r},\n'
               '       address={self.address!r})'.format(self=self))

def as_person4(jdict):
    if 'person' in jdict:
        print('person in jdict; (before substitution):')
        pprint(jdict)
        # replace the nested dict with a Person, but still return a dict
        jdict['person'] = Person(**jdict['person'])
        print('after substitution:')
        pprint(jdict)
        print
        return jdict
    elif 'address' in jdict:
        print('address in jdict; (before substitution):')
        pprint(jdict)
        # replace the nested dict with an Address, but still return a dict
        jdict['address'] = Address(**jdict['address'])
        print('after substitution:')
        pprint(jdict)
        print
        return jdict
    else:
        print('jdict:')
        pprint(jdict)
        print
        return jdict
>>> p = json.loads(jstr, object_hook=as_person4)
jdict:
{u'city': u'Boston', u'state': u'MA', u'street': u'4 Yawkey Way'}
address in jdict; (before substitution):
{u'address': {u'city': u'Boston', u'state': u'MA', u'street': u'4 Yawkey Way'},
 u'age': u'46', u'name': u'John Doe'}
after substitution:
{u'address': Address(street=u'4 Yawkey Way', city=u'Boston', state=u'MA'),
 u'age': u'46', u'name': u'John Doe'}
person in jdict; (before substitution):
{u'person': {u'address': Address(street=u'4 Yawkey Way', city=u'Boston', state=u'MA'),
 u'age': u'46', u'name': u'John Doe'}}
after substitution:
{u'person': Person(name=u'John Doe', age=46,
       address=Address(street=u'4 Yawkey Way', city=u'Boston', state=u'MA'))}
>>> p
{u'person': Person(name=u'John Doe', age=46,
       address=Address(street=u'4 Yawkey Way', city=u'Boston', state=u'MA'))}
>>>
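Note that what is returned is still a dictionary whose key is 'person' and whose value is the Person object (rather than just the Person object), but this solution does provide an extensible, bottom-up way of constructing objects.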
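If the goal is to get the Person itself rather than a wrapping dictionary, one possible variation is to have the hook return the instances directly. The sketch below is an assumption-laden illustration, not the approach of this answer: `build_objects` and `ADDRESS_KEYS` are hypothetical names, and recognising an address by its key set is a heuristic that would misfire if another kind of object used the same keys.

```python
import json

class Address(object):
    def __init__(self, street=None, city=None, state=None):
        self.street, self.city, self.state = street, city, state

class Person(object):
    def __init__(self, name=None, age=None, address=None):
        self.name = name
        self.age = int(age) if age is not None else None
        self.address = address

ADDRESS_KEYS = {"street", "city", "state"}  # assumed "signature" of an address

def build_objects(d):
    # Inner dicts arrive first, so the address is converted before the
    # dict that contains it is passed to the hook.
    if d and set(d) <= ADDRESS_KEYS:
        return Address(**d)
    if "person" in d:
        # d["person"]["address"] is already an Address instance here.
        return Person(**d["person"])
    return d  # leave every other dict untouched

jstr = ('{"person": {"name": "John Doe", "age": "46", '
        '"address": {"street": "4 Yawkey Way", "city": "Boston", "state": "MA"}}}')
p = json.loads(jstr, object_hook=build_objects)
print(type(p).__name__, p.age, p.address.city)
```

The dict holding name, age, and address is passed through unchanged and only converted once the outer dict with the "person" key arrives, which keeps the Person construction in one place.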
Answer 1 (score: 0)
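I agree that it's unintuitive, but you can simply ignore the dictionary passed in when it isn't the kind of object you're interested in. That means this is probably the simplest way:
(As you can see, you don't need all those u string prefixes either.)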
import json

jstr = '{"person": { "name": "John Doe", "age": "46", \
         "address": {"street": "4 Yawkey Way", "city": "Boston", "state": "MA"} } }'

class Address:
    def __init__(self, street=None, city=None, state=None):
        self.street = street
        self.city = city
        self.state = state
    def __repr__(self):  # optional - added so print displays something useful
        return('Address(street={self.street!r}, city={self.city!r}, '
               'state={self.state!r})'.format(self=self))

class Person:
    def __init__(self, name=None, age=None, address=None):
        self.name = name
        self.age = int(age)
        self.address = address
    def __repr__(self):  # optional - added so print displays something useful
        return('Person(name={self.name!r}, age={self.age!r},\n'
               '       address={self.address!r})'.format(self=self))

def as_person3(jdict):
    if 'person' not in jdict:
        return jdict
    else:
        person = jdict['person']
        address = Address(**person['address'])
        return Person(name=person['name'], age=person['age'], address=address)

p = json.loads(jstr, object_hook=as_person3)
print(p)
Output:
Person(name=u'John Doe', age=46,
       address=Address(street=u'4 Yawkey Way', city=u'Boston', state=u'MA'))
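For completeness, here is a hedged reconstruction of why the first attempt in the question raised TypeError. The stripped-down `faulty_hook` below is an illustrative stand-in for as_person, not the original code. It returns a string for every dict without a "person" key, so by the time the outer dict arrives, the value under "person" is that string rather than a dict, and indexing it by key fails.

```python
import json

def faulty_hook(d):
    if "person" in d:
        person = d["person"]   # already replaced by an earlier hook call
        return person["name"]  # indexing a str with a str key: TypeError
    return "person not found"  # inner dicts are replaced by this string

jstr = '{"person": {"name": "John Doe", "address": {"city": "Boston"}}}'

try:
    json.loads(jstr, object_hook=faulty_hook)
    outcome = "no error"
except TypeError as exc:
    outcome = "TypeError: {0}".format(exc)

print(outcome)
```

Returning the dict unchanged for anything the hook does not recognise, as as_person3 does, avoids this.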