I'm having some trouble getting this to work correctly. My data looks like this:
{
"completedProtocol": "Extract",
"map": [
{
"sampleIDsIn": [{ "clarityId": "claritySample1", "espId": "ESP024254" }, { "clarityId": "claritySample1", "espId": "ESP024255" }, { "clarityId": "claritySample1", "espId": "ESP024256"}],
"sampleIDsOut": ["claritySample3", "claritySample4", "claritySample5"],
"files":["http://fileserver.net/path/to/datafile3"]
}
],
"map": [
{
"sampleIDsIn": [{ "clarityId": "claritySample1", "espId": "ESP024258" }, { "clarityId": "claritySample1", "espId": "ESP024259" }, { "clarityId": "claritySample1", "espId": "ESP024260"}],
"sampleIDsOut": ["claritySample3", "claritySample4", "claritySample5"],
"files":["http://fileserver.net/path/to/datafile3"]
}
]
}
I would like to convert it to:
[{"map": [
{
"sampleIDsIn": [{ "clarityId": "claritySample1", "espId": "ESP024254" }, { "clarityId": "claritySample1", "espId": "ESP024255" }, { "clarityId": "claritySample1", "espId": "ESP024256"}],
"sampleIDsOut": ["claritySample3", "claritySample4", "claritySample5"],
"files":["http://fileserver.net/path/to/datafile3"]
}
]},
{"map":[
{
"sampleIDsIn": [{ "clarityId": "claritySample1", "espId": "ESP024258" }, { "clarityId": "claritySample1", "espId": "ESP024259" }, { "clarityId": "claritySample1", "espId": "ESP024260"}],
"sampleIDsOut": ["claritySample3", "claritySample4", "claritySample5"],
"files":["http://fileserver.net/path/to/datafile3"]
}
]}]
My code so far is:
import json
obj = json.loads(body)
newData = [dct for dct in obj if 'map' in dct]
But this only returns:
[u'map']
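If I just call json.loads on body, it returns only the second value of map, which overwrites the first.
Note: I want a list of single-item dicts; I don't want the values collected together under a single key.
Any ideas?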
Answer 0 (score: 1)
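You can use a custom object_pairs_hook function to force json.loads() to return a list of single-item dicts, instead of a single dict in which duplicate keys overwrite one another: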
import json

def keep_duplicates(ordered_pairs):
    # Build one single-item dict per key/value pair so repeated keys survive.
    result = []
    for key, value in ordered_pairs:
        result.append({key: value})
    return result
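From the docs:
object_pairs_hook is an optional function that will be called with the result of any object literal decoded with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders that rely on the order that the key and value pairs are decoded (for example, collections.OrderedDict() will remember the order of insertion). If object_hook is also defined, the object_pairs_hook takes priority.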
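To make the quoted behaviour concrete, here is a small sketch of what the hook actually receives. The helper name record_pairs and the sample JSON string are made up for illustration; the hook simply records its argument and then builds an ordinary dict, so the usual "last value wins" behaviour is preserved.

```python
import json

seen = []  # every argument the decoder passes to the hook, in call order

def record_pairs(ordered_pairs):
    # Hypothetical helper: remember the list of (key, value) pairs,
    # then fall back to a plain dict (a later duplicate key wins).
    seen.append(ordered_pairs)
    return dict(ordered_pairs)

result = json.loads('{"a": 1, "a": 2, "b": {"c": 3}}',
                    object_pairs_hook=record_pairs)

# The nested object is finished first, so its pairs are recorded first.
print(seen)
print(result)
```

The duplicate "a" pairs are both still present in the list handed to the hook; they are only lost once the pairs are collapsed into a dict.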
Usage:
>>> json.loads('{"a": 1, "a": 2, "a": 3}', object_pairs_hook=keep_duplicates)
[{u'a': 1}, {u'a': 2}, {u'a': 3}]
In your case, since you apparently aren't interested in anything other than the "map" keys, you can filter the result afterwards:
all_data = json.loads(body, object_pairs_hook=keep_duplicates)
map_data = [x for x in all_data if 'map' in x]
...which will give you the result specified in the question.
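One thing to watch for: json.loads() calls object_pairs_hook for every JSON object, nested ones included, so with keep_duplicates the inner objects (for example each entry in sampleIDsIn) also come back as lists of single-item dicts. If that is not what you want, a possible variant is sketched below. It assumes you only want to split objects that actually contain a repeated key; split_on_duplicates and the shortened body string are hypothetical and used only for illustration.

```python
import json

def split_on_duplicates(ordered_pairs):
    # Hypothetical variant: return a normal dict when all keys are unique,
    # and a list of single-item dicts only when some key is repeated.
    keys = [key for key, _ in ordered_pairs]
    if len(keys) == len(set(keys)):
        return dict(ordered_pairs)
    return [{key: value} for key, value in ordered_pairs]

# Shortened stand-in for the request body from the question.
body = ('{"completedProtocol": "Extract", '
        '"map": [{"files": ["a"]}], '
        '"map": [{"files": ["b"]}]}')

all_data = json.loads(body, object_pairs_hook=split_on_duplicates)
map_data = [item for item in all_data if 'map' in item]
print(map_data)
```

Note that if the top-level object happens to contain no duplicate keys, all_data is a plain dict rather than a list, so the filtering step would need to handle that case separately.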