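I have been looking at the answers to this question: How can I select deeply nested key:values from dictionary in python
But my problem is not finding a single key in a deeply nested data structure; it is finding all occurrences of a particular key.
For example, suppose we modify the data structure from the first example there like this: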
[
    "stats": {
        "success": true,
        "payload": {
            "tag": {
                "slug": "python",
                "name": "Python",
                "postCount": 10590,
                "virtuals": {
                    "isFollowing": false
                }
            },
            "metadata": {
                "followerCount": 18053,
                "postCount": 10590,
                "coverImage": {
                    "id": "1*O3-jbieSsxcQFkrTLp-1zw.gif",
                    "originalWidth": 550,
                    "originalHeight": 300
                }
            }
        }
    },
    "stats": {
        "success": true,
        "payload": {
            "tag": {
                "slug": "python",
                "name": "Python",
                "postCount": 10590,
                "virtuals": {
                    "isFollowing": false
                }
            },
            "metadata": {
                "followerCount": 18053,
                "postCount": 10590,
                "coverImage": {
                    "id": "1*O3-jbieSsxcQFkrTLp-1zw.gif",
                    "originalWidth": 550,
                    "originalHeight": 300
                }
            }
        }
    }
]
How can I get all occurrences of "metadata" here?
Answer 0 (score: 3)
How about something recursive?
def extractVals(obj, key, resList):
    # Append the value of every occurrence of `key`, at any depth, to resList.
    if type(obj) == dict:
        if key in obj:
            resList.append(obj[key])
        for k, v in obj.items():
            extractVals(v, key, resList)
    if type(obj) == list:
        for l in obj:
            extractVals(l, key, resList)

resultList1 = []
# `dat` is the (fixed) data structure from the question
extractVals(dat, 'metadata', resultList1)
print(resultList1)
which yields:
[{'coverImage': {'id': '1*O3-jbieSsxcQFkrTLp-1zw.gif',
                 'originalHeight': 300,
                 'originalWidth': 550},
  'followerCount': 18053,
  'postCount': 10590},
 {'coverImage': {'id': '1*O3-jbieSsxcQFkrTLp-1zw.gif',
                 'originalHeight': 300,
                 'originalWidth': 550},
  'followerCount': 18053,
  'postCount': 10590}]
I also had to modify your dataset slightly to make it a valid Python structure: true -> True, false -> False, and I removed the keys from the top-level list.
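As an aside on that dataset fix: if the data arrives as JSON text, the standard-library `json.loads` does the `true` -> `True` and `false` -> `False` conversion for you. Below is a minimal sketch, assuming a shortened, repaired copy of the question's data in which each `"stats"` key is wrapped in its own object. The names `RAW` and `iter_values` are made up for this illustration, and `iter_values` is a generator variant of the recursion above, not the original function.

```python
import json

# Hypothetical, shortened version of the question's data, repaired so that
# it is valid JSON: each "stats" key lives in its own object.
RAW = """
[
  {"stats": {"success": true,
             "payload": {"tag": {"slug": "python"},
                         "metadata": {"followerCount": 18053, "postCount": 10590}}}},
  {"stats": {"success": true,
             "payload": {"tag": {"slug": "python"},
                         "metadata": {"followerCount": 18053, "postCount": 10590}}}}
]
"""


def iter_values(obj, key):
    """Yield the value of every occurrence of `key`, at any depth."""
    if isinstance(obj, dict):
        if key in obj:
            yield obj[key]
        for value in obj.values():
            yield from iter_values(value, key)
    elif isinstance(obj, list):
        for item in obj:
            yield from iter_values(item, key)


dat = json.loads(RAW)  # JSON true/false become Python True/False
found = list(iter_values(dat, "metadata"))
print(found)
```

Returning a generator avoids passing a result list through every call; wrap it in `list(...)` when you need all matches at once.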
Answer 1 (score: 0)
You can use a custom class like this:
class DeepDict:
    def __init__(self, data):
        self.data = data

    @classmethod
    def _deep_find(cls, data, key, root, response):
        # Build dotted paths such as "0.stats.payload.metadata"
        if root:
            root += "."
        if isinstance(data, list):
            for i, item in enumerate(data):
                cls._deep_find(item, key, root + str(i), response)
        elif isinstance(data, dict):
            if key in data:
                response.append(root + key)
            for data_key, value in data.items():
                cls._deep_find(value, key, root + data_key, response)
        return response

    def deep_find(self, key):
        """ Returns all occurrences of `key` with a dotted path leading to each.
        Use `deep_get` to retrieve the values for a given occurrence, or
        `get_all` to iterate over the values for each occurrence of the key.
        """
        return self._deep_find(self.data, key, root="", response=[])

    @classmethod
    def _deep_get(cls, data, path):
        if not path:
            return data
        index = path.pop(0)
        # Purely numeric path components are treated as list indices
        if index.isdigit():
            index = int(index)
        return cls._deep_get(data[index], path)

    def deep_get(self, path):
        if isinstance(path, str):
            path = path.split(".")
        return self._deep_get(self.data, path)

    def get_all(self, key):
        for path in self.deep_find(key):
            yield self.deep_get(path)

    def __getitem__(self, key):
        if key.isdigit():
            key = int(key)
        return self.data[key]
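(Note that although I named it "DeepDict", it is actually a generic JSON container that works with both lists and dicts as the outer element. By the way, the JSON snippet in your question is broken: both "stats": keys should be wrapped in an extra { }.)
So, these three custom methods can find the exact "path" to each occurrence of a key, or you can use the get_all method to simply get, as an iterator, the contents of every key with that name in the structure.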
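To make the idea of a dotted "path" concrete, here is a small standalone sketch, independent of the class, that yields each path together with its value. The function name `iter_paths` and the sample data are hypothetical; like the class, it assumes that keys do not themselves contain dots.

```python
def iter_paths(data, key, root=""):
    """Yield (dotted_path, value) for every occurrence of `key` in `data`."""
    prefix = root + "." if root else ""
    if isinstance(data, list):
        for i, item in enumerate(data):
            yield from iter_paths(item, key, prefix + str(i))
    elif isinstance(data, dict):
        if key in data:
            yield prefix + key, data[key]
        for k, v in data.items():
            yield from iter_paths(v, key, prefix + str(k))


# Hypothetical sample, much smaller than the data in the question
sample = [
    {"stats": {"payload": {"metadata": {"postCount": 1}}}},
    {"stats": {"payload": {"metadata": {"postCount": 2}}}},
]
paths = [path for path, _ in iter_paths(sample, "metadata")]
print(paths)
# ['0.stats.payload.metadata', '1.stats.payload.metadata']
```

Numeric path components such as `0` and `1` are list indices, which is why the lookup side has to convert them back with `int` before indexing.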
Using the class above, after fixing the data, I did:
data = DeepDict(<data structure above (fixed)>)
list(data.get_all("metadata"))
and got the output: