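This seems simple, but none of the typical answers have turned up.
The nature of the tool I'm working on dictates that we use MongoDB to store "settings" for about 25 different tools. Each tool has its own settings schema, so every document is different, but they are all stored in the same collection and edited on the same edit page, which is drawn by a JSON schema.
Without knowing the schema of the dictionary, I'm struggling to figure out how to iterate over and sanitize the data, specifically removing the passwords.
Given the following dictionary, and knowing that other dictionaries may have different schemas, how can I traverse every item in the dictionary and create a copy that is identical except that any key == "password" is removed?
So: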
{
"_enabled": true,
"instances": [
{
"isdefault": true,
"name": "dev",
"password": "abc123",
"url": "http://dev.example.com",
"user": "buffy"
},
{
"isdefault": false,
"name": "prod",
"password": "xxxxx",
"url": "http://prod.example.com",
"user": "spike"
},
{
"isdefault": false,
"name": "qa",
"password": "dasddf",
"url": "http://prod.example.com",
"user": "willow"
}
],
"label": "MyServers"
}
should result in:
{
"_enabled": true,
"instances": [
{
"isdefault": true,
"name": "dev",
"url": "http://dev.example.com",
"user": "buffy"
},
{
"isdefault": false,
"name": "prod",
"url": "http://prod.example.com",
"user": "spike"
},
{
"isdefault": false,
"name": "qa",
"url": "http://prod.example.com",
"user": "willow"
}
],
"label": "MyServers"
}
Answer 0 (score: 3)
First deepcopy the dict, then visit every dictionary inside it and remove the password keys:
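from collections.abc import Iterable
from copy import deepcopy
from pprint import pprint as pp

def remove_pass(v):
    # Recursively delete every "password" key, in place.
    if isinstance(v, dict):
        if "password" in v:
            del v["password"]
        for ele in v.values():
            remove_pass(ele)
    # Strings are iterable too, so skip them to avoid endless recursion.
    elif isinstance(v, Iterable) and not isinstance(v, (str, bytes)):
        for ele in v:
            remove_pass(ele)

d = deepcopy(d)  # work on a copy so the original dict is left untouched
remove_pass(d)   # start at the top level so a top-level "password" is removed too
pp(d)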
Input:
{'_enabled': 'true',
'foo': {'isdefault': 'false',
'name': 'qa',
'nested': {'password': 'nested'},
'password': 'dasddf',
'url': 'http://prod.example.com',
'user': 'willow'},
'instances': [{'isdefault': 'true',
'name': 'dev',
'password': 'abc123',
'url': 'http://dev.example.com',
'user': 'buffy'},
{'isdefault': 'false',
'name': 'prod',
'nested': {'more_nesting': {'even_more_nesting': ({'password': 'foobar'},
{'password': 'foob'}),
'password': 'bar'}},
'password': 'xxxxx',
'url': 'http://prod.example.com',
'user': 'spike'},
{'isdefault': 'false',
'name': 'qa',
'password': 'dasddf',
'url': 'http://prod.example.com',
'user': 'willow'}],
'label': 'MyServers'}
Output:
{'_enabled': 'true',
'foo': {'isdefault': 'false',
'name': 'qa',
'nested': {},
'url': 'http://prod.example.com',
'user': 'willow'},
'instances': [{'isdefault': 'true',
'name': 'dev',
'url': 'http://dev.example.com',
'user': 'buffy'},
{'isdefault': 'false',
'name': 'prod',
'nested': {'more_nesting': {'even_more_nesting': ({}, {})}},
'url': 'http://prod.example.com',
'user': 'spike'},
{'isdefault': 'false',
'name': 'qa',
'url': 'http://prod.example.com',
'user': 'willow'}],
'label': 'MyServers'}
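Since the question asks for a copy, the same result can also be reached without deepcopy by building the sanitized structure directly. The following is only a minimal sketch under some assumptions: the name without_key is made up for illustration, only dicts, lists and tuples are treated as containers, and leaf values are shared with the original rather than copied.

```python
def without_key(obj, key="password"):
    """Return a copy of obj with every dict entry named `key` left out."""
    if isinstance(obj, dict):
        # Rebuild the dict, skipping the unwanted key and recursing into values.
        return {k: without_key(v, key) for k, v in obj.items() if k != key}
    if isinstance(obj, (list, tuple)):
        # Rebuild the sequence with the same type (list stays list, tuple stays tuple).
        return type(obj)(without_key(item, key) for item in obj)
    # Anything else (str, int, bool, None, ...) is returned as-is.
    return obj


settings = {
    "label": "MyServers",
    "instances": [
        {"name": "dev", "password": "abc123", "user": "buffy"},
        {"name": "prod", "nested": ({"password": "foobar"}, {"password": "foob"})},
    ],
}
cleaned = without_key(settings)
```

The original settings dict is not modified, so no separate copy step is needed.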
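Answer 1 (score: 0)
If you know the structure of each piece of data (i.e., at what depth in the arrays/dictionaries to expect the "password" key), this is simple. You just loop through the list items and dictionaries to find the "password" key.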
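For the sample in the question, where the passwords are assumed to sit only in the entries of the "instances" list, that loop might look like the sketch below. The variable names are illustrative and the data is a shortened version of the question's document.

```python
import copy

settings = {
    "_enabled": True,
    "instances": [
        {"name": "dev", "password": "abc123", "user": "buffy"},
        {"name": "prod", "password": "xxxxx", "user": "spike"},
    ],
    "label": "MyServers",
}

# Work on a copy so the stored settings keep their passwords.
cleaned = copy.deepcopy(settings)
for instance in cleaned.get("instances", []):
    # pop with a default does nothing if the key is absent.
    instance.pop("password", None)
```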
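If the structure of each settings dictionary really is unpredictable, then you have to hack a solution together. What I do in that situation is dump my JSON to a string, use a regular expression to remove/isolate the data I'm interested in, and then load the string back into structured JSON.
Something like this:
import json
import re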
raw_data = """{
"_enabled": true,
"instances": [
{
"isdefault": true,
"name": "dev",
"password": "abc123",
"url": "http://dev.example.com",
"user": "buffy"
},
{
"isdefault": false,
"name": "prod",
"password": "xxxxx",
"url": "http://prod.example.com",
"user": "spike"
},
{
"isdefault": false,
"name": "qa",
"password": "dasddf",
"url": "http://prod.example.com",
"user": "willow"
}
],
"label": "MyServers"
}"""
# I load and then dump my raw_data just to iron out any inconsistencies
# in formatting before applying regex. i.e., inconsistent use of " instead of '
structured_data = json.loads(raw_data)
dumped_data = json.dumps(structured_data)
scrubbed = re.sub(r'"password": ".*?",', '', dumped_data)
structured_scrubbed = json.loads(scrubbed)
Result:
structured_scrubbed = {'_enabled': True,
'instances': [{'isdefault': True,
'name': 'dev',
'url': 'http://dev.example.com',
'user': 'buffy'},
{'isdefault': False,
'name': 'prod',
'url': 'http://prod.example.com',
'user': 'spike'},
{'isdefault': False,
'name': 'qa',
'url': 'http://prod.example.com',
'user': 'willow'}],
'label': 'MyServers'}
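Note that the regular expression relies on each password being followed by a comma, so it can miss a password that is the last key of its object, and a password value containing `",` could leave broken JSON behind. A possible alternative that keeps the load-from-string idea is to let the parser drop the key while decoding, using the object_hook parameter of json.loads. This is only a sketch; drop_passwords is a made-up name and the input is a shortened example.

```python
import json


def drop_passwords(obj):
    # json.loads calls this for every decoded JSON object (dict), innermost first.
    return {k: v for k, v in obj.items() if k != "password"}


# The password is the last key here and its value contains '",'.
raw = '{"label": "MyServers", "instances": [{"name": "dev", "password": "a\\",b"}]}'
scrubbed = json.loads(raw, object_hook=drop_passwords)
print(scrubbed)
```

Because the key is removed after parsing, the result does not depend on key order or on what characters the password contains.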
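Answer 2 (score: 0)
I use a generic function, getPaths, to find the paths to a specific key in a nested dictionary. You can use it to find all the paths to the "password" key and then change or delete it. This works for all formats/schemas of the JSON string.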
def getPaths(dictionary, searchKey):
    '''
    Generator that yields every path to searchKey in the nested dictionary.
    '''
    for k, v in dictionary.items():
        if k == searchKey:
            yield []
        elif isinstance(v, dict):
            # if the value is a dict, go in recursively and yield the path
            for subkey in getPaths(v, searchKey):
                yield [k] + subkey
        elif isinstance(v, list):
            # if the value is a list, go into each dict element recursively and yield the path
            for i, item in enumerate(v):
                if isinstance(item, dict):
                    for subkey in getPaths(item, searchKey):
                        yield [k] + [i] + subkey
jsonstring = """{
"_enabled": true,
"instances": [
{
"isdefault": true,
"name": "dev",
"password": "abc123",
"url": "http://dev.example.com",
"user": "buffy"
},
{
"isdefault": false,
"name": "prod",
"password": "xxxxx",
"url": "http://prod.example.com",
"user": "spike"
},
{
"instance2": {
"isdefault": false,
"name": "qa",
"password": "dasddf",
"url": "http://prod.example.com",
"user": "willow"
}
}
],
"label": "MyServers"
}"""
import json

jsonObj = json.loads(jsonstring)
paths = getPaths(jsonObj, "password")
for path in paths:
    print('path:', path)
Result:
>>> path: ['instances', 0]
>>> path: ['instances', 1]
>>> path: ['instances', 2, 'instance2']
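Each path points at the container that holds the key, not at the key itself, so deleting still takes one more step. A minimal sketch of that step follows; remove_key is a hypothetical helper, and getPaths is repeated only so the block runs on its own.

```python
import copy


def getPaths(dictionary, searchKey):
    """Yield the path to each container that directly holds searchKey."""
    for k, v in dictionary.items():
        if k == searchKey:
            yield []
        elif isinstance(v, dict):
            for subpath in getPaths(v, searchKey):
                yield [k] + subpath
        elif isinstance(v, list):
            for i, item in enumerate(v):
                if isinstance(item, dict):
                    for subpath in getPaths(item, searchKey):
                        yield [k, i] + subpath


def remove_key(data, searchKey):  # hypothetical helper, not part of the answer
    result = copy.deepcopy(data)
    # Collect all paths first: deleting while the generator is still
    # iterating over the dicts would raise a RuntimeError.
    for path in list(getPaths(result, searchKey)):
        parent = result
        for step in path:
            parent = parent[step]  # step is a dict key or a list index
        del parent[searchKey]
    return result


settings = {
    "instances": [
        {"name": "dev", "password": "abc123"},
        {"instance2": {"name": "qa", "password": "dasddf"}},
    ],
    "label": "MyServers",
}
cleaned = remove_key(settings, "password")
```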
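Answer 3 (score: 0)
Assuming you only want to check containers that are lists or dictionaries, and remove the entries with key = "password" from the dictionaries: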
import copy

# first copy the structure
new_data = copy.deepcopy(data)

# This is a recursive function.
# Heavily nested structures may fail due to the recursion limit.
def clean_hierarchy(ele):
    # lists may contain dictionaries, so clean these entries
    if isinstance(ele, list):
        for val in ele:
            clean_hierarchy(val)
    if isinstance(ele, dict):
        # remove a possible password entry
        ele.pop("password", None)
        # a dictionary may contain more dictionaries. Rinse and repeat!
        for val in ele.values():
            clean_hierarchy(val)

clean_hierarchy(new_data)
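If the recursion limit mentioned in the comments is a real concern, the traversal can be written with an explicit stack instead. This is only a sketch: clean_in_place is a made-up name, only dicts and lists are treated as containers, and it modifies its argument directly, because copy.deepcopy is itself recursive and may hit the same limit on very deep data.

```python
def clean_in_place(data, key="password"):  # hypothetical name
    """Remove `key` from every dict reachable through dicts and lists."""
    stack = [data]  # containers still waiting to be visited
    while stack:
        ele = stack.pop()
        if isinstance(ele, dict):
            ele.pop(key, None)          # drop the entry if present
            stack.extend(ele.values())  # visit nested values later
        elif isinstance(ele, list):
            stack.extend(ele)
    return data


settings = {
    "instances": [
        {"name": "dev", "password": "abc123"},
        {"name": "prod", "nested": {"password": "bar"}},
    ],
    "label": "MyServers",
}
clean_in_place(settings)
```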