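I want to generate many different permutations of a JSON structure as representations of the same data set, preferably without hard-coding the implementation. For example, given the following JSON:
{"name": "smith", "occupation": "agent", "enemy": "humanity", "nemesis": "neo"}
it should produce many different permutations, such as: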
{"name":"smith"} -> {"last_name":"smith"}
{"name":"...","occupation":"..."} -> {"occupation":"...", "name":"..."}
{"name":"...","occupation":"..."} -> {"smith":{"occupation":"..."}}
{"name":"...","occupation":"..."} -> {"status": 200, "data":{"name":"...","occupation":"..."}}
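Currently, the implementation looks like this: I am using itertools.permutations and OrderedDict() to range through the possible key and value combinations, as well as the order in which they are returned.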
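import itertools
import json
from collections import OrderedDict

# SchemaLike is my own class; its arguments are elided here.
key_permutations = SchemaLike(...).permutate()
all_simulacrums = []
for key_permutation in key_permutations:
    simulacrums = OrderedDict(key_permutation)
    all_simulacrums.append(simulacrums)
# all_simulacrums is a list, so permute the items of each OrderedDict in turn.
for simulacrum in all_simulacrums:
    for p in itertools.permutations(simulacrum.items()):
        test_data = json.dumps(OrderedDict(p))
        print(test_data)
        assert json.loads(test_data) == simulacrum, 'Oops! {} != {}'.format(test_data, simulacrum)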
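My problem arises when I try to implement permutations of the arrangement and of the templates. I don't know how best to implement this functionality. Any suggestions?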
Answer 0 (score: 5)
For ordering, just use ordered dicts:
>>> import itertools
>>> import json
>>> from collections import OrderedDict
>>> data = OrderedDict(foo='bar', bacon='eggs', bar='foo', eggs='bacon')
>>> for p in itertools.permutations(data.items()):
...     test_data = json.dumps(OrderedDict(p))
...     print(test_data)
...     assert json.loads(test_data) == data, 'Oops! {} != {}'.format(test_data, data)
...
{"foo": "bar", "bacon": "eggs", "bar": "foo", "eggs": "bacon"}
{"foo": "bar", "bacon": "eggs", "eggs": "bacon", "bar": "foo"}
{"foo": "bar", "bar": "foo", "bacon": "eggs", "eggs": "bacon"}
{"foo": "bar", "bar": "foo", "eggs": "bacon", "bacon": "eggs"}
{"foo": "bar", "eggs": "bacon", "bacon": "eggs", "bar": "foo"}
{"foo": "bar", "eggs": "bacon", "bar": "foo", "bacon": "eggs"}
{"bacon": "eggs", "foo": "bar", "bar": "foo", "eggs": "bacon"}
{"bacon": "eggs", "foo": "bar", "eggs": "bacon", "bar": "foo"}
{"bacon": "eggs", "bar": "foo", "foo": "bar", "eggs": "bacon"}
{"bacon": "eggs", "bar": "foo", "eggs": "bacon", "foo": "bar"}
{"bacon": "eggs", "eggs": "bacon", "foo": "bar", "bar": "foo"}
{"bacon": "eggs", "eggs": "bacon", "bar": "foo", "foo": "bar"}
{"bar": "foo", "foo": "bar", "bacon": "eggs", "eggs": "bacon"}
{"bar": "foo", "foo": "bar", "eggs": "bacon", "bacon": "eggs"}
{"bar": "foo", "bacon": "eggs", "foo": "bar", "eggs": "bacon"}
{"bar": "foo", "bacon": "eggs", "eggs": "bacon", "foo": "bar"}
{"bar": "foo", "eggs": "bacon", "foo": "bar", "bacon": "eggs"}
{"bar": "foo", "eggs": "bacon", "bacon": "eggs", "foo": "bar"}
{"eggs": "bacon", "foo": "bar", "bacon": "eggs", "bar": "foo"}
{"eggs": "bacon", "foo": "bar", "bar": "foo", "bacon": "eggs"}
{"eggs": "bacon", "bacon": "eggs", "foo": "bar", "bar": "foo"}
{"eggs": "bacon", "bacon": "eggs", "bar": "foo", "foo": "bar"}
{"eggs": "bacon", "bar": "foo", "foo": "bar", "bacon": "eggs"}
{"eggs": "bacon", "bar": "foo", "bacon": "eggs", "foo": "bar"}
The same principle can be applied to key/value permutations:
>>> for p in itertools.permutations(data.keys()):
...     test_data = json.dumps(OrderedDict(zip(p, data.values())))
...     print(test_data)
...
{"foo": "bar", "bacon": "eggs", "bar": "foo", "eggs": "bacon"}
{"foo": "bar", "bacon": "eggs", "eggs": "foo", "bar": "bacon"}
{"foo": "bar", "bar": "eggs", "bacon": "foo", "eggs": "bacon"}
{"foo": "bar", "bar": "eggs", "eggs": "foo", "bacon": "bacon"}
{"foo": "bar", "eggs": "eggs", "bacon": "foo", "bar": "bacon"}
{"foo": "bar", "eggs": "eggs", "bar": "foo", "bacon": "bacon"}
{"bacon": "bar", "foo": "eggs", "bar": "foo", "eggs": "bacon"}
{"bacon": "bar", "foo": "eggs", "eggs": "foo", "bar": "bacon"}
{"bacon": "bar", "bar": "eggs", "foo": "foo", "eggs": "bacon"}
{"bacon": "bar", "bar": "eggs", "eggs": "foo", "foo": "bacon"}
{"bacon": "bar", "eggs": "eggs", "foo": "foo", "bar": "bacon"}
{"bacon": "bar", "eggs": "eggs", "bar": "foo", "foo": "bacon"}
{"bar": "bar", "foo": "eggs", "bacon": "foo", "eggs": "bacon"}
{"bar": "bar", "foo": "eggs", "eggs": "foo", "bacon": "bacon"}
{"bar": "bar", "bacon": "eggs", "foo": "foo", "eggs": "bacon"}
{"bar": "bar", "bacon": "eggs", "eggs": "foo", "foo": "bacon"}
{"bar": "bar", "eggs": "eggs", "foo": "foo", "bacon": "bacon"}
{"bar": "bar", "eggs": "eggs", "bacon": "foo", "foo": "bacon"}
{"eggs": "bar", "foo": "eggs", "bacon": "foo", "bar": "bacon"}
{"eggs": "bar", "foo": "eggs", "bar": "foo", "bacon": "bacon"}
{"eggs": "bar", "bacon": "eggs", "foo": "foo", "bar": "bacon"}
{"eggs": "bar", "bacon": "eggs", "bar": "foo", "foo": "bacon"}
{"eggs": "bar", "bar": "eggs", "foo": "foo", "bacon": "bacon"}
{"eggs": "bar", "bar": "eggs", "bacon": "foo", "foo": "bacon"}
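And so on... If you don't need all combinations, you can use a predefined set of keys/values. You can also use a for loop with random.choice to flip a coin and skip some combinations, or use random.shuffle at the risk of repeating combinations.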
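The two sampling ideas above could be sketched roughly as follows. The helper names `sample_by_coin_flip` and `sample_by_shuffle` are made up for illustration, and the seeded `random.Random(0)` is only an assumption to make runs repeatable.

```python
import itertools
import random
from collections import OrderedDict


def sample_by_coin_flip(data, rng=random):
    """Walk every ordering of data's items and keep each one on a coin flip."""
    kept = []
    for items in itertools.permutations(data.items()):
        # random.choice acts as the coin; orderings that lose the flip are skipped.
        if rng.choice((True, False)):
            kept.append(OrderedDict(items))
    return kept


def sample_by_shuffle(data, count, rng=random):
    """Build `count` randomly ordered copies; the same ordering may appear twice."""
    samples = []
    for _ in range(count):
        items = list(data.items())
        rng.shuffle(items)  # shuffles the list in place
        samples.append(OrderedDict(items))
    return samples


data = OrderedDict(foo='bar', bacon='eggs', bar='foo', eggs='bacon')
rng = random.Random(0)  # seeded only so that repeated runs give the same samples
coin_flip_samples = sample_by_coin_flip(data, rng=rng)
shuffle_samples = sample_by_shuffle(data, count=5, rng=rng)
```

The coin-flip variant never repeats an ordering but still iterates over all n! permutations. The shuffle variant costs only `count` shuffles but can produce duplicates.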
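For the templates, I guess you have to create a list of different templates (or a list of lists, if you want nested structures) and then iterate over it to create your data. To give better advice, we would need a stricter specification of what you want.
Note that there are several libraries that generate test data in Python: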
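One way to read the "list of templates" idea is as a list of small functions, each taking a record and returning a restructured copy. This is only a sketch: the template functions and the `TEMPLATES` list are hypothetical names, modelled on the nesting and status-envelope examples from the question.

```python
import json
from collections import OrderedDict


def identity(record):
    """Leave the record as it is."""
    return record


def keyed_by_name(record):
    """Nest the remaining fields under the value of the 'name' field."""
    rest = OrderedDict((k, v) for k, v in record.items() if k != 'name')
    return {record['name']: rest}


def status_envelope(record):
    """Wrap the record in a response-style envelope."""
    return OrderedDict([('status', 200), ('data', record)])


# Hypothetical template list; add or remove callables to change the output set.
TEMPLATES = [identity, keyed_by_name, status_envelope]


def render_all(record, templates=TEMPLATES):
    """Apply every template to the record and serialise each result as JSON."""
    return [json.dumps(template(record)) for template in templates]


record = OrderedDict([('name', 'smith'), ('occupation', 'agent')])
for line in render_all(record):
    print(line)
```

Because each template is an ordinary callable, it can be combined with the ordering permutations above by calling `render_all` on every permuted `OrderedDict`.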
>>> from faker import Faker
>>> faker = Faker()
>>> faker.credit_card_full().strip().split('\n')
['VISA 13 digit', 'Jerry Gutierrez', '4885274641760 04/24', 'CVC: 583']
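Faker has several schemas, and it is easy to create your own custom fake-data providers.
Answer 1 (score: 4)
Since shuffling the dict order has already been answered, I will skip it.
I will add to this answer as I think of new things.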
from random import randint
from collections import OrderedDict

# Randomly swaps the key and value of each pair in a dictionary.
# Values must be hashable, and a swapped value may overwrite an existing key.
def random_dict_items(input_dict):
    new_dict = OrderedDict()
    for key, value in input_dict.items():
        # Coin flip: 0 keeps the pair as is, 1 inverts it.
        if randint(0, 1) == 0:
            new_dict[key] = value
        else:
            new_dict[value] = key
    return new_dict
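The question's first example renames a key (name to last_name), which neither swapping nor reordering covers. A possible sketch, assuming you maintain a table of alternative key names yourself: the `ALIASES` table, the 'job' alias and the `rename_variants` helper are all hypothetical.

```python
import itertools
from collections import OrderedDict

# Hypothetical alias table: each key maps to the names it is allowed to take.
ALIASES = {
    'name': ['name', 'last_name'],
    'occupation': ['occupation', 'job'],
}


def rename_variants(record, aliases=ALIASES):
    """Yield one copy of the record per combination of alternative key names."""
    keys = list(record)
    # Keys without an entry in the alias table keep their original name.
    choices = [aliases.get(key, [key]) for key in keys]
    for new_keys in itertools.product(*choices):
        # Values keep their original order; only the key names change.
        yield OrderedDict(zip(new_keys, record.values()))


record = OrderedDict([('name', 'smith'), ('occupation', 'agent'), ('nemesis', 'neo')])
variants = list(rename_variants(record))
```

The number of variants is the product of the alias-list lengths, so it grows quickly as the table does.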