从嵌套的python字典生成所有组合并隔离它们

时间:2016-03-29 13:26:38

标签: python dictionary nested combinations permutation

我的样本字典是:

sample_dict = {
    'company': {
        'employee': {
            'name': [
                {'explore': ["noname"],
                 'valid': ["john","tom"],
                 'boundary': ["aaaaaaaaaa"],
                 'negative': ["$"]}],
            'age': [
                {'explore': [200],
                 'valid': [20,30],
                 'boundary': [1,99],
                 'negative': [-1,100]}],
            'others':{
                'grade':[
                    {'explore': ["star"],
                     'valid': ["A","B"],
                     'boundary': ["C"],
                     'negative': ["AB"]}]}
    }
}}

它是"后续"问题 - > Split python dictionary to result in all combinations of values
我想获得一个隔离的组合列表,如下所示
有效组合:[仅生成有效的数据列表]
有效类别的完整输出:

{'company': {'employee': {'age': 20}, 'name': 'john', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 20}, 'name': 'john', 'others': {'grade': 'B'}}}
{'company': {'employee': {'age': 20}, 'name': 'tom', 'others': {'grade': 'A'}}} 
{'company': {'employee': {'age': 20}, 'name': 'tom', 'others': {'grade': 'B'}}} 
{'company': {'employee': {'age': 30}, 'name': 'john', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 30}, 'name': 'john', 'others': {'grade': 'B'}}}
{'company': {'employee': {'age': 30}, 'name': 'tom', 'others': {'grade': 'A'}}} 
{'company': {'employee': {'age': 30}, 'name': 'tom', 'others': {'grade': 'B'}}}

负组合:[这里有点棘手,因为负组合应与&#34结合;有效"游泳池以及至少只有价值为负的情况。
NEGATIVE类别的完整输出:
=> [基本上,排除所有值都有效的组合 - 确保组合中的至少一个值来自负组]

{'company': {'employee': {'age': 20}, 'name': 'john', 'others': {'grade': 'AB'}}}
{'company': {'employee': {'age': -1}, 'name': 'tom', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 100}, 'name': 'john', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 30}, 'name': '$', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 30}, 'name': '$', 'others': {'grade': 'AB'}}}
{'company': {'employee': {'age': -1}, 'name': '$', 'others': {'grade': 'AB'}}}
{'company': {'employee': {'age': 100}, 'name': '$', 'others': {'grade': 'AB'}}}

在上面的输出中,在第一行中,通过保持全部有效来测试等级AB的负值。因此,没有必要生成年龄为30的相同,因为目的是仅测试负集。我们可以提供剩余参数和任何有效数据。


边界组合类似于有效 - >仅边界池内所有值的组合
探索:类似于否定 - 与有效池混合,并且在所有组合中始终至少有一个探索值
样本字典 - 修订版本

sample_dict2 = {
    'company': {
        'employee_list': [
            {'employee': {'age': [{'boundary': [1,99],
                                   'explore': [200],
                                   'negative': [-1,100],
                                   'valid': [20, 30]}],
                          'name': [{'boundary': ['aaaaaaaaaa'],
                                    'explore': ['noname'],
                                    'negative': ['$'],
                                    'valid': ['john','tom']}],
                          'others': {
                              'grade': [
                                  {'boundary': ['C'],
                                   'explore': ['star'],
                                   'negative': ['AB'],
                                   'valid': ['A','B']},
                                  {'boundary': ['C'],
                                   'explore': ['star'],
                                   'negative': ['AB'],
                                   'valid': ['A','B']}]}}},
            {'employee': {'age': [{'boundary': [1, 99],
                                   'explore': [200],
                                   'negative': [],
                                   'valid': [20, 30]}],
                          'name': [{'boundary': [],
                                    'explore': [],
                                    'negative': ['$'],
                                    'valid': ['john', 'tom']}],
                          'others': {
                              'grade': [
                                  {'boundary': ['C'],
                                   'explore': ['star'],
                                   'negative': [],
                                   'valid': ['A', 'B']},
                                  {'boundary': [],
                                   'explore': ['star'],
                                   'negative': ['AB'],
                                   'valid': ['A', 'B']}]}}}
        ]
    }
}

sample_dict2包含dicts列表。在这里"员工"整个层次结构是一个列表元素,也是叶节点" grade"是一个清单
此外,除了"有效"和"边界"其他数据集可以为空 - []我们也需要处理它们 有效组合就像是

{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','B']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','B']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','B']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','B']}}]}}

在员工索引0中加上age = 30和name = tom的组合

2 个答案:

答案 0 :(得分:4)

import itertools

def generate_combinations(thing, positive="valid", negative=None):

    """ Generate all possible combinations, walking and mimicking structure of "thing" """

    if isinstance(thing, dict):  # if dictionary, distinguish between two types of dictionary
        if positive in thing:
            return thing[positive] if negative is None else [thing[positive][0]] + thing[negative]
        else:
            results = []
            for key, value in thing.items():  # generate all possible key: value combinations
                subresults = []
                for result in generate_combinations(value, positive, negative):
                    subresults.append((key, result))
                results.append(subresults)
            return [dict(result) for result in itertools.product(*results)]

    elif isinstance(thing, list) or isinstance(thing, tuple):  # added tuple just to be safe
        results = []
        for element in thing:  # generate recursive result sets for each element of list
            for result in generate_combinations(element, positive, negative):
                results.append(result)
        return results

    else:  # not a type we know how to handle
        raise TypeError("Unexpected type")


def generate_invalid_combinations(thing):

    """ Generate all possible combinations and weed out the valid ones """

    valid = generate_combinations(thing)

    return [result for result in generate_combinations(thing, negative='negative') if result not in valid]


def generate_boundary_combinations(thing):

    """ Generate all possible boundary combinations """

    return generate_combinations(thing, positive="boundary")


def generate_explore_combinations(thing):

    """ Generate all possible explore combinations and weed out the valid ones """

    valid = generate_combinations(thing)

    return [result for result in generate_combinations(thing, negative='explore') if result not in valid]

致电generate_combinations(sample_dict)返回:

[
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'B'}}}},
{'company': {'employee': {'age': 20, 'name': 'tom', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': 'tom', 'others': {'grade': 'B'}}}},
{'company': {'employee': {'age': 30, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 30, 'name': 'john', 'others': {'grade': 'B'}}}},
{'company': {'employee': {'age': 30, 'name': 'tom', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 30, 'name': 'tom', 'others': {'grade': 'B'}}}}
]

致电generate_invalid_combinations(sample_dict)返回:

[
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': 20, 'name': '$', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': '$', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': -1, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': -1, 'name': 'john', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': -1, 'name': '$', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': -1, 'name': '$', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': 100, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 100, 'name': 'john', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': 100, 'name': '$', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 100, 'name': '$', 'others': {'grade': 'AB'}}}}
]

调用generate_boundary_combinations(sample_dict)返回:

[
{'company': {'employee': {'age': 1, 'name': 'aaaaaaaaaa', 'others': {'grade': 'C'}}}},
{'company': {'employee': {'age': 99, 'name': 'aaaaaaaaaa', 'others': {'grade': 'C'}}}}
]

调用generate_explore_combinations(sample_dict)返回:

[
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'star'}}}},
{'company': {'employee': {'age': 20, 'name': 'noname', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': 'noname', 'others': {'grade': 'star'}}}},
{'company': {'employee': {'age': 200, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 200, 'name': 'john', 'others': {'grade': 'star'}}}},
{'company': {'employee': {'age': 200, 'name': 'noname', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 200, 'name': 'noname', 'others': {'grade': 'star'}}}}
]

修订后的解决方案(以匹配修订后的问题)

import itertools
import random

def generate_combinations(thing, positive="valid", negative=None):

    """ Generate all possible combinations, walking and mimicking structure of "thing" """

    if isinstance(thing, dict):  # if dictionary, distinguish between two types of dictionary
        if positive in thing:
            if negative is None:
                return thing[positive]  # here it's OK if it's empty
            elif thing[positive]:  # here it's not OK if it's empty
                return [random.choice(thing[positive])] + thing[negative]
            else:
                return []
        else:
            results = []
            for key, value in thing.items():  # generate all possible key: value combinations
                results.append([(key, result) for result in generate_combinations(value, positive, negative)])
            return [dict(result) for result in itertools.product(*results)]

    elif isinstance(thing, (list, tuple)):  # added tuple just to be safe (thanks Padraic!)
        # generate recursive result sets for each element of list
        results = [generate_combinations(element, positive, negative) for element in thing]
        return [list(result) for result in itertools.product(*results)]

    else:  # not a type we know how to handle
        raise TypeError("Unexpected type")


def generate_boundary_combinations(thing):

    """ Generate all possible boundary combinations """

    valid = generate_combinations(thing)

    return [result for result in generate_combinations(thing, negative='boundary') if result not in valid]

generate_invalid_combinations()generate_explore_combinations()与以前相同。微妙的差异:

它不是在负面评估中从有效数组中抓取第一个项目,而是从有效数组中抓取一个随机项目。

'age': [30]这样的项目的值会以列表的形式返回,如下所示:

'age': [{'boundary': [1, 99],
    'explore': [200],
    'negative': [-1, 100],
    'valid': [20, 30]}],

如果您希望'age': 30像早期的输出示例一样,那么相应地修改定义:

'age': {'boundary': [1, 99],
    'explore': [200],
    'negative': [-1, 100],
    'valid': [20, 30]},

现在,边界属性被视为“负”值之一。

仅供参考,我不打算这次生成所有输出:调用generate_combinations(sample_dict2)会返回如下结果:

[
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'B']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'B']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['B', 'A']}, 'age': [20]}}]}},
...
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['A', 'B']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'A']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'A']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}]}}
]

答案 1 :(得分:0)

这是一个开放式的大黄蜂问题的巢穴。

  1. 查看Agitar的Agitar其他工具的白皮书,看看你是否正在思考这个问题。

  2. 看看Knuth关于combinationals的工作。这是一个艰难的阅读。

  3. 考虑编写一个使用'yield'的递归下降生成器。