How to elegantly check whether a dictionary has a given structure?

Asked: 2017-05-06 13:46:19

Tags: python dictionary

I have a dictionary with the following structure:

D = {
   'rows': 11,
   'cols': 13,
   (i, j): {
              'meta': 'random string',
              'walls': {
                  'E': True,
                  'O': False,
                  'N': True,
                  'S': True
              }
           }
}
# i ranging in {0 .. D['rows']-1}
# j ranging in {0 .. D['cols']-1}

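I have been asked to write a function that takes an arbitrary object as its argument and checks whether it has this structure. This is what I wrote: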

def well_formed(L):
    if type(L) != dict:
        return False
    if 'rows' not in L:
        return False
    if 'cols' not in L:
        return False

    nr, nc = L['rows'], L['cols']

    # I should also check the int-ness of nr and nc ...

    if len(L) != nr*nc + 2:
        return False

    for i in range(nr):
        for j in range(nc):
            if not ((i, j) in L
                and 'meta' in L[i, j]
                and  'walls' in L[i, j]
                and type(L[i, j]['meta']) == str
                and type(L[i, j]['walls'])  == dict
                and     'E' in L[i, j]['walls']
                and     'N' in L[i, j]['walls']
                and     'O' in L[i, j]['walls']
                and     'S' in L[i, j]['walls']
                and type(L[i, j]['walls']['E']) == bool
                and type(L[i, j]['walls']['N']) == bool
                and type(L[i, j]['walls']['O']) == bool
                and type(L[i, j]['walls']['S']) == bool):
                return False

    return True

It works, but I don't like it at all. Is there a Pythonic way to do this?

I am only allowed to use the standard library.

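6 answers:

Answer 0 (score: 16)

First of all, I think the more 'Pythonic' thing might be to ask forgiveness rather than permission: check whether the data structure has a property at the moment you actually need that property.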
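As a rough illustration of that "ask forgiveness" style, here is a minimal sketch. The function `count_east_walls` is a hypothetical consumer of the maze dictionary, not something from the question; it simply uses the data and lets the caller handle the errors a malformed structure would raise.

```python
def count_east_walls(maze):
    """Count the cells that have an east wall, trusting the structure.

    Hypothetical consumer used only for illustration.
    """
    total = 0
    for i in range(maze['rows']):
        for j in range(maze['cols']):
            if maze[i, j]['walls']['E']:
                total += 1
    return total


# A tiny 1 x 2 maze with the layout described in the question.
maze = {'rows': 1, 'cols': 2}
for j in range(2):
    maze[0, j] = {
        'meta': '',
        'walls': {'E': j == 0, 'O': False, 'N': True, 'S': True},
    }

try:
    print(count_east_walls(maze))
    del maze[0, 1]                    # make the maze malformed
    print(count_east_walls(maze))     # the missing cell raises KeyError
except (KeyError, TypeError) as exc:
    print('malformed maze:', repr(exc))
```

No up-front validation happens here; the structure is only "checked" by being used.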

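But on the other hand, that doesn't help if you have been asked to create something that checks whether the structure is well-formed. :)

So if you do need to check, you can use something like the schema library to define what your data structure should look like, and then check other data structures against that schema.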

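Answer 1 (score: 7)

In Python, the exact identity of the types involved is less important than how the values behave. For a defined use of such an object, would the object suffice? This means L doesn't have to be a dict, it just has to support __getitem__; L[(i,j)]['meta'] doesn't have to be a str, it just has to support conversion to a string via str(L[(i,j)]['meta']); and so on.

With that relaxation, I would simply try such operations, catch any errors they cause, and return False if one occurs. For example,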

def well_formed(L):
    try:
        nr = L['rows']
        nc = L['cols']
    except (KeyError, TypeError):  # TypeError: L does not support L['rows']
        return False

    try:
        for i in range(nr):
            for j in range(nc):
                str(L[(i,j)]['meta'])
                walls = L[(i,j)]['walls']
                for direction in walls:
                    # Necessary?
                    if direction not in "ENOS":
                        return False
                    if walls[direction] not in (True, False):
                        return False
    except (KeyError, TypeError):  # TypeError: non-int sizes, non-subscriptable values
        return False

    return True

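Given that any object has a boolean value, it seemed pointless to attempt bool(walls[direction]); rather, if having exactly True or False as a value isn't a hard requirement, I would just omit any testing of the value altogether. Likewise, additional walls may or may not be a problem, and don't have to be tested for explicitly.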
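One detail worth knowing about the `walls[direction] not in (True, False)` test: tuple membership compares with `==`, so values such as `1` and `0.0` pass as well. The sketch below contrasts that loose check with a strict `isinstance` check; both helper names are made up for this illustration.

```python
def is_true_or_false_loose(value):
    # Membership uses ==, so 1, 0, 1.0 and 0.0 are accepted too.
    return value in (True, False)


def is_true_or_false_strict(value):
    # Only genuine bool instances are accepted.
    return isinstance(value, bool)


for candidate in (True, 0, 1.0, 'yes', None):
    print(repr(candidate),
          is_true_or_false_loose(candidate),
          is_true_or_false_strict(candidate))
```

Which of the two is appropriate depends on whether exact `True`/`False` values are a hard requirement.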

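Answer 2 (score: 4)

You could compose the validation like this (an idea borrowed from Scala extractors). The advantage is that the validator's structure resembles the structure being tested.

A disadvantage is that the many function calls may make it much slower.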

class Mapping:
    def __init__(self, **kwargs):
        self.key_values = [KeyValue(k, v) for k, v in kwargs.items()]

    def validate(self, to_validate):
        if not isinstance(to_validate, dict):
            return False

        for validator in self.key_values:
            if not validator.validate(to_validate):
                return False
        return True


class KeyValue:
    def __init__(self, key, value):
        self.key = key
        self.value = value

    def validate(self, to_validate):
        return self.key in to_validate and self.value.validate(to_validate[self.key])


class Boolean:
    def validate(self, to_validate):
        return isinstance(to_validate, bool)


class Integer:
    def validate(self, to_validate):
        return isinstance(to_validate, int)


class String:
    def validate(self, to_validate):
        return isinstance(to_validate, str)


class CustomValidator:
    def validate(self, to_validate):
        if not Mapping(rows=Integer(), cols=Integer()).validate(to_validate):
            return False
        element_validator = Mapping(meta=String(), walls=Mapping(**{k: Boolean() for k in "EONS"}))
        for i in range(to_validate['rows']):
            for j in range(to_validate['cols']):
                if not KeyValue((i, j), element_validator).validate(to_validate):
                    return False
        return True


d = {
    'rows': 11,
    'cols': 13,
}
d.update({(i, j): {
    'meta': 'random string',
    'walls': {
        'E': True,
        'O': False,
        'N': True,
        'S': True
    }
} for i in range(11) for j in range(13)})

assert CustomValidator().validate(d)

Overriding isinstance (tested with Python 3.5)

The same idea, expressed through isinstance:
class IsInstanceCustomMeta(type):
    def __instancecheck__(self, instance):
        return self.validate(instance)

def create_custom_isinstance_class(f):
    class IsInstanceCustomClass(metaclass=IsInstanceCustomMeta):
        validate = f
    return IsInstanceCustomClass

def Mapping(**kwargs):
    key_values = [KeyValue(k, v) for k, v in kwargs.items()]

    def validate(to_validate):
        if not isinstance(to_validate, dict):
            return False

        for validator in key_values:
            if not isinstance(to_validate, validator):
                return False
        return True

    return create_custom_isinstance_class(validate)

def KeyValue(key, value):
    return create_custom_isinstance_class(lambda to_validate: key in to_validate and isinstance(to_validate[key], value))

def my_format_validation(to_validate):
    if not isinstance(to_validate, Mapping(rows=int, cols=int)):
        return False
    element_validator = Mapping(meta=str, walls=Mapping(**{k: bool for k in "EONS"}))
    for i in range(to_validate['rows']):
        for j in range(to_validate['cols']):
            if not isinstance(to_validate, KeyValue((i, j), element_validator)):
                return False
    return True

MyFormat = create_custom_isinstance_class(my_format_validation)

d = {
    'rows': 11,
    'cols': 13,
}
d.update({(i, j): {
    'meta': 'random string',
    'walls': {
        'E': True,
        'O': False,
        'N': True,
        'S': True
    }
} for i in range(11) for j in range(13)})

assert isinstance(d, MyFormat)

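Answer 3 (score: 3)

If your format were simpler, I would agree with the other answers/comments and use an existing schema validation library, such as schema or voluptuous. But given your specific case, where you have to check a dictionary with tuple keys, and those tuples' values depend on the values of other members of your dict, I think you would be better off writing your own validator than trying to coax a schema into fitting your format.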
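As one possible shape for such a hand-written validator, here is a standard-library sketch that reports what is wrong instead of only returning True or False. The names `find_problems` and `well_formed` and the message wording are assumptions made for this example; extra keys are not rejected.

```python
def find_problems(obj):
    """Yield one message per deviation from the maze layout in the question."""
    if not isinstance(obj, dict):
        yield 'top level is not a dict'
        return
    sizes = {}
    for name in ('rows', 'cols'):
        value = obj.get(name)
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(value, int) and not isinstance(value, bool):
            sizes[name] = value
        else:
            yield '{!r} must be an int'.format(name)
    if len(sizes) < 2:
        return  # the expected cell keys depend on both sizes
    for i in range(sizes['rows']):
        for j in range(sizes['cols']):
            cell = obj.get((i, j))
            if not isinstance(cell, dict):
                yield 'cell {}: missing or not a dict'.format((i, j))
                continue
            if not isinstance(cell.get('meta'), str):
                yield 'cell {}: meta must be a str'.format((i, j))
            walls = cell.get('walls')
            if not isinstance(walls, dict):
                yield 'cell {}: walls must be a dict'.format((i, j))
                continue
            for side in 'ENOS':
                if not isinstance(walls.get(side), bool):
                    yield 'cell {}: wall {} must be a bool'.format((i, j), side)


def well_formed(obj):
    # Well-formed means the generator produces no message at all.
    return next(find_problems(obj), None) is None
```

Because `find_problems` is a generator, `well_formed` stops at the first problem, while `list(find_problems(obj))` collects all of them for diagnostics.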

Answer 4 (score: 1)

from itertools import product

def isvalid(d):
    try:
        for key in product(range(d['rows']), range(d['cols'])):
            sub = d[key]
            assert (isinstance(sub['meta'], str) and
                    all(isinstance(sub['walls'][c], bool)
                        for c in 'EONS'))
    except (KeyError, TypeError, AssertionError):
        return False
    return True

Let me know if Python 2 compatibility is important, or if it must also be asserted that there are no additional keys.
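If additional keys do have to be ruled out, one way to extend the function above is to compare key sets. This is only a sketch: the name `isvalid_strict` is made up, and rejecting extra keys inside each cell and inside 'walls' is an assumption about what "no additional keys" should cover.

```python
from itertools import product


def isvalid_strict(d):
    try:
        cells = set(product(range(d['rows']), range(d['cols'])))
        # The top level must hold exactly 'rows', 'cols' and one key per cell.
        if set(d) != cells | {'rows', 'cols'}:
            return False
        for key in cells:
            sub = d[key]
            if set(sub) != {'meta', 'walls'} or set(sub['walls']) != set('EONS'):
                return False
            if not isinstance(sub['meta'], str):
                return False
            if not all(isinstance(v, bool) for v in sub['walls'].values()):
                return False
    except (KeyError, TypeError, AttributeError):
        # AttributeError covers a 'walls' value that has no .values() method.
        return False
    return True
```

The explicit `return False` statements replace the `assert` of the original, so the check still works when Python runs with assertions disabled (`-O`).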

Answer 5 (score: 0)

If you use something like this:

def get_deep_keys(d, depth=2):
    """Gets a representation of all dictionary keys to a set depth.

    If a (sub)dictionary contains all non-dictionary values, a set of keys
    will be returned.

    If a dictionary contains a mix of types, a dictionary of dicts/lists/types
    will be returned.
    """
    if isinstance(d, dict):
        if depth > 0 and any(isinstance(v, dict) for v in d.values()):
            return {k: get_deep_keys(v, depth=depth - 1) for k, v in d.items()}
        else:
            return set(d.keys())

    else:
        return type(d)

Then you can do:

assert get_deep_keys(D[i, j]) == {
    'meta': str, 'walls': {'E', 'N', 'O', 'S'}}

inside the loop over i, j. This is easily modified to return the types of the underlying elements:

def get_deep_keys(d, depth=2):
    """Gets a representation of all dictionary keys to a set depth, with types.
    """
    if isinstance(d, dict) and depth > 0:
        return {k: get_deep_keys(v, depth=depth - 1) for k, v in d.items()}

    return type(d)


get_deep_keys(D[i, j])
 # {'meta': str, 'walls': {'E': bool, 'O': bool, 'N': bool, 'S': bool}}
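Putting the type-returning version to work over the whole dictionary could look like the sketch below. `CELL_TEMPLATE` and `cells_match_template` are names invented for this example, and `get_deep_keys` is repeated only so the snippet runs on its own.

```python
def get_deep_keys(d, depth=2):
    # Same logic as the type-returning version above.
    if isinstance(d, dict) and depth > 0:
        return {k: get_deep_keys(v, depth=depth - 1) for k, v in d.items()}
    return type(d)


# Expected shape of every cell: {'meta': str, 'walls': {'E': bool, ...}}
CELL_TEMPLATE = {'meta': str, 'walls': dict.fromkeys('ENOS', bool)}


def cells_match_template(D):
    try:
        return all(get_deep_keys(D[i, j]) == CELL_TEMPLATE
                   for i in range(D['rows'])
                   for j in range(D['cols']))
    except (KeyError, TypeError):
        # Missing 'rows'/'cols'/cell, or an object that cannot be indexed.
        return False
```

Note that the comparison is exact: a cell with an extra key, or a `meta` value whose type is a subclass of `str`, does not match the template.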