通过键列表访问嵌套的字典项?

时间:2013-02-04 18:04:35

标签: python list dictionary

我有一个复杂的字典结构,我想通过一个键列表来访问正确的项目。

dataDict = {
    "a":{
        "r": 1,
        "s": 2,
        "t": 3
        },
    "b":{
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3
        },
        "w": 3
        }
}    

maplist = ["a", "r"]

maplist = ["b", "v", "y"]

我已经制作了以下代码,但我确信如果有人有想法,有更好,更有效的方法。

# Get a given data from a dictionary with position provided as a list
def getFromDict(dataDict, mapList):    
    for k in mapList: dataDict = dataDict[k]
    return dataDict

# Set a given data in a dictionary with position provided as a list
def setInDict(dataDict, mapList, value): 
    for k in mapList[:-1]: dataDict = dataDict[k]
    dataDict[mapList[-1]] = value

19 个答案:

答案 0 :(得分:177)

使用reduce()遍历字典:

from functools import reduce  # forward compatibility for Python 3
import operator

def getFromDict(dataDict, mapList):
    return reduce(operator.getitem, mapList, dataDict)

并重复使用getFromDict来查找存储setInDict()的值的位置:

def setInDict(dataDict, mapList, value):
    getFromDict(dataDict, mapList[:-1])[mapList[-1]] = value

需要除mapList中的最后一个元素以找到要添加值的'父'字典,然后使用最后一个元素将值设置为右键。

演示:

>>> getFromDict(dataDict, ["a", "r"])
1
>>> getFromDict(dataDict, ["b", "v", "y"])
2
>>> setInDict(dataDict, ["b", "v", "w"], 4)
>>> import pprint
>>> pprint.pprint(dataDict)
{'a': {'r': 1, 's': 2, 't': 3},
 'b': {'u': 1, 'v': {'w': 4, 'x': 1, 'y': 2, 'z': 3}, 'w': 3}}

请注意Python PEP8样式指南prescribes snake_case names for functions。上述内容同样适用于列表或字典和列表的混合,因此名称应该是get_by_path()set_by_path()

from functools import reduce  # forward compatibility for Python 3
import operator

def get_by_path(root, items):
    """Access a nested object in root by item sequence."""
    return reduce(operator.getitem, items, root)

def set_by_path(root, items, value):
    """Set a value in a nested object in root by item sequence."""
    get_by_path(root, items[:-1])[items[-1]] = value

答案 1 :(得分:30)

  1. 接受的解决方案不能直接用于python3 - 它需要from functools import reduce
  2. 使用for循环似乎更加pythonic。 请参阅What’s New In Python 3.0的引用。
      

    已移除reduce()。如果你确实需要,请使用functools.reduce();但是,99%的时间显式for循环更具可读性。

  3. 接下来,接受的解决方案不会设置不存在的嵌套密钥(它返回KeyError) - 请参阅@ eafit的解决方案答案
  4. 那么为什么不使用kolergy的问题中建议的方法来获取值:

    def getFromDict(dataDict, mapList):    
        for k in mapList: dataDict = dataDict[k]
        return dataDict
    

    来自@ eafit的代码设置值的代码:

    def nested_set(dic, keys, value):
        for key in keys[:-1]:
            dic = dic.setdefault(key, {})
        dic[keys[-1]] = value
    

    两者都在python 2和3中直接工作

答案 2 :(得分:13)

使用reduce非常聪明,但如果父键在嵌套字典中不存在,则OP的set方法可能会出现问题。由于这是我在谷歌搜索中看到的第一篇关于这个主题的SO帖子,我想稍微好一点。

Setting a value in a nested python dictionary given a list of indices and value)中的set方法似乎对丢失的父母键更加健壮。复制它:

def nested_set(dic, keys, value):
    for key in keys[:-1]:
        dic = dic.setdefault(key, {})
    dic[keys[-1]] = value

此外,有一个方法可以方便地遍历密钥树并获取我创建的所有绝对密钥路径:

def keysInDict(dataDict, parent=[]):
    if not isinstance(dataDict, dict):
        return [tuple(parent)]
    else:
        return reduce(list.__add__, 
            [keysInDict(v,parent+[k]) for k,v in dataDict.items()], [])

它的一个用途是使用以下代码将嵌套树转换为pandas DataFrame(假设嵌套字典中的所有叶子具有相同的深度)。

def dict_to_df(dataDict):
    ret = []
    for k in keysInDict(dataDict):
        v = np.array( getFromDict(dataDict, k), )
        v = pd.DataFrame(v)
        v.columns = pd.MultiIndex.from_product(list(k) + [v.columns])
        ret.append(v)
    return reduce(pd.DataFrame.join, ret)

答案 3 :(得分:8)

此库可能会有所帮助:https://github.com/akesterson/dpath-python

  

用于访问和搜索字典的python库   / slashed / paths ala xpath

     

基本上它可以让你在字典上覆盖它,好像它是一个字典   文件系统。

答案 4 :(得分:2)

如何使用递归函数?

获取值:

def getFromDict(dataDict, maplist):
    first, rest = maplist[0], maplist[1:]

    if rest: 
        # if `rest` is not empty, run the function recursively
        return getFromDict(dataDict[first], rest)
    else:
        return dataDict[first]

设置值:

def setInDict(dataDict, maplist, value):
    first, rest = maplist[0], maplist[1:]

    if rest:
        try:
            if not isinstance(dataDict[first], dict):
                # if the key is not a dict, then make it a dict
                dataDict[first] = {}
        except KeyError:
            # if key doesn't exist, create one
            dataDict[first] = {}

        setInDict(dataDict[first], rest, value)
    else:
        dataDict[first] = value

答案 5 :(得分:2)

纯Python风格,没有任何导入:

def nested_set(element, value, *keys):
    if type(element) is not dict:
        raise AttributeError('nested_set() expects dict as first argument.')
    if len(keys) < 2:
        raise AttributeError('nested_set() expects at least three arguments, not enough given.')

    _keys = keys[:-1]
    _element = element
    for key in _keys:
        _element = _element[key]
    _element[keys[-1]] = value

example = {"foo": { "bar": { "baz": "ok" } } }
keys = ['foo', 'bar']
nested_set(example, "yay", *keys)
print(example)

输出

{'foo': {'bar': 'yay'}}

答案 6 :(得分:1)

如果您不想在其中一个密钥缺失的情况下(如果您的主代码可以不间断地运行)而不想提出错误,那么另一种方法是:

def get_value(self,your_dict,*keys):
    curr_dict_ = your_dict
    for k in keys:
        v = curr_dict.get(k,None)
        if v is None:
            break
        if isinstance(v,dict):
            curr_dict = v
    return v

在这种情况下,如果没有任何输入键,则返回None,这可用作主代码中的检查以执行替代任务。

答案 7 :(得分:1)

我建议您使用python-benedict通过键路径访问嵌套项目。

使用pip安装它:

pip install python-benedict

然后:

from benedict import benedict

dataDict = benedict({
    "a":{
        "r": 1,
        "s": 2,
        "t": 3,
    },
    "b":{
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3,
        },
        "w": 3,
    },
}) 

print(dataDict['a.r'])
# or
print(dataDict['a', 'r'])

这里是完整的文档: https://github.com/fabiocaccamo/python-benedict

答案 8 :(得分:1)

晚会很晚,但是发帖以防将来对某人有帮助。对于我的用例,以下功能效果最好。可以将任何数据类型从字典中拉出

dict 是包含我们的价值的字典

列表是实现我们价值的“步骤”列表

def getnestedvalue(dict, list):

    length = len(list)
    try:
        for depth, key in enumerate(list):
            if depth == length - 1:
                output = dict[key]
                return output
            dict = dict[key]
    except (KeyError, TypeError):
        return None

    return None

答案 9 :(得分:1)

如何检查然后设置dict元素而不处理所有索引两次?

解决方案:

def nested_yield(nested, keys_list):
    """
    Get current nested data by send(None) method. Allows change it to Value by calling send(Value) next time
    :param nested: list or dict of lists or dicts
    :param keys_list: list of indexes/keys
    """
    if not len(keys_list):  # assign to 1st level list
        if isinstance(nested, list):
            while True:
                nested[:] = yield nested
        else:
            raise IndexError('Only lists can take element without key')


    last_key = keys_list.pop()
    for key in keys_list:
        nested = nested[key]

    while True:
        try:
            nested[last_key] = yield nested[last_key]
        except IndexError as e:
            print('no index {} in {}'.format(last_key, nested))
            yield None

工作流程示例:

ny = nested_yield(nested_dict, nested_address)
data_element = ny.send(None)
if data_element:
    # process element
    ...
else:
    # extend/update nested data
    ny.send(new_data_element)
    ...
ny.close()

测试

>>> cfg= {'Options': [[1,[0]],[2,[4,[8,16]]],[3,[9]]]}
    ny = nested_yield(cfg, ['Options',1,1,1])
    ny.send(None)
[8, 16]
>>> ny.send('Hello!')
'Hello!'
>>> cfg
{'Options': [[1, [0]], [2, [4, 'Hello!']], [3, [9]]]}
>>> ny.close()

答案 10 :(得分:1)

每次想要查找某个值时都不会影响性能,而是将字体展平一次,然后只需查找关键字b:v:y

def flatten(mydict):
  new_dict = {}
  for key,value in mydict.items():
    if type(value) == dict:
      _dict = {':'.join([key, _key]):_value for _key, _value in flatten(value).items()}
      new_dict.update(_dict)
    else:
      new_dict[key]=value
  return new_dict

dataDict = {
"a":{
    "r": 1,
    "s": 2,
    "t": 3
    },
"b":{
    "u": 1,
    "v": {
        "x": 1,
        "y": 2,
        "z": 3
    },
    "w": 3
    }
}    

flat_dict = flatten(dataDict)
print flat_dict
{'b:w': 3, 'b:u': 1, 'b:v:y': 2, 'b:v:x': 1, 'b:v:z': 3, 'a:r': 1, 'a:s': 2, 'a:t': 3}

这样您只需使用flat_dict['b:v:y']查找项目,即可1

而不是在每次查找时遍历字典,您可以通过展平字典并保存输出来加快速度,以便从冷启动查找意味着加载扁平字典并简单地执行键/值查找没有遍历。

答案 11 :(得分:0)

如果您还希望能够使用包含嵌套列表和dicts的任意json,并且很好地处理无效的查找路径,那么这就是我的解决方案:

from functools import reduce


def get_furthest(s, path):
    '''
    Gets the furthest value along a given key path in a subscriptable structure.

    subscriptable, list -> any
    :param s: the subscriptable structure to examine
    :param path: the lookup path to follow
    :return: a tuple of the value at the furthest valid key, and whether the full path is valid
    '''

    def step_key(acc, key):
        s = acc[0]
        if isinstance(s, str):
            return (s, False)
        try:
            return (s[key], acc[1])
        except LookupError:
            return (s, False)

    return reduce(step_key, path, (s, True))


def get_val(s, path):
    val, successful = get_furthest(s, path)
    if successful:
        return val
    else:
        raise LookupError('Invalid lookup path: {}'.format(path))


def set_val(s, path, value):
    get_val(s, path[:-1])[path[-1]] = value

答案 12 :(得分:0)

很高兴看到这些答案,因为它们具有两种用于设置和获取嵌套属性的静态方法。这些解决方案比使用嵌套树https://gist.github.com/hrldcpr/2012250

更好

这是我的实现方式。

用法

要设置嵌套属性调用sattr(my_dict, 1, 2, 3, 5) is equal to my_dict[1][2][3][4]=5

要获取嵌套的属性,请调用gattr(my_dict, 1, 2)

def gattr(d, *attrs):
    """
    This method receives a dict and list of attributes to return the innermost value of the give dict       
    """
    try:
        for at in attrs:
            d = d[at]
        return d
    except(KeyError, TypeError):
        return None


def sattr(d, *attrs):
    """
    Adds "val" to dict in the hierarchy mentioned via *attrs
    For ex:
    sattr(animals, "cat", "leg","fingers", 4) is equivalent to animals["cat"]["leg"]["fingers"]=4
    This method creates necessary objects until it reaches the final depth
    This behaviour is also known as autovivification and plenty of implementation are around
    This implementation addresses the corner case of replacing existing primitives
    https://gist.github.com/hrldcpr/2012250#gistcomment-1779319
    """
    for attr in attrs[:-2]:
        if type(d.get(attr)) is not dict:
            d[attr] = {}
        d = d[attr]
    d[attrs[-2]] = attrs[-1]

答案 13 :(得分:0)

通过递归来解决这个问题:

def get(d,l):
    if len(l)==1: return d[l[0]]
    return get(d[l[0]],l[1:])

使用您的示例:

dataDict = {
    "a":{
        "r": 1,
        "s": 2,
        "t": 3
        },
    "b":{
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3
        },
        "w": 3
        }
}
maplist1 = ["a", "r"]
maplist2 = ["b", "v", "y"]
print(get(dataDict, maplist1)) # 1
print(get(dataDict, maplist2)) # 2

答案 14 :(得分:0)

连接字符串的方法:

def get_sub_object_from_path(dict_name, map_list):
    for i in map_list:
        _string = "['%s']" % i
        dict_name += _string
    value = eval(dict_name)
    return value
#Sample:
_dict = {'new': 'person', 'time': {'for': 'one'}}
map_list = ['time', 'for']
print get_sub_object_from_path("_dict",map_list)
#Output:
#one

答案 15 :(得分:0)

扩展@DomTomCat和其他方法,这些功能(即通过Deepcopy返回修改的数据而不影响输入)的setter和mapper可用于嵌套的dictlist

设置者:

def set_at_path(data0, keys, value):
    data = deepcopy(data0)
    if len(keys)>1:
        if isinstance(data,dict):
            return {k:(set_by_path(v,keys[1:],value) if k==keys[0] else v) for k,v in data.items()}
        if isinstance(data,list):
            return [set_by_path(x[1],keys[1:],value) if x[0]==keys[0] else x[1] for x in enumerate(data)]
    else:
        data[keys[-1]]=value
        return data

映射器:

def map_at_path(data0, keys, f):
    data = deepcopy(data0)
    if len(keys)>1:
        if isinstance(data,dict):
            return {k:(map_at_path(v,keys[1:],f) if k==keys[0] else v) for k,v in data.items()}
        if isinstance(data,list):
            return [map_at_path(x[1],keys[1:],f) if x[0]==keys[0] else x[1] for x in enumerate(data)]
    else:
        data[keys[-1]]=f(data[keys[-1]])
        return data

答案 16 :(得分:0)

您可以使用pydash:

import pydash as _

_.get(dataDict, ["b", "v", "y"], default='Default')

https://pydash.readthedocs.io/en/latest/api.html

答案 17 :(得分:0)

我用这个

def get_dictionary_value(dictionary_temp, variable_dictionary_keys):
     try:
          if(len(variable_dictionary_keys) == 0):
               return str(dictionary_temp)

          variable_dictionary_key = variable_dictionary_keys[0]
          variable_dictionary_keys.remove(variable_dictionary_key)

          return get_dictionary_value(dictionary_temp[variable_dictionary_key] , variable_dictionary_keys)

     except Exception as variable_exception:
          logging.error(variable_exception)
 
          return ''

答案 18 :(得分:-1)

您可以在python中使用eval函数。

def nested_parse(nest, map_list):
    nestq = "nest['" + "']['".join(map_list) + "']"
    return eval(nestq, {'__builtins__':None}, {'nest':nest})

说明

对于您的示例查询:maplist = ["b", "v", "y"]

nestq将是"nest['b']['v']['y']",其中nest是嵌套字典。

eval内置函数执行给定的字符串。但是,重要的是要小心使用eval函数可能引起的漏洞。讨论可以在这里找到:

  1. https://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html
  2. https://www.journaldev.com/22504/python-eval-function

nested_parse()函数中,我确保没有__builtins__全局变量,而只有可用的局部变量是nest字典。