Creating a nested dictionary from a flattened dictionary

Time: 2018-05-30 14:22:22

Tags: python dictionary recursion nested netcdf

I have a flat dictionary that I want to turn into a nested dictionary:

flat = {'X_a_one': 10,
        'X_a_two': 20, 
        'X_b_one': 10,
        'X_b_two': 20, 
        'Y_a_one': 10,
        'Y_a_two': 20,
        'Y_b_one': 10,
        'Y_b_two': 20}

I want to convert it into the form
nested = {'X': {'a': {'one': 10,
                      'two': 20}, 
                'b': {'one': 10,
                      'two': 20}}, 
          'Y': {'a': {'one': 10,
                      'two': 20},
                'b': {'one': 10,
                      'two': 20}}}

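The structure of the flat dictionary is such that there should not be any problems with ambiguity. I want it to work for dictionaries of arbitrary depth, but performance is not really an issue. I have seen lots of methods for flattening a nested dictionary, but basically none for nesting a flattened dictionary. The values stored in the dictionary are either scalars or strings, never iterables.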

So far I have something that can take the input

test_dict = {'X_a_one': '10',
             'X_b_one': '10',
             'X_c_one': '10'}

to the output

test_out = {'X': {'a_one': '10', 
                  'b_one': '10', 
                  'c_one': '10'}}

using the code

def nest_once(inp_dict):
    out = {}
    if isinstance(inp_dict, dict):
        for key, val in inp_dict.items():
            if '_' in key:
                head, tail = key.split('_', 1)

                if head not in out.keys():
                    out[head] = {tail: val}
                else:
                    out[head].update({tail: val})
            else:
                out[key] = val
    return out

test_out = nest_once(test_dict)

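But I am having trouble working out how to turn this into something that recursively creates all levels of the dictionary.

Any help would be appreciated!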

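(As for why I want to do this: I have a file whose structure is equivalent to a nested dict, and I want to store this file's contents in the attributes dictionary of a NetCDF file and retrieve it later. However, NetCDF only lets you store flat dictionaries as attributes, so I want to unflatten the dictionary I previously stored in the NetCDF file.)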
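
For reference, the storing half of that round trip (nested to flat) could look roughly like the sketch below. The name `flatten_dict` and its `sep` and `prefix` parameters are made up for illustration, and the sketch assumes that no key contains the separator itself and that every non-dict value is a leaf.

```python
def flatten_dict(nested, sep='_', prefix=''):
    """Join the keys along each path of a nested dict into one flat key."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # descend, carrying the path built so far as the prefix
            flat.update(flatten_dict(value, sep, full_key))
        else:
            flat[full_key] = value
    return flat


nested = {'X': {'a': {'one': 10, 'two': 20}}, 'Y': {'b': {'one': 10}}}
print(flatten_dict(nested))
# {'X_a_one': 10, 'X_a_two': 20, 'Y_b_one': 10}
```

Note that an empty sub-dictionary produces no flat key at all, so it would not survive the round trip.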

7 answers:

Answer 0 (score: 24)

output = {}

for k, v in source.items():
    # always start at the root.
    current = output

    # This is the part you're struggling with.
    pieces = k.split('_')

    # iterate from the beginning until the second to last place
    for piece in pieces[:-1]:
        if piece not in current:
            # if a dict doesn't exist at an index, then create one
            current[piece] = {}

        # as you walk into the structure, update your current location
        current = current[piece]

    # The reason you're using the second to last is because the last place
    # represents the place you're actually storing the item
    current[pieces[-1]] = v
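
The loop above reads from a dictionary called `source` that is never defined. A minimal sketch of the same logic wrapped in a function follows; the name `nest_iterative` and the `sep` parameter are assumptions added for illustration, not part of the answer.

```python
def nest_iterative(source, sep='_'):
    """Build a nested dict from a flat one by walking each key's pieces."""
    output = {}
    for k, v in source.items():
        current = output              # always start at the root
        pieces = k.split(sep)
        for piece in pieces[:-1]:     # every piece except the last is a branch
            if piece not in current:
                current[piece] = {}
            current = current[piece]
        current[pieces[-1]] = v       # the last piece names the leaf
    return output


flat = {'X_a_one': 10, 'X_a_two': 20, 'Y_b_one': 10}
print(nest_iterative(flat))
# {'X': {'a': {'one': 10, 'two': 20}}, 'Y': {'b': {'one': 10}}}
```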

Answer 1 (score: 24)

Here is my take:

def nest_dict(flat):
    result = {}
    for k, v in flat.items():
        _nest_dict_rec(k, v, result)
    return result

def _nest_dict_rec(k, v, out):
    k, *rest = k.split('_', 1)
    if rest:
        _nest_dict_rec(rest[0], v, out.setdefault(k, {}))
    else:
        out[k] = v

flat = {'X_a_one': 10,
        'X_a_two': 20, 
        'X_b_one': 10,
        'X_b_two': 20, 
        'Y_a_one': 10,
        'Y_a_two': 20,
        'Y_b_one': 10,
        'Y_b_two': 20}
nested = {'X': {'a': {'one': 10,
                      'two': 20}, 
                'b': {'one': 10,
                      'two': 20}}, 
          'Y': {'a': {'one': 10,
                      'two': 20},
                'b': {'one': 10,
                      'two': 20}}}
print(nest_dict(flat) == nested)
# True
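
The question states that the keys are unambiguous. If that cannot be guaranteed, a key such as 'X' alongside 'X_a' would either raise a TypeError or silently overwrite a branch, depending on insertion order. The following is a sketch of a stricter variant that reports such conflicts instead; `nest_dict_strict` is a hypothetical name, and the check relies on the question's statement that values are never dicts.

```python
def nest_dict_strict(flat, sep='_'):
    """Like nest_dict, but raise ValueError when two keys collide."""
    result = {}
    for key, value in flat.items():
        *parents, leaf = key.split(sep)
        node = result
        for part in parents:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                # a shorter key already stored a plain value here
                raise ValueError(f"{key!r} conflicts with a shorter key")
        if leaf in node:
            # flat keys are unique, so this slot can only hold a branch
            raise ValueError(f"{key!r} conflicts with an existing branch")
        node[leaf] = value
    return result


print(nest_dict_strict({'X_a': 1, 'X_b': 2}))
# {'X': {'a': 1, 'b': 2}}
```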

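Answer 2 (score: 13)

Here is one way using collections.defaultdict, borrowing heavily from this previous answer. There are 3 steps:

  1. Create a nested defaultdict of defaultdict objects.
  2. Iterate over the items in the flat input dictionary.
  3. Build the defaultdict result according to the structure derived from splitting the keys by _, using getFromDict to walk the result dictionary.

Here is a complete example: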

    from collections import defaultdict
    from functools import reduce
    from operator import getitem
    
    def getFromDict(dataDict, mapList):
        """Iterate nested dictionary"""
        return reduce(getitem, mapList, dataDict)
    
    # instantiate nested defaultdict of defaultdicts
    tree = lambda: defaultdict(tree)
    d = tree()
    
    # iterate input dictionary
    for k, v in flat.items():
        *keys, final_key = k.split('_')
        getFromDict(d, keys)[final_key] = v
    
    {'X': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}},
     'Y': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}}}
    

As a final step, you can convert the defaultdict to a regular dict, though usually this step is not necessary.

    def default_to_regular_dict(d):
        """Convert nested defaultdict to regular dict of dicts."""
        if isinstance(d, defaultdict):
            d = {k: default_to_regular_dict(v) for k, v in d.items()}
        return d
    
    # convert back to regular dict
    res = default_to_regular_dict(d)
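
To make the mechanism more concrete, here is a small sketch of what the self-referential `tree` factory does when `reduce(getitem, ...)` walks a path that does not exist yet. The variable name `parent` and the sample keys are only for illustration.

```python
from collections import defaultdict
from functools import reduce
from operator import getitem

tree = lambda: defaultdict(tree)
d = tree()

# Each lookup of a missing key calls tree() and stores the new child,
# so walking ['X', 'a'] creates both levels as a side effect.
parent = reduce(getitem, ['X', 'a'], d)
parent['one'] = 10
print(d['X']['a']['one'])   # 10

# A membership test does not create anything, but a plain read does.
print('Z' in d)             # False
_ = d['Z']
print('Z' in d)             # True
```

That last behaviour is one situation where converting back to a regular dict may be worthwhile: after conversion, reading a missing key raises KeyError instead of quietly adding an empty branch.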
    

Answer 3 (score: 4)

The other answers are cleaner, but since you mentioned recursion, we do have other options.

def nest(d):
    _ = {}
    for k in d:
        i = k.find('_')
        if i == -1:
            _[k] = d[k]
            continue
        s, t = k[:i], k[i+1:]
        if s in _:
            _[s][t] = d[k]
        else:
            _[s] = {t:d[k]}
    return {k:(nest(_[k]) if type(_[k])==type(d) else _[k]) for k in _}

Answer 4 (score: 4)

You can use itertools.groupby:

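import itertools, json

flat = {'Y_a_two': 20, 'Y_a_one': 10, 'X_b_two': 20, 'X_b_one': 10, 'X_a_one': 10, 'X_a_two': 20, 'Y_b_two': 20, 'Y_b_one': 10}
# turn each item such as 'X_a_one': 10 into a row ['X', 'a', 'one', 10]
_flat = [[*a.split('_'), b] for a, b in flat.items()]

def create_dict(d):
    # group the rows by their first element (groupby needs sorted input)
    _d = {a: list(b) for a, b in itertools.groupby(sorted(d, key=lambda x: x[0]), key=lambda x: x[0])}
    # recurse unless the group is a single [key, value] row, which is a leaf;
    # checking len(b) alone would cut off branches that have only one child
    return {a: create_dict([i[1:] for i in b]) if len(b) > 1 or len(b[0]) > 2 else b[0][-1] for a, b in _d.items()}

print(json.dumps(create_dict(_flat), indent=3))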

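Answer 5 (score: 4)

Another non-recursive solution with no imports. The logic is split between inserting each key-value pair of the flat dict and mapping over the key-value pairs of the flat dict.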

def insert(dct, lst):
    """
    dct: a dict to be modified inplace.
    lst: list of elements representing a hierarchy of keys
    followed by a value.

    dct = {}
    lst = [1, 2, 3]

    resulting value of dct: {1: {2: 3}}
    """
    for x in lst[:-2]:
        dct[x] = dct = dct.get(x, dict())

    dct.update({lst[-2]: lst[-1]})


def unflat(dct):
    # empty dict to store the result
    result = dict()

    # create an iterator of lists representing hierarchical indices followed by the value
    lsts = ([*k.split("_"), v] for k, v in dct.items())

    # insert each list into the result
    for lst in lsts:
        insert(result, lst)

    return result


result = unflat(flat)
# {'X': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}},
# 'Y': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}}}
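
The chained assignment `dct[x] = dct = dct.get(x, dict())` in `insert` is compact but easy to misread: Python evaluates the right-hand side once and then assigns to the targets from left to right, so the child is attached to the current dict before `dct` is rebound to it. A sketch of an equivalent, more explicit version follows; `insert_explicit` is a hypothetical name, and like the original it assumes the list has at least one key followed by a value.

```python
def insert_explicit(dct, lst):
    """Same effect as insert(), with the descent spelled out."""
    *path, last_key, value = lst
    for key in path:
        # reuse the child dict if it exists, otherwise create and attach it,
        # then step down into it
        dct = dct.setdefault(key, {})
    dct[last_key] = value


result = {}
insert_explicit(result, ['X', 'a', 'one', 10])
insert_explicit(result, ['X', 'a', 'two', 20])
print(result)
# {'X': {'a': {'one': 10, 'two': 20}}}
```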

Answer 6 (score: 1)

Here is a reasonably readable recursive solution:

def unflatten_dict(a, result=None, sep='_'):

    if result is None:
        result = dict()

    for k, v in a.items():
        k, *rest = k.split(sep, 1)
        if rest:
            unflatten_dict({rest[0]: v}, result.setdefault(k, {}), sep=sep)
        else:
            result[k] = v

    return result


flat = {'X_a_one': 10,
        'X_a_two': 20,
        'X_b_one': 10,
        'X_b_two': 20,
        'Y_a_one': 10,
        'Y_a_two': 20,
        'Y_b_one': 10,
        'Y_b_two': 20}

print(unflatten_dict(flat))
{'X': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}}, 
 'Y': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}}}

This is based on a couple of the answers above, uses no imports, and has only been tested in Python 3.