Nesting flat dictionaries into a nested dictionary of lists of dictionaries

Date: 2020-02-29 05:04:49

Tags: python json dictionary

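I can't figure out how to implement this efficiently. I want to take a list of flat dictionaries and nest it into a dictionary of lists, using specific keys given as input. I'm trying hard to learn this.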

Given that my data looks like this:

data= [
  {
    "player": "Kevin Durant",
    "team": "Thunder",
    "location": "Oklahoma City",
    "points": 15

  },
  {
    "player": "Jeremy Lin",
    "team": "Lakers",
    "location": "Los Angeles",
    "points": 22
  },
  {
    "player": "Kobe Bryant",
    "team": "Lakers",
    "location": "Los Angeles",
    "points": 51
  },
  {
    "player": "Blake Griffin",
    "team": "Clippers",
    "location": "Los Angeles",
    "points": 26
  }
]

For example, if I call reorder(data, ['location', 'team', 'player']), I would like it to return something like this:

result={
  "Los Angeles": {
    "Clippers": {
      "Blake Griffin": [
        {
          "points": 26
        }
      ]
    },
    "Lakers": {
      "Kobe Bryant": [
        {
          "points": 51
        }
      ],
      "Jeremy Lin": [
        {
          "points": 22
        }
      ]
    }
  },
  "Oklahoma City": {
    "Thunder": {
      "Kevin Durant": [
        {
          "points": 15
        }
      ]
    }
  }
}

1 answer:

Answer 0 (score: 3)

You can use the dictionary's setdefault method to build the nesting levels automatically as you iterate over the data:

data= [
  {
    "player": "Kevin Durant",
    "team": "Thunder",
    "location": "Oklahoma City",
    "points": 15

  },
  {
    "player": "Jeremy Lin",
    "team": "Lakers",
    "location": "Los Angeles",
    "points": 22
  },
  {
    "player": "Kobe Bryant",
    "team": "Lakers",
    "location": "Los Angeles",
    "points": 51
  },
  {
    "player": "Blake Griffin",
    "team": "Clippers",
    "location": "Los Angeles",
    "points": 26
  }
]

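nested = dict()
for d in data:
    # each setdefault returns the container already stored under the key,
    # or stores the given default and returns it
    nested.setdefault(d["location"],dict()) \
          .setdefault(d["team"],    dict()) \
          .setdefault(d["player"],  list()) \
          .append({"points":d["points"]})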

Output:

print(nested)

{  'Oklahoma City': 
    {  
       'Thunder': 
           {  'Kevin Durant': [{'points': 15}] }
    }, 
    'Los Angeles': 
    { 
       'Lakers': 
           {  
              'Jeremy Lin': [{'points': 22}], 
              'Kobe Bryant': [{'points': 51}]
           }, 
       'Clippers': 
           {  'Blake Griffin': [{'points': 26}] }
     }
  }
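
The chained calls work because setdefault returns the value stored under the key, inserting the default first when the key is missing. A minimal sketch of that behaviour on its own (the `teams` dictionary here is only an illustration, not part of the original data):

```python
# Minimal illustration of dict.setdefault, the method the loop above chains.
teams = {}

# Key missing: the default is stored under the key and returned.
lakers = teams.setdefault("Lakers", [])
lakers.append("Jeremy Lin")

# Key present: the existing list is returned and the new default is ignored.
same_list = teams.setdefault("Lakers", [])
same_list.append("Kobe Bryant")

print(teams)                # {'Lakers': ['Jeremy Lin', 'Kobe Bryant']}
print(lakers is same_list)  # True
```

Since the returned object is the one held by the dictionary, appending to it modifies the nested structure in place.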

[EDIT] Generalizing the approach

If you often need to do this on different kinds of dictionaries or hierarchies, you can generalize it into a function:

def dictNesting(data,*levels):
    result = dict()
    for d in data:
        r = result
        for level in levels[:-1]:
            r = r.setdefault(d[level],dict())
        r = r.setdefault(d[levels[-1]],list())
        r.append({k:v for k,v in d.items() if k not in levels})
    return result

You then pass the function the list of dictionaries, followed by the names of the keys to nest by:

byLocation = dictNesting(data,"location","team")

{  'Oklahoma City':
       {  'Thunder': [
              {'player': 'Kevin Durant', 'points': 15}]
       },
   'Los Angeles':
       {'Lakers': [
              {'player': 'Jeremy Lin', 'points': 22},
              {'player': 'Kobe Bryant', 'points': 51}],
        'Clippers': [
              {'player': 'Blake Griffin', 'points': 26}]
       }
}
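
The outputs in this answer are laid out by hand. To get a comparable layout from Python, with double-quoted keys like the result in the question, one option is json.dumps with an indent. This is only a sketch; the `byLocation` literal below copies a single branch of the result above so the snippet runs on its own:

```python
import json

# Literal copy of one branch of the byLocation result shown above.
byLocation = {
    "Oklahoma City": {
        "Thunder": [{"player": "Kevin Durant", "points": 15}]
    }
}

# indent=2 puts each key on its own line, close to the layout used in the question.
text = json.dumps(byLocation, indent=2)
print(text)
```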

If you want to group the same data differently, you only need to change the order of the field names:

byPlayer = dictNesting(data,"player","location","team")


{  'Kevin Durant':
       {  'Oklahoma City':
              {  'Thunder': [{'points': 15}] }
       },
   'Jeremy Lin':
       {  'Los Angeles':
              {'Lakers': [{'points': 22}]}
       },
   'Kobe Bryant':
       {  'Los Angeles':
              {'Lakers': [{'points': 51}]}
       },
   'Blake Griffin':
       {  'Los Angeles':
              {'Clippers': [{'points': 26}]}
       }
}
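
The question calls the function as reorder(data, [...]) with the keys in a list. A thin wrapper can bridge the two signatures. The sketch below repeats dictNesting so it runs on its own; the `reorder` wrapper and the shortened `sample` list are assumptions made for illustration:

```python
def dictNesting(data, *levels):
    # Same logic as the generalized function above.
    result = dict()
    for d in data:
        r = result
        for level in levels[:-1]:
            r = r.setdefault(d[level], dict())
        r = r.setdefault(d[levels[-1]], list())
        r.append({k: v for k, v in d.items() if k not in levels})
    return result


def reorder(data, keys):
    # Hypothetical wrapper: takes the nesting keys as a list, as in the question.
    return dictNesting(data, *keys)


sample = [
    {"player": "Jeremy Lin", "team": "Lakers", "location": "Los Angeles", "points": 22},
    {"player": "Kobe Bryant", "team": "Lakers", "location": "Los Angeles", "points": 51},
]

print(reorder(sample, ["location", "team", "player"]))
# {'Los Angeles': {'Lakers': {'Jeremy Lin': [{'points': 22}], 'Kobe Bryant': [{'points': 51}]}}}
```

Note that the function indexes levels[-1], so at least one key has to be supplied.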

From there you can build on the function and enhance it to aggregate the data at the lowest nesting level:

def dictNesting(data,*levels,aggregate=False):
    result = dict()
    for d in data:
        r = result
        for level in levels[:-1]:
            r = r.setdefault(d[level],dict())
        # leaf container: a dict of lists when aggregating, otherwise a list of dicts
        r = r.setdefault(d[levels[-1]],dict() if aggregate else list())
        # remaining fields, i.e. those not used as nesting levels
        content = ( (k,v) for k,v in d.items() if k not in levels)
        if aggregate:
            for k,v in content:
                r.setdefault(k,list()).append(v)
        else:
            r.append(dict(content))
    return result

Output:

byCity = dictNesting(data,"location","team",aggregate=True)

{  'Oklahoma City':
        {'Thunder':
             {'player': ['Kevin Durant'], 'points': [15]}},
   'Los Angeles':
        {'Lakers':
             {'player': ['Jeremy Lin', 'Kobe Bryant'], 'points': [22, 51]},
         'Clippers':
             {'player': ['Blake Griffin'], 'points': [26]}
        }
}

lakersPlayers = byCity["Los Angeles"]["Lakers"]["player"] 
# ['Jeremy Lin', 'Kobe Bryant']

lakersPoints  = sum(byCity["Los Angeles"]["Lakers"]["points"]) 
# 73
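
The same lookup can be repeated for every team at once. As a rough sketch, the comprehension below walks the two nesting levels and sums each team's points; the `byCity` literal copies the aggregated output above so the snippet runs on its own, and the `totals` name is only illustrative:

```python
# Literal copy of the aggregated byCity structure shown above.
byCity = {
    "Oklahoma City": {
        "Thunder": {"player": ["Kevin Durant"], "points": [15]},
    },
    "Los Angeles": {
        "Lakers": {"player": ["Jeremy Lin", "Kobe Bryant"], "points": [22, 51]},
        "Clippers": {"player": ["Blake Griffin"], "points": [26]},
    },
}

# Total points per team, keyed by (city, team).
totals = {
    (city, team): sum(stats["points"])
    for city, teams in byCity.items()
    for team, stats in teams.items()
}
print(totals)
# {('Oklahoma City', 'Thunder'): 15, ('Los Angeles', 'Lakers'): 73, ('Los Angeles', 'Clippers'): 26}
```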