Grouping a Python list of dictionaries

Date: 2016-06-06 22:10:24

Tags: python list dictionary

I have some JSON data from an API as a list of dictionaries, for example:

entities = [
    {'name': 'McDonalds', 'city': 'New York', 'gross': 250000000, 'id': '000001'},
    {'name': 'McDonalds', 'city': 'Philadelphia', 'gross': 190000000, 'id': '000002'},
    {'name': 'Shake Shack', 'city': 'Los Angeles', 'gross': 17000000, 'id': '000003'},
    {'name': 'In-N-Out Burger', 'city': 'Houston', 'gross': 23000000, 'id': '000004'},
    {'name': 'In-N-Out Burger', 'city': 'Atlanta', 'gross': 12000000, 'id': '000005'},
    {'name': 'In-N-Out Burger', 'city': 'Dallas', 'gross': 950000, 'id': '000006'},
]

I'm trying to group all the entries with the same name into another list of dictionaries, each named after its business.

def group_entities(entities):

    entity_groups = []

    # Establish a blank list for each unique name
    for entity in entities:
        entity['name'] = []
        entity_groups.append(entity['name'])

    # Within each business's list, add separate dictionaries with details
    for entity in entities:
        entity['name'].append({
            'name':entity['name'],
            'city':entity['city'],
            'gross':entity['gross'],
            'id':entity['id']
            })

    entity_groups.extend(entity['name'])

    return entity_groups

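I can't use entity['name'] as a variable name, because that just overwrites the original value, nor can I use the string version of the name. I want to end up with data I can iterate over and display like this: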

Business
  • All City 1 Dictionary Values 
  • All City 2 Dictionary Values, etc
Business
  • All City 1 Dictionary Values 
  • All City 2 Dictionary Values, etc

I'm not sure how to research this further, because I don't know the right "googleable" term for what I'm trying to do.

3 Answers:

Answer 0: (score: 3)

If your data is already sorted by name (groupby only groups consecutive items that share a key):

from itertools import groupby
from operator import itemgetter

entities = [
    {'name': 'McDonalds', 'city': 'New York', 'gross': 250000000, 'id': '000001'},
    {'name': 'McDonalds', 'city': 'Philadelphia', 'gross': 190000000, 'id': '000002'},
    {'name': 'Shake Shack', 'city': 'Los Angeles', 'gross': 17000000, 'id': '000003'},
    {'name': 'In-N-Out Burger', 'city': 'Houston', 'gross': 23000000, 'id': '000004'},
    {'name': 'In-N-Out Burger', 'city': 'Atlanta', 'gross': 12000000, 'id': '000005'},
    {'name': 'In-N-Out Burger', 'city': 'Dallas', 'gross': 950000, 'id': '000006'},
]
data = [{k: list(v)} for k, v in groupby(entities, itemgetter("name"))]

Which gives you:

[{'McDonalds': [{'id': '000001', 'city': 'New York', 'name': 'McDonalds', 'gross': 250000000}, {'id': '000002', 'city': 'Philadelphia', 'name': 'McDonalds', 'gross': 190000000}]}, {'Shake Shack': [{'id': '000003', 'city': 'Los Angeles', 'name': 'Shake Shack', 'gross': 17000000}]}, {'In-N-Out Burger': [{'id': '000004', 'city': 'Houston', 'name': 'In-N-Out Burger', 'gross': 23000000}, {'id': '000005', 'city': 'Atlanta', 'name': 'In-N-Out Burger', 'gross': 12000000}, {'id': '000006', 'city': 'Dallas', 'name': 'In-N-Out Burger', 'gross': 950000}]}]
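
Because groupby only merges adjacent items, input that is not ordered by name would yield the same business in several separate groups. One option is to sort by the same key first. The following is a minimal sketch; the `unsorted_entities` sample and the variable names are made up for illustration and are not part of the question's data.

```python
from itertools import groupby
from operator import itemgetter

# Illustrative sample where entries with the same name are NOT adjacent.
unsorted_entities = [
    {'name': 'McDonalds', 'city': 'New York', 'gross': 250000000, 'id': '000001'},
    {'name': 'Shake Shack', 'city': 'Los Angeles', 'gross': 17000000, 'id': '000003'},
    {'name': 'McDonalds', 'city': 'Philadelphia', 'gross': 190000000, 'id': '000002'},
]

by_name = itemgetter('name')

# Without sorting, 'McDonalds' is split into two separate groups.
unsorted_groups = [{k: list(v)} for k, v in groupby(unsorted_entities, by_name)]

# Sorting by the grouping key first makes equal names adjacent.
sorted_groups = [
    {k: list(v)}
    for k, v in groupby(sorted(unsorted_entities, key=by_name), by_name)
]

print(len(unsorted_groups))  # 3
print(len(sorted_groups))    # 2
```

Since sorted() is stable, the entries inside each group keep their original relative order.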

Or, if you don't want the name inside each dict:

keys = ("id", "gross", "city")

data = [{k: [dict(zip(keys, itemgetter(*keys)(dct))) for dct in v]} for k, v in groupby(entities, itemgetter("name"))]

If the data is not sorted, you can use a defaultdict:

from collections import defaultdict

d = defaultdict(list)

for entity in entities:
    d[entity["name"]].append(dict(entity))
print([{k: v} for k,v in d.items()])

Again, if you want to leave the name out, or you want to reuse the original dicts and don't mind mutating them:

from collections import defaultdict

d = defaultdict(list)

for entity in entities:
    d[entity.pop("name")].append(entity)
print([{k: v} for k,v in d.items()])

That will give you (the order of the groups may differ depending on your Python version):

[{'Shake Shack': [{'id': '000003', 'city': 'Los Angeles', 'gross': 17000000}]}, {'McDonalds': [{'id': '000001', 'city': 'New York', 'gross': 250000000}, {'id': '000002', 'city': 'Philadelphia', 'gross': 190000000}]}, {'In-N-Out Burger': [{'id': '000004', 'city': 'Houston', 'gross': 23000000}, {'id': '000005', 'city': 'Atlanta', 'gross': 12000000}, {'id': '000006', 'city': 'Dallas', 'gross': 950000}]}]

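It all depends on whether you want to use the original dicts again and/or whether you want to keep the name in each dict. You can combine parts of the logic to get whatever format you prefer.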
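
As one way of combining those pieces, here is a rough sketch of a helper that groups by name without mutating the input and lets the caller choose whether to keep the name. The function name `group_by_name`, the `keep_name` flag, and the `sample` list are hypothetical and only serve this illustration.

```python
from collections import defaultdict

def group_by_name(entities, keep_name=True):
    """Group dicts by their 'name' key without mutating the input.

    group_by_name and keep_name are hypothetical names for this sketch.
    """
    groups = defaultdict(list)
    for entity in entities:
        if keep_name:
            item = dict(entity)  # shallow copy, original stays intact
        else:
            # copy everything except the 'name' key
            item = {k: v for k, v in entity.items() if k != 'name'}
        groups[entity['name']].append(item)
    return dict(groups)

# Shortened sample based on the question's data.
sample = [
    {'name': 'McDonalds', 'city': 'New York', 'gross': 250000000, 'id': '000001'},
    {'name': 'McDonalds', 'city': 'Philadelphia', 'gross': 190000000, 'id': '000002'},
    {'name': 'Shake Shack', 'city': 'Los Angeles', 'gross': 17000000, 'id': '000003'},
]

grouped = group_by_name(sample, keep_name=False)
print(sorted(grouped))            # ['McDonalds', 'Shake Shack']
print(len(grouped['McDonalds']))  # 2
print('name' in sample[0])        # True, the input is untouched
```

This returns a single dict keyed by business name rather than a list of one-key dicts; wrap it as `[{k: v} for k, v in grouped.items()]` if the list form is needed.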

Answer 1: (score: 1)

This should work:

def group_entities(entities):

    entity_groups = {}

    # Within each business's list, add separate dictionaries with details
    for entity in entities:
        name = entity['name']   # name is the key for entity_groups
        del entity['name']      # remove it from each entity
        # add the entity to the entity_groups with the key (name)
        entity_groups[name] = entity_groups.get(name, []) + [entity]

    return entity_groups

If you want to keep the name in each entity, remove the del statement.

Answer 2: (score: 1)

bycompany = {}
for ent in entities:
    if ent['name'] not in bycompany:
        # if there is no location list for this company name,
        # then start a new list for this company.
        bycompany[ent['name']] = []

    # Add the dict to the list of locations for this company.
    bycompany[ent['name']].append(ent)
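
Once the data is grouped into a dict like this, the layout the question asks for can be produced by iterating over the keys and their lists. A minimal sketch, assuming a dict shaped like bycompany; the `format_groups` helper and the `bycompany_sample` data are hypothetical names used only here.

```python
def format_groups(groups):
    """Render a {business: [location dicts]} mapping as indented text.

    format_groups is a hypothetical helper for this sketch.
    """
    lines = []
    for business, locations in groups.items():
        lines.append(business)
        for loc in locations:
            # Extra keys such as 'name' are simply ignored by format().
            lines.append('  • {city}: gross {gross}, id {id}'.format(**loc))
    return '\n'.join(lines)

# Shortened sample in the same shape as bycompany.
bycompany_sample = {
    'McDonalds': [
        {'name': 'McDonalds', 'city': 'New York', 'gross': 250000000, 'id': '000001'},
        {'name': 'McDonalds', 'city': 'Philadelphia', 'gross': 190000000, 'id': '000002'},
    ],
}

print(format_groups(bycompany_sample))
```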