Is there a more efficient way to get the result (O(n + m) instead of O(n * m))?

Time: 2019-03-29 11:04:14

Tags: python loops

The original data looks like this. Each item has a type tag such as interests, family, behaviors, etc., and I want to group the items by this type field.

return_data = [
    {
        "id": "112",
        "name": "name_112",
        "type": "interests",
    },
    {
        "id": "113",
        "name": "name_113",
        "type": "interests",
    },
    {
        "id": "114",
        "name": "name_114",
        "type": "interests",
    },
    {
        "id": "115",
        "name": "name_115",
        "type": "behaviors",
    },
    {
        "id": "116",
        "name": "name_116",
        "type": "family",
    },
    {
        "id": "117",
        "name": "name_117",
        "type": "interests",
    },
    ...
]

The expected output format is, for example:

output_data = [
    {
        "interests": [
            {
                "id": "112",
                "name": "name_112"
            },
            {
                "id": "113",
                "name": "name_113"
            },
            ...
        ]
    },
    {
        "behaviors": [
            {
                "id": "115",
                "name": "name_115"
            },
            ...
        ]
    },
    {
        "family": [
            {
                "id": "116",
                "name": "name_116"
            },
            ...
        ]
    },
    ...
]

Here is my attempt:

type_list = []
for item in return_data:
    if item['type'] not in type_list:
        type_list.append(item['type'])

interests_list = []
for type in type_list:
    temp_list = []
    for item in return_data:
        if item['type'] == type:
            temp_list.append({"id": item['id'], "name": item['name']})
    interests_list.append({type: temp_list})

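Obviously my attempt is inefficient because it is O(n * m), where n is the number of items and m is the number of distinct types, but I can't find a more efficient way to solve the problem.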

Is there a more efficient way to get this result? Any comments are very welcome, thanks.

2 answers:

Answer 0: (score: 2)

Use a defaultdict to store the list of items for each type:

from collections import defaultdict

# group by type
temp_dict = defaultdict(list)
for item in return_data:
    temp_dict[item["type"]].append({"id": item["id"], "name": item["name"]})

# convert back into a list with the desired format
output_data = [{k: v} for k, v in temp_dict.items()]

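Output (the order shown comes from an older Python with unordered dicts; on Python 3.7+ the groups appear in the order each type is first seen):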

[
    {
        'behaviors': [
            {'name': 'name_115', 'id': '115'}
        ]
    }, 
    {
        'family': [
            {'name': 'name_116', 'id': '116'}
        ]
    }, 
    {
        'interests': [
            {'name': 'name_112', 'id': '112'},
            {'name': 'name_113', 'id': '113'},
            {'name': 'name_114', 'id': '114'},
            {'name': 'name_117', 'id': '117'}
        ]
    },
    ...
]
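
To make the O(n + m) claim concrete, here is a minimal self-contained sketch of the same defaultdict approach, run on the six sample records from the question. The helper name group_by_type and the variable names are hypothetical, chosen only for illustration. The loop touches each of the n items once, and the final list comprehension touches each of the m distinct types once.

```python
from collections import defaultdict


def group_by_type(items):
    """Group records by their "type" field (hypothetical helper name)."""
    groups = defaultdict(list)
    # One pass over the n items; dict lookup and list append are O(1) on average.
    for item in items:
        groups[item["type"]].append({"id": item["id"], "name": item["name"]})
    # One pass over the m distinct types to build the requested list shape.
    return [{key: members} for key, members in groups.items()]


# The six sample records shown in the question.
sample = [
    {"id": "112", "name": "name_112", "type": "interests"},
    {"id": "113", "name": "name_113", "type": "interests"},
    {"id": "114", "name": "name_114", "type": "interests"},
    {"id": "115", "name": "name_115", "type": "behaviors"},
    {"id": "116", "name": "name_116", "type": "family"},
    {"id": "117", "name": "name_117", "type": "interests"},
]

result = group_by_type(sample)
for group in result:
    for key, members in group.items():
        print(key, [member["id"] for member in members])
```

On Python 3.7 or later, where dicts keep insertion order, the groups come out as interests, behaviors, family, matching the order in which each type first appears in the input.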

If you don't want to import defaultdict, you can use a plain dictionary with setdefault:

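temp_dict = {}  # plain dict instead of defaultdict(list)
for item in return_data:
    temp_dict.setdefault(item["type"], []).append({"id": item["id"], "name": item["name"]})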

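The behavior is exactly the same, though slightly less efficient, mainly because the [] default is constructed on every call even when the key already exists.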
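
As a rough check that the two variants produce the same grouping, the sketch below runs both side by side. The three sample records and the names with_defaultdict and with_setdefault are assumptions made for illustration, not part of the original answer.

```python
from collections import defaultdict

# Hypothetical three-record sample in the question's format.
records = [
    {"id": "112", "name": "name_112", "type": "interests"},
    {"id": "115", "name": "name_115", "type": "behaviors"},
    {"id": "117", "name": "name_117", "type": "interests"},
]

with_defaultdict = defaultdict(list)
with_setdefault = {}
for item in records:
    entry = {"id": item["id"], "name": item["name"]}
    # defaultdict only calls list() when the key is missing.
    with_defaultdict[item["type"]].append(entry)
    # setdefault is handed a freshly built [] on every call, whether it is used or not.
    with_setdefault.setdefault(item["type"], []).append(entry)

# A defaultdict compares equal to a plain dict holding the same contents.
print(with_defaultdict == with_setdefault)
```

Both versions stay O(n + m); the difference is only a small constant cost per item.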

Answer 1: (score: 0)

Look into using a Python dictionary as a map, for example to collect the names of each type into one delimited string:

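typeMap, delimiter = {}, ","  # the "," delimiter is an assumption; the answer does not define it
for item in return_data:
    # the first sighting of a type starts the string; later ones append after the delimiter
    typeMap[item['type']] = typeMap[item['type']] + delimiter + item['name'] if item['type'] in typeMap else item['name']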