How to avoid a double loop in Python?

Date: 2019-06-05 11:28:11

Tags: python json

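Using the data from https://jsonplaceholder.typicode.com/todos, I want to count the "completed" items per user.

Currently, I first collect the existing user IDs as dictionary keys. Then, for each user, I loop over the whole dataset, check whether each element belongs to that user, and append its "completed" flag to the user's list of items.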

import json
from urllib import request

# Maps each user ID to the list of that user's 'completed' flags
users_items = {}

# Data source
uri = "https://jsonplaceholder.typicode.com/todos"

response = request.urlopen(uri).read()
data = json.loads(response)

def get_user_ids(items):
    for item in items:
        users_items[item['userId']] = None

def get_user_items():
    for uid in users_items:
        items = []
        for item in data:
            if item['userId'] == uid:
                items.append(item['completed'])
        users_items[uid] = items

done_items_by_user = {}
def count_completed_by_user():
    for user in users_items:
        done_items_by_user[user] = sum(users_items[user])

get_user_ids(data)
get_user_items()
count_completed_by_user()
print(done_items_by_user)

I particularly dislike the double loop and the initialization of the dictionary values with placeholders (None) in get_user_ids.

3 Answers:

Answer 0: (score: 3)

Using just a defaultdict object:

import json
from urllib import request
from collections import defaultdict

# Data source
uri = "https://jsonplaceholder.typicode.com/todos"

response = request.urlopen(uri).read()
data = json.loads(response)


def count_user_completed_items(data):
    result = defaultdict(int)
    for item in data:
        if item['completed']:
            result[item['userId']] += 1
    return dict(result)


print(count_user_completed_items(data))

Output (keys are user IDs, values are the number of completed items):

{1: 11, 2: 8, 3: 7, 4: 6, 5: 12, 6: 6, 7: 9, 8: 11, 9: 8, 10: 12}
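
For comparison, the same single-pass tally can also be written with collections.Counter. This is only a sketch: the sample records and the name count_completed are made up for illustration and stand in for the downloaded data.

```python
from collections import Counter

# Hypothetical sample standing in for the JSON returned by the todos endpoint
sample = [
    {"userId": 1, "id": 1, "title": "a", "completed": False},
    {"userId": 1, "id": 2, "title": "b", "completed": True},
    {"userId": 2, "id": 3, "title": "c", "completed": True},
    {"userId": 2, "id": 4, "title": "d", "completed": True},
    {"userId": 3, "id": 5, "title": "e", "completed": False},
]


def count_completed(items):
    """Count completed items per user in a single pass over items."""
    return dict(Counter(item['userId'] for item in items if item['completed']))


counts = count_completed(sample)
print(counts)
```

Because only completed items are counted, a user with no completed items (user 3 in this sample) does not appear in the result.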

Answer 1: (score: 0)

You can use the dict method get() to insert or update the count for each user ID:

done_items_by_user = dict()
for item in data:
    done_items_by_user[item['userId']] = done_items_by_user.get(item['userId'], 0) + item['completed']
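
This works because bool is a subclass of int in Python, so adding item['completed'] adds 1 for True and 0 for False. A minimal sketch on a made-up sample (the records below are hypothetical, not taken from the endpoint):

```python
# Hypothetical sample standing in for the downloaded todos
sample = [
    {"userId": 1, "completed": False},
    {"userId": 1, "completed": True},
    {"userId": 2, "completed": True},
    {"userId": 2, "completed": True},
    {"userId": 3, "completed": False},
]

done_items_by_user = {}
for item in sample:
    uid = item['userId']
    # get() returns 0 the first time a user is seen; True/False add 1/0
    done_items_by_user[uid] = done_items_by_user.get(uid, 0) + item['completed']

print(done_items_by_user)
```

Since every item touches its user's entry, a user with no completed items (user 3 here) is kept with a count of 0.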

Answer 2: (score: 0)

The popular pandas library lets you do this in one line:

import pandas as pd
complete_items_per_user = pd.DataFrame(data).groupby('userId')['completed'].sum()

If you are asking what can be done without pandas, you can avoid the explicit loops with a dict comprehension:

users = set(x['userId'] for x in data)
complete_items_per_user = {user: sum(x['completed'] for x in data if x['userId']==user) for user in users}
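
Note that this comprehension still scans data once for every user, so the nested iteration is hidden rather than removed. If a comprehension is preferred, one possible alternative is to sort by userId and group with itertools.groupby, which visits each item only once after the sort. A sketch, assuming a made-up sample and a hypothetical function name:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample standing in for the downloaded todos
sample = [
    {"userId": 2, "completed": True},
    {"userId": 1, "completed": False},
    {"userId": 3, "completed": False},
    {"userId": 1, "completed": True},
    {"userId": 2, "completed": True},
]


def completed_per_user(items):
    """Sum the 'completed' flags per user after sorting by userId."""
    by_user = itemgetter('userId')
    # groupby only merges adjacent items, so the input must be sorted first
    ordered = sorted(items, key=by_user)
    return {user: sum(x['completed'] for x in group)
            for user, group in groupby(ordered, key=by_user)}


complete_items_per_user = completed_per_user(sample)
print(complete_items_per_user)
```

The sort costs O(n log n), which is usually cheaper than rescanning the whole list for each user once there are many users.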