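Using the data from https://jsonplaceholder.typicode.com/todos, I want to count the "completed" items per user.
Currently I first collect the existing user IDs as dictionary keys, then for each user I check every element of the dataset to see whether it belongs to that user and append it to that user's list of items.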
import json
from urllib import request

# Data source
uri = "https://jsonplaceholder.typicode.com/todos"
response = request.urlopen(uri).read()
data = json.loads(response)

users_items = {}
done_items_by_user = {}

def get_user_ids(items):
    # Register every user ID as a key.
    for item in items:
        users_items[item['userId']] = None

def get_user_items():
    # For each user, scan the whole dataset and collect the 'completed' flags.
    for uid in users_items:
        items = []
        for item in data:
            if item['userId'] == uid:
                items.append(item['completed'])
        users_items[uid] = items

def count_completed_by_user():
    # True counts as 1, so sum() gives the number of completed items.
    for user in users_items:
        done_items_by_user[user] = sum(users_items[user])

get_user_ids(data)
get_user_items()
count_completed_by_user()
I particularly dislike the double loop and the way get_user_ids initializes the dictionary values with placeholders.
Answer 0 (score: 3)
With just a defaultdict object:
import json
from urllib import request
from collections import defaultdict

# Data source
uri = "https://jsonplaceholder.typicode.com/todos"
response = request.urlopen(uri).read()
data = json.loads(response)

def count_user_completed_items(data):
    result = defaultdict(int)
    for item in data:
        if item['completed']:
            result[item['userId']] += 1
    return dict(result)

print(count_user_completed_items(data))
Output (keys are user IDs, values are the number of completed items):
{1: 11, 2: 8, 3: 7, 4: 6, 5: 12, 6: 6, 7: 9, 8: 11, 9: 8, 10: 12}
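As a side note, the same single-pass idea can probably be written with collections.Counter instead of defaultdict. This is only a sketch, not what the answer above uses: the sample list and the count_completed name are made up for illustration so it runs without a network call. One thing it makes visible is that a user with no completed items does not appear in the result at all, which is also true of the defaultdict version.

```python
from collections import Counter

# Hypothetical sample records shaped like the /todos payload (not real API data).
sample = [
    {"userId": 1, "id": 1, "title": "a", "completed": True},
    {"userId": 1, "id": 2, "title": "b", "completed": False},
    {"userId": 2, "id": 3, "title": "c", "completed": True},
    {"userId": 2, "id": 4, "title": "d", "completed": True},
    {"userId": 3, "id": 5, "title": "e", "completed": False},
]

def count_completed(items):
    # Feed Counter one userId per completed item; it tallies them in one pass.
    return dict(Counter(item["userId"] for item in items if item["completed"]))

counts = count_completed(sample)
print(counts)  # {1: 1, 2: 2} -- user 3 has nothing completed and is absent
```

If every user should show up, including those with a count of 0, the keys would need to be seeded separately.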
Answer 1 (score: 0)
You can use the dict method get() to insert or update each user ID:
done_items_by_user = dict()
for item in data:
    done_items_by_user[item['userId']] = done_items_by_user.get(item['userId'], 0) + item['completed']
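It may help to spell out why adding item['completed'] works: in Python, bool is a subclass of int, so True adds 1 and False adds 0. The sketch below runs the same loop on a small made-up list (the sample records are an assumption, not API data) and shows a side effect of this approach: a user with no completed items still gets an entry, with a count of 0.

```python
# Hypothetical sample records shaped like the /todos payload (not real API data).
sample = [
    {"userId": 1, "completed": True},
    {"userId": 1, "completed": False},
    {"userId": 2, "completed": False},
]

done = {}
for item in sample:
    uid = item["userId"]
    # get() supplies 0 the first time a user is seen; the bool is added as 0 or 1.
    done[uid] = done.get(uid, 0) + item["completed"]

print(isinstance(True, int))  # True
print(done)                   # {1: 1, 2: 0}
```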
Answer 2 (score: 0)
The popular pandas library lets you do this in one line:
import pandas as pd
complete_items_per_user = pd.DataFrame(data).groupby('userId')['completed'].sum()
If you are asking what can be done without pandas, you can avoid explicit loops with a dict comprehension:
users = set(x['userId'] for x in data)
complete_items_per_user = {user: sum(x['completed'] for x in data if x['userId']==user) for user in users}
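Note that the comprehension above still walks the whole of data once per user, so the double loop is hidden rather than gone. If that matters, one possible standard-library alternative is to sort once and group with itertools.groupby. This is a sketch under assumptions: the sample list is made up, and groupby only merges adjacent records, which is why the sort comes first.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical, deliberately unsorted sample shaped like the /todos payload.
sample = [
    {"userId": 2, "completed": True},
    {"userId": 1, "completed": False},
    {"userId": 2, "completed": True},
    {"userId": 1, "completed": True},
    {"userId": 3, "completed": False},
]

by_user = itemgetter("userId")

# Sort by userId so that each user's records are adjacent, then sum per group.
ordered = sorted(sample, key=by_user)
complete_items_per_user = {
    user: sum(item["completed"] for item in group)
    for user, group in groupby(ordered, key=by_user)
}

print(complete_items_per_user)  # {1: 1, 2: 2, 3: 0}
```

The sort costs O(n log n), so for a dataset this small the difference is unlikely to be noticeable; the defaultdict answer remains the simplest single pass.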