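I have a list of employees who worked at different times of the day, and I want to count the number of days each person worked, for example:
FOO: 3, BAZ: 3, NOM: 1, and so on.
This is how I receive the raw data: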
my_list = [('NOM', datetime.date(2030, 1, 1)),
('BAR', datetime.date(2019, 4, 8)),
('HAM', datetime.date(2019, 4, 8)),
('FOO', datetime.date(2019, 4, 8)),
('BAZ', datetime.date(2019, 4, 8)),
('BAR', datetime.date(2019, 4, 10)),
('BAZ', datetime.date(2019, 4, 10)),
('FOO', datetime.date(2019, 4, 10)),
('HAM', datetime.date(2019, 4, 10)),
('HAM', datetime.date(2019, 4, 10)),
('FOO', datetime.date(2019, 4, 10)),
('BAR', datetime.date(2019, 4, 10)),
('BAZ', datetime.date(2019, 4, 10)),
('BAZ', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAZ', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAZ', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAZ', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11))]
I managed to reduce the list to unique (person, date) pairs like this:
a = Counter(set(my_list))
This removes each person's duplicate entries within a day:
Counter({('HAM', datetime.date(2019, 4, 8)): 1,
('HAM', datetime.date(2019, 4, 10)): 1,
('HAM', datetime.date(2019, 4, 11)): 1,
('BAR', datetime.date(2019, 4, 8)): 1,
('BAR', datetime.date(2019, 4, 10)): 1,
('BAR', datetime.date(2019, 4, 11)): 1,
('FOO', datetime.date(2019, 4, 8)): 1,
('FOO', datetime.date(2019, 4, 10)): 1,
('FOO', datetime.date(2019, 4, 11)): 1,
('BAZ', datetime.date(2019, 4, 8)): 1,
('BAZ', datetime.date(2019, 4, 10)): 1,
('BAZ', datetime.date(2019, 4, 11)): 1,
('NOM', datetime.date(2030, 1, 1)): 1})
This is where I am stuck. I want to get from the above to:
HAM:3
BAR:3
FOO:3
BAZ:3
NOM:1
Answer 0: (score: 6)
You can use
import collections
collections.Counter(x for x, y in set(my_list))
Out[251]: Counter({'BAR': 3, 'BAZ': 3, 'FOO': 3, 'HAM': 3, 'NOM': 1})
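To make the two steps in that one-liner explicit, here is a minimal sketch on a shortened data set. The names sample, unique_pairs and days_worked are made up for illustration and are not part of the question. The set drops repeated (name, date) pairs, so each remaining pair stands for one distinct day, and the Counter then tallies how many pairs each name has.

```python
import datetime
from collections import Counter

# Hypothetical sample, a shortened stand-in for my_list from the question.
sample = [
    ('NOM', datetime.date(2030, 1, 1)),
    ('FOO', datetime.date(2019, 4, 8)),
    ('FOO', datetime.date(2019, 4, 10)),
    ('FOO', datetime.date(2019, 4, 10)),  # same person, same day: a duplicate
    ('BAZ', datetime.date(2019, 4, 8)),
    ('BAZ', datetime.date(2019, 4, 11)),
    ('BAZ', datetime.date(2019, 4, 11)),  # another duplicate
]

# Step 1: keep each (name, date) pair only once.
unique_pairs = set(sample)

# Step 2: count how many distinct dates are left for each name.
days_worked = Counter(name for name, _ in unique_pairs)

# Sort by name so the printed order does not depend on set iteration order.
for name, days in sorted(days_worked.items()):
    print(f"{name}: {days}")
```

With this sample, FOO and BAZ each end up with 2 days and NOM with 1.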
Answer 1: (score: 3)
Use itertools.groupby:
from itertools import groupby
from operator import itemgetter
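result = {key: len(list(group)) for key, group in groupby(sorted(set(my_list)), key=itemgetter(0))}
print(result)
This removes duplicate (name, date) pairs from my_list, sorts what remains by name (ties are broken by date), splits it into groups by name, and stores each name with the size of its group as a key-value pair in a dict. Each group that groupby yields is an iterator, so it is turned into a list before len is called.
Output:
{'BAR': 3, 'BAZ': 3, 'FOO': 3, 'HAM': 3, 'NOM': 1}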
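As a rough illustration of how groupby behaves here, the sketch below runs the same idea on a shortened data set. The names sample, unique_sorted and result are invented for this example. It assumes only the standard library. Two details matter: groupby only merges adjacent items, so the input has to be sorted by the grouping key first, and each group is a lazy iterator without a length, so its items are counted explicitly.

```python
import datetime
from itertools import groupby
from operator import itemgetter

# Hypothetical sample, a shortened stand-in for my_list from the question.
sample = [
    ('NOM', datetime.date(2030, 1, 1)),
    ('FOO', datetime.date(2019, 4, 8)),
    ('BAZ', datetime.date(2019, 4, 8)),
    ('FOO', datetime.date(2019, 4, 10)),
    ('FOO', datetime.date(2019, 4, 10)),  # duplicate entry for the same day
    ('BAZ', datetime.date(2019, 4, 11)),
]

# Deduplicate, then sort so that equal names sit next to each other.
unique_sorted = sorted(set(sample))

result = {}
for name, group in groupby(unique_sorted, key=itemgetter(0)):
    # group is an iterator, so count its items instead of calling len() on it.
    result[name] = sum(1 for _ in group)

print(result)  # {'BAZ': 2, 'FOO': 2, 'NOM': 1}
```

Counting with sum(1 for _ in group) avoids building a temporary list for each name; len(list(group)) gives the same number.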
Answer 2: (score: 1)
Convert the list to a pandas DataFrame, drop the duplicates, and group by name
import datetime
import pandas as pd
my_list = [('NOM', datetime.date(2030, 1, 1)),
('BAR', datetime.date(2019, 4, 8)),
('HAM', datetime.date(2019, 4, 8)),
('FOO', datetime.date(2019, 4, 8)),
('BAZ', datetime.date(2019, 4, 8)),
('BAR', datetime.date(2019, 4, 10)),
('BAZ', datetime.date(2019, 4, 10)),
('FOO', datetime.date(2019, 4, 10)),
('HAM', datetime.date(2019, 4, 10)),
('HAM', datetime.date(2019, 4, 10)),
('FOO', datetime.date(2019, 4, 10)),
('BAR', datetime.date(2019, 4, 10)),
('BAZ', datetime.date(2019, 4, 10)),
('BAZ', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAZ', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAZ', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAZ', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11))]
# Convert the list of tuples to a DataFrame
df = pd.DataFrame(my_list, columns=['name', 'date'])
# Remove duplicate (name, date) rows
df.drop_duplicates(inplace=True)
# Group by name and count the remaining rows (one per distinct day)
df.groupby('name').count()
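A possible variation, shown only as a sketch on a shortened data set, is to count distinct dates per name directly with nunique, which makes the separate deduplication step unnecessary. The names sample and days_worked are made up for this example, and it assumes pandas is installed.

```python
import datetime
import pandas as pd

# Hypothetical sample, a shortened stand-in for my_list from the question.
sample = [
    ('NOM', datetime.date(2030, 1, 1)),
    ('FOO', datetime.date(2019, 4, 8)),
    ('BAZ', datetime.date(2019, 4, 8)),
    ('FOO', datetime.date(2019, 4, 10)),
    ('FOO', datetime.date(2019, 4, 10)),  # duplicate entry for the same day
    ('BAZ', datetime.date(2019, 4, 11)),
]

df = pd.DataFrame(sample, columns=['name', 'date'])

# nunique counts the distinct dates in each group, so duplicates are ignored.
days_worked = df.groupby('name')['date'].nunique()

print(days_worked.to_dict())  # {'BAZ': 2, 'FOO': 2, 'NOM': 1}
```

The result is a Series indexed by name, which can be turned into a plain dict with to_dict() as shown.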