I have the following pandas DataFrame:
1. A John
2. A Juliet
3. A Joseph
4. A Romeo
5. A Chris
6. A John
7. A Juliet
8. A Joseph
9. A Romeo
10. A Chris
11. A John
12. B Juliet
13. B Joseph
14. B Romeo
15. B Chris
16. B John
17. C Juliet
18. C Joseph
19. C Romeo
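I have to filter on two conditions: a fixed number of rows per company, and each employee appearing 3 times.
My current logic only handles the second one, keeping each employee 3 times: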
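import pandas as pd

# Collect the first 3 rows of every employee, one employee at a time.
unique_employee = df.loc[:, "Employee"].unique().tolist()
count = 0
for i in unique_employee:
    if count == 0:
        # First employee: start the result frame.
        df2 = df[df['Employee'] == i].iloc[0:3, :]
        count += 1
    else:
        # Later employees: append their first 3 rows.
        df2 = pd.concat([df2, df[df['Employee'] == i].iloc[0:3, :]])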
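For reference, the loop above appears to select the same rows as a single `groupby(...).head(3)` call. The sketch below rebuilds the sample frame to show this; the company column name `abc` is an assumption (only `Employee` is named in my code), and `first_three` is just an illustrative variable name.

```python
import pandas as pd

# Rebuild the sample frame from the question. The company column name "abc"
# is an assumption; the question's code only names the "Employee" column.
names = ["John", "Juliet", "Joseph", "Romeo", "Chris"]
df = pd.DataFrame({
    "abc": ["A"] * 11 + ["B"] * 5 + ["C"] * 3,
    "Employee": [names[i % 5] for i in range(19)],
})

# First 3 rows of every employee, kept in the original row order
# (the loop version returns the same rows, ordered employee by employee).
first_three = df.groupby("Employee").head(3)

# Per-company totals of the selection, to compare against the target.
company_counts = first_three["abc"].value_counts().to_dict()
```

On this data the selection contains 11 rows for A, 4 for B and none for C, so it satisfies the per-employee condition but not the per-company one.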
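How do I add the second part of my condition too?
My expected output would be that company A has 8 instances, company B has 4 instances, company C has 3 instances, and each employee appears three times: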
A John
A John
B John
A Joseph
A Joseph
C Joseph
A Chris
A Chris
B Chris
A Juliet
B Juliet
C Juliet
A Romeo
B Romeo
C Romeo
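Answer 0 (score: 0)
In the following code, the column that stores "A", "B", "C" is named "abc".
Each ep is a (name, group) pair for one group identified by the "Employee" column, so ep[1] is that employee's rows. Counter returns a Counter object (similar to a dictionary) indicating how many rows of each abc value the employee has.
The counter is then compared with condition_dic (set in advance). If the requirement is met, the group is added to the output list.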
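from collections import Counter
import pandas as pd

# Group the rows by employee; iterating yields (name, group) pairs.
employees = df.groupby("Employee")
condition_dic = {'A': 8, 'B': 4, 'C': 3}
output = []
for ep in employees:
    if len(ep[1]) == 3:
        # The employee already has exactly 3 rows: keep them all.
        output.append(ep[1])
    else:
        # Count this employee's rows per company and keep the group
        # only if the counts equal condition_dic exactly.
        cnt = Counter(ep[1]['abc'])
        if cnt == condition_dic:
            output.append(ep[1])
# Combine the kept groups into one DataFrame.
output = pd.concat(output)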
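Since the per-company totals apply to the whole result rather than to any single employee, one possible way to satisfy both conditions at once is a small backtracking search over which 3 rows to keep per employee. The following is only a sketch under assumptions: `pick_rows` and its parameters are hypothetical names introduced here, the sample frame is rebuilt from the question, and a brute-force search like this is only practical for small frames.

```python
from collections import Counter
from itertools import combinations

import pandas as pd

# Sample data from the question; the column name "abc" follows the answer above.
names = ["John", "Juliet", "Joseph", "Romeo", "Chris"]
df = pd.DataFrame({
    "abc": ["A"] * 11 + ["B"] * 5 + ["C"] * 3,
    "Employee": [names[i % 5] for i in range(19)],
})


def pick_rows(df, quota, per_employee=3, company_col="abc", employee_col="Employee"):
    """Hypothetical helper: return row labels that give every employee
    `per_employee` rows while matching the per-company totals in `quota`,
    or None if no such selection exists."""
    groups = [g for _, g in df.groupby(employee_col, sort=False)]

    def search(i, remaining):
        if i == len(groups):
            # All employees handled: succeed only if every quota is used up.
            return [] if not any(remaining.values()) else None
        g = groups[i]
        seen = set()
        for combo in combinations(g.index, per_employee):
            used = Counter(g.loc[list(combo), company_col])
            key = tuple(sorted(used.items()))
            if key in seen:
                continue  # this company mix was already tried for this employee
            seen.add(key)
            if any(used[c] > remaining.get(c, 0) for c in used):
                continue  # would exceed a company quota
            left = {c: remaining[c] - used.get(c, 0) for c in remaining}
            rest = search(i + 1, left)
            if rest is not None:
                return list(combo) + rest
        return None

    return search(0, dict(quota))


selected = pick_rows(df, {"A": 8, "B": 4, "C": 3})
result = df.loc[selected]
```

Several selections can satisfy both conditions, and the sketch returns the first one it finds, so the chosen rows may differ from the expected output in the question while still giving 8 rows for A, 4 for B, 3 for C and 3 rows per employee.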