Question

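I have a DataFrame like this:

As you can see, students 1 and 3 each scored very high in one particular subject, but their overall scores are poor, whereas student 2 did not get the top mark in any subject yet has the highest overall score.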

overallScore = subject111Mark * subject111Weight + subject222Mark * subject222Weight
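As a minimal sketch of the formula above, assuming the marks and weights sit in the long format shown in the tables in this question (one row per student and subject), the overall score can be computed as a per-student weighted sum. The `Rank` column here is an assumption: a dense rank in descending order of `overallScore`, which happens to reproduce the values shown in the tables.

```python
import pandas as pd

# Sample data copied from the tables in the question
df = pd.DataFrame({
    "studentID":     [1, 1, 2, 2, 3, 3],
    "subjectID":     [111, 222, 111, 222, 111, 222],
    "subjectMark":   [100, 0, 90, 90, 0, 100],
    "subjectWeight": [0.4, 0.6, 0.4, 0.6, 0.4, 0.6],
})

# Weighted mark per row, summed per student and broadcast back to every row
weighted = df["subjectMark"] * df["subjectWeight"]
df["overallScore"] = weighted.groupby(df["studentID"]).transform("sum")

# Assumed ranking rule: dense rank, highest overall score gets rank 1
df["Rank"] = df["overallScore"].rank(method="dense", ascending=False).astype(int)

print(df)
```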

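So I want to check whether a student is an "all-rounder", meaning the student has the highest overall score but does not have the highest mark in any individual subject. If this condition is met, the student should be flagged as an all-rounder.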

And the df should look like this:

studentID  subjectID  subjectMark  subjectWeight  Rank  overallScore
1          111        100          0.4            3     40
1          222        0            0.6            3     40
2          111        90           0.4            1     90
2          222        90           0.6            1     90
3          111        0            0.4            2     60
3          222        100          0.6            2     60

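I have a follow-up question.
The answer given solves the problem for the last DataFrame, but what if I want to do this for every class in the following DataFrame?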

studentID  subjectID  subjectMark  subjectWeight  Rank  overallScore  AR
1          111        100          0.4            3     40            F
1          222        0            0.6            3     40            F
2          111        90           0.4            1     90            T
2          222        90           0.6            1     90            T
3          111        0            0.4            2     60            F
3          222        100          0.6            2     60            F

Answer 1

You can check it like this:

# s1: True on rows that hold the top mark for their subject
s1 = df.groupby('subjectID').subjectMark.transform('max').eq(df.subjectMark)
# s2: True on rows whose overall score equals the highest overall score
s2 = df.overallScore.eq(df.overallScore.max())
# all-rounder: top overall score, and none of the student's rows is a subject top
s2 & ((~s1).groupby(df['studentID']).transform('all'))
Out[1066]: 
0    False
1    False
2     True
3     True
4    False
5    False
dtype: bool
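To make the check above reproducible end to end, here is a small self-contained sketch that rebuilds the sample data from the question and stores the result in the `AR` column. The column values are typed in by hand from the table, and the longer variable names are only for readability.

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "studentID":     [1, 1, 2, 2, 3, 3],
    "subjectID":     [111, 222, 111, 222, 111, 222],
    "subjectMark":   [100, 0, 90, 90, 0, 100],
    "subjectWeight": [0.4, 0.6, 0.4, 0.6, 0.4, 0.6],
    "Rank":          [3, 3, 1, 1, 2, 2],
    "overallScore":  [40, 40, 90, 90, 60, 60],
})

# True on rows that hold the top mark for their subject
top_in_subject = df.groupby("subjectID")["subjectMark"].transform("max").eq(df["subjectMark"])

# True on rows whose overall score is the highest overall score
top_overall = df["overallScore"].eq(df["overallScore"].max())

# True for every row of a student who has no subject-top mark at all
no_subject_top = (~top_in_subject).groupby(df["studentID"]).transform("all")

df["AR"] = top_overall & no_subject_top
print(df)
```

With this data only the two rows of student 2 are flagged, which agrees with the boolean Series printed above.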

Answer 2

import pandas as pd

list_of_all_rounder_per_class = []

for class_id in data['classID'].unique():
    # rows belonging to the current class
    that_class = data.loc[data.classID == class_id]
    # True on rows that hold the top mark for their subject within this class
    condition1 = that_class.groupby('subjectID').subjectMark.transform('max').eq(that_class.subjectMark)
    # True on rows whose overall score equals the highest overall score in this class
    condition2 = that_class.overallScore.eq(that_class.overallScore.max())
    # both must hold: top overall score, and no subject-top mark for that student
    list_of_all_rounder_per_class.append(condition2 & ((~condition1).groupby(that_class['studentID']).transform('all')))

# turn each per-class result into a one-column frame and stack them on the original index
total_result = [result_for_each_class.to_frame('all_rounder') for result_for_each_class in list_of_all_rounder_per_class]
all_rounder = pd.concat(total_result)

data = data.join(all_rounder, how='outer')

I came up with a workaround, even though it may not be the best (most concise) way to achieve this.
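The per-class check can probably also be written without an explicit loop by adding `classID` to the grouping keys. The sketch below is one possible variant, not the method from either answer. The data is hypothetical: the follow-up question's DataFrame is not shown here, so the `classID` values and the marks for class "B" are invented for illustration.

```python
import pandas as pd

# Hypothetical data: class "A" repeats the question's sample, class "B" is made up
data = pd.DataFrame({
    "classID":       ["A"] * 6 + ["B"] * 4,
    "studentID":     [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    "subjectID":     [111, 222] * 5,
    "subjectMark":   [100, 0, 90, 90, 0, 100, 80, 70, 60, 95],
    "subjectWeight": [0.4, 0.6] * 5,
})

# Overall score: weighted sum per student within each class
weighted = data["subjectMark"] * data["subjectWeight"]
student_keys = [data["classID"], data["studentID"]]
data["overallScore"] = weighted.groupby(student_keys).transform("sum")

# True on rows that hold the top mark for their subject within their class
top_in_subject = (
    data.groupby(["classID", "subjectID"])["subjectMark"].transform("max").eq(data["subjectMark"])
)

# True on rows whose overall score is the highest in their class
top_overall = data.groupby("classID")["overallScore"].transform("max").eq(data["overallScore"])

# True for every row of a student who has no subject-top mark in their class
no_subject_top = (~top_in_subject).groupby(student_keys).transform("all")

data["all_rounder"] = top_overall & no_subject_top
print(data)
```

In this made-up data, class "A" flags student 2, while class "B" flags nobody because its top overall student also holds the top mark in subject 222.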

How to compute all-rounders in pandas

2 answers: