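There are many files, but I picked these 3 to rank. I want to loop over them and display the rank for each individual file. The goal is to find the unique values in the Subject column with respect to StudentName and Year.
I have 3 files: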
Jack_2019.csv
StudentName Subject TypeofMoudle Grade Year
Jack Design 2D Modelling 4 2019
Jack Design 3D Modelling 4 2019
Jack Design AD 4 2019
Jack Networking CloudComputing 4 2019
Jack Networking NOS 4 2019
Jack Coding Mobile App 4 2019
Jack_2018.csv
StudentName Subject TypeofMoudle Grade Year
Jack Networking CloudComputing 4 2018
Jack Networking CloudComputing2 4 2018
Jack Design Video Editing 3 2018
Jack Design Photo Editing 4 2018
Jack Coding Web App 4 2018
Mary_2019.csv
StudentName Subject TypeofMoudle Grade Year
Mary Networking CloudComputing 4 2019
Mary Networking NOS1 4 2019
Mary Coding Web App 1 4 2019
After merging all the files, the cleaned data:
StudentName Subject TypeofMoudle Grade Year
Jack Design 2D Modelling 4 2019
Jack Design 3D Modelling 4 2019
Jack Design AD 4 2019
Jack Networking CloudComputing 4 2019
Jack Networking NOS 4 2019
Jack Coding Mobile App 4 2019
Jack Networking CloudComputing 4 2018
Jack Networking CloudComputing2 4 2018
Jack Design Video Editing 3 2018
Jack Design Photo Editing 4 2018
Jack Coding Web App 4 2018
Mary Networking CloudComputing 4 2019
Mary Networking NOS1 4 2019
Mary Coding Web App 1 4 2019
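The merge step could be sketched roughly as below. This assumes the files are ordinary comma-separated CSVs that all share the same header; `load_student_files` is a hypothetical helper name, not something from my original code.

```python
from pathlib import Path

import pandas as pd


def load_student_files(paths):
    """Read each CSV and stack the rows into one DataFrame, in the order given.

    Assumption: every file is comma-separated and has the same columns.
    """
    frames = [pd.read_csv(Path(p)) for p in paths]
    # ignore_index=True gives the merged frame a fresh 0..n-1 index
    return pd.concat(frames, ignore_index=True)


# Example call (the files must exist in the working directory):
# df = load_student_files(["Jack_2019.csv", "Jack_2018.csv", "Mary_2019.csv"])
```

Passing the paths in an explicit order keeps the rows in the same order as the merged table above.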
These are the reference columns I need:
StudentName Subject Year
Jack Design 2019
Jack Design 2019
Jack Design 2019
Jack Networking 2019
Jack Networking 2019
Jack Coding 2019
Jack Networking 2018
Jack Networking 2018
Jack Design 2018
Jack Design 2018
Jack Coding 2018
Mary Networking 2019
Mary Networking 2019
Mary Coding 2019
This is the result I want:
StudentName Subject Rank Year
Jack Design 1 2019
Jack Design 1 2019
Jack Design 1 2019
Jack Networking 2 2019
Jack Networking 2 2019
Jack Coding 3 2019
Jack Networking 1 2018
Jack Networking 1 2018
Jack Design 2 2018
Jack Design 2 2018
Jack Coding 3 2018
Mary Networking 1 2019
Mary Networking 1 2019
Mary Coding 2 2019
What I have tried:
df['Rank']=df.groupby(['StudentName','Year'])['Subject'].transform('count')
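Note that `transform('count')` returns the number of rows in each StudentName/Year group, repeated on every row, so it does not produce a rank. One possible direction is sketched below. It assumes the rank is meant to follow the order in which each Subject first appears within a StudentName/Year group, which matches the desired table but is only my reading of it. It uses `pd.factorize` to number the unique subjects per group; the sample frame is rebuilt inline from the reference columns.

```python
import pandas as pd

# Reference columns from the merged data, rebuilt inline for the sketch
df = pd.DataFrame({
    "StudentName": ["Jack"] * 11 + ["Mary"] * 3,
    "Subject": [
        "Design", "Design", "Design", "Networking", "Networking", "Coding",
        "Networking", "Networking", "Design", "Design", "Coding",
        "Networking", "Networking", "Coding",
    ],
    "Year": [2019] * 6 + [2018] * 5 + [2019] * 3,
})

# Assumption: rank = order of first appearance of each Subject within a
# (StudentName, Year) group. factorize assigns 0, 1, 2, ... in that order,
# so adding 1 gives ranks starting at 1.
df["Rank"] = (
    df.groupby(["StudentName", "Year"])["Subject"]
      .transform(lambda s: pd.factorize(s)[0] + 1)
      .astype(int)
)

# Loop over the groups, one per original file, and display each ranking.
# sort=False keeps the groups in the order they appear in the data.
for (name, year), group in df.groupby(["StudentName", "Year"], sort=False):
    print(f"{name}_{year}.csv")
    print(group[["StudentName", "Subject", "Rank", "Year"]].to_string(index=False))
```

If the rank should instead depend on how many modules each subject has, the same groupby could be combined with a per-subject count and a dense rank, but ties (for example Networking and Design in 2018) would then need a tie-breaking rule.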