Select observations in R based on a maximum number listed in a column

Time: 2017-08-29 03:35:25

Tags: r dplyr

I hope I can explain this clearly! I have two data frames:

teachers = structure(list(Teacher = c(123L, 123L, 123L, 123L, 124L), 
    tStudents = c(3L, 3L, 4L, 3L, 4L), Term = c(1801L, 1802L, 1801L, 1803L, 1802L), 
    Course = structure(c(5L, 6L, 7L, 6L, 8L), .Label = c("ENGG", 
    "ENGG2", "LITT", "LITT2", "MATH", "MATH2", "PHYS", "SCIE"
    ), class = "factor")), .Names = c("Teacher", "tStudents", "Term", "Course"), row.names = c(NA, 5L), class = "data.frame")
enrols = structure(list(UniqueStudent = structure(c(3L, 2L, 1L, 5L, 4L), 
    .Label = c("1801-ENGG-N1-abcd1@abc.edu.au", "1801-MATH-C1-abcd1@abc.edu.au","1801-PHYS-L1-abcd1@abc.edu.au", "1802-MATH2-G1-abcd1@abc.edu.au", "1802-SCIE-K2-abcd1@abc.edu.au"), class = "factor"), Term = c(1801L,1801L, 1801L, 1802L, 1802L), Student.Email.Addresses = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "abcd1@abc.edu.au", class = "factor"), ID = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "s12344", class = "factor"), 
Gender.Description = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "M", class = "factor"), 
Age = c(12L, 12L, 12L, 12L, 12L), Program.Short.Description = structure(c(1L, 
1L, 1L, 1L, 1L), .Label = "LSC1", class = "factor"), Term.CC.CN = structure(c(3L, 
2L, 1L, 5L, 4L), .Label = c("1801-ENGG-N1", "1801-MATH-C1", 
"1801-PHYS-L1", "1802-MATH2-G1", "1802-SCIE-K2"), class = "factor"), 
Course.Code = structure(c(4L, 2L, 1L, 5L, 3L), .Label = c("ENGG", 
"MATH", "MATH2", "PHYS", "SCIE"), class = "factor"), Class.Number = structure(c(4L, 
1L, 5L, 3L, 2L), .Label = c("C1", "G1", "K2", "L1", "N1"), class = "factor"), 
Teacher = c(123L, 123L, 125L, 124L, 123L)), .Names = c("UniqueStudent", "Term", "Student.Email.Addresses", "ID", "Gender.Description", "Age", "Program.Short.Description", "Term.CC.CN", "Course.Code", "Class.Number", "Teacher"), row.names = c(NA, 5L), class = "data.frame")

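teachers$tStudents lists the maximum number of students that may be allocated to a teacher for each term and course. I have also pre-merged the course enrolments in "enrols", so that data lists the teacher for each course.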

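So, what I need to do is use the enrols data to create a class list for each c("teacher", "Term", "Course") combination in the teachers data, but each class list may contain at most the number of students listed in teachers$tStudents. Ideally, I would also like to select a representative spread of students, so that each new class list includes different genders, different ages, and students from different Program.Short.Description values.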

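I have tried merging in different ways in dplyr and can create the full list with all students, but I have not been able to use the teachers$tStudents column to limit the number of observations selected. Is this possible?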
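To make the goal concrete, here is a minimal sketch of one possible approach, not a confirmed solution. It assumes each Teacher/Term/Course combination appears only once in teachers and that Course.Code in enrols corresponds to Course in teachers. The helper `cap_class_lists` and the small `teachers_demo` and `enrols_demo` data frames are hypothetical and made up for illustration. Taking students round-robin across gender/age/program strata is only a heuristic for a varied selection, not a guarantee of representativeness.

```r
library(dplyr)

# Hypothetical helper (not from the original post): cap each class list at
# the teacher's tStudents, taking students round-robin across strata.
cap_class_lists <- function(enrols, teachers) {
  # Convert factors to character so the join keys have matching types.
  teachers_chr <- teachers %>% mutate(Course = as.character(Course))

  enrols %>%
    mutate(Course = as.character(Course.Code)) %>%
    # Keep only enrolments that match a Teacher/Term/Course row in teachers.
    inner_join(teachers_chr, by = c("Teacher", "Term", "Course")) %>%
    # Number the students within each gender/age/program stratum of a class,
    group_by(Teacher, Term, Course, Gender.Description, Age,
             Program.Short.Description) %>%
    mutate(stratum_rank = row_number()) %>%
    # then order each class so every stratum's first student comes before
    # any stratum's second student, and keep at most tStudents rows.
    group_by(Teacher, Term, Course) %>%
    arrange(stratum_rank, .by_group = TRUE) %>%
    filter(row_number() <= tStudents) %>%
    ungroup() %>%
    select(-stratum_rank)
}

# Made-up demo data using the same column names as the real data frames.
teachers_demo <- data.frame(
  Teacher = c(1L, 2L),
  Term = c(1801L, 1801L),
  Course = c("MATH", "PHYS"),
  tStudents = c(2L, 1L)
)
enrols_demo <- data.frame(
  ID = paste0("s", 1:5),
  Teacher = c(1L, 1L, 1L, 2L, 2L),
  Term = 1801L,
  Course.Code = c("MATH", "MATH", "MATH", "PHYS", "PHYS"),
  Gender.Description = c("M", "M", "F", "F", "M"),
  Age = c(12L, 12L, 12L, 13L, 12L),
  Program.Short.Description = "LSC1"
)

result <- cap_class_lists(enrols_demo, teachers_demo)
print(result)
```

With the real data the call would be `cap_class_lists(enrols, teachers)`. For a random rather than order-dependent pick, the rows could be shuffled first (for example with `slice_sample(prop = 1)`) before the strata are numbered.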

0 answers:

No answers