Sapply with two conditions

Time: 2017-04-26 18:49:18

Tags: arrays r

I have two tables in the formats shown below. The first is table A:

students|Test Score|Year
A       |  100     |1993
B       |   81     |1992
C       |   92     |1992
D       |   88     |1993

The other, table B, looks like this:

Class | Students | Year
1     | {A,D}    |1993
2     | {B,C}    |1992

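I would like to do something in R that looks up, in table A, the students listed in the array in table B's Students column, and summarizes their scores in the following format: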

Class | Students | Mean Score
1     | {A,D}    |   94
2     | {B,C}    |   86.5

Is there a formula I can use to do the lookup and then combine the results through some operation in R?

I know the above can be done with:

B$MeanScore <- sapply(strsplit(gsub("[{}]","", B$Students), split=","),
   function(x) mean(A$Test.Score[A$Students %in% x]))

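But is there a way to add a second condition that also matches on year, i.e. the class year against the test year?

1 Answer:

Answer 0: (score: 0)

Going completely with jogo's suggestion here: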

A <- data.frame(students = c("A","B","C","D"), `Test Score` = c(100,81,92,88), Year = c(1993,1992,1992,1993))
A
#  students Test.Score Year
#1        A        100 1993
#2        B         81 1992
#3        C         92 1992
#4        D         88 1993

B <- data.frame(Class = c(1,2), Students = c("{A,D}","{B,C}"), Year = c(1993,1992))
B
#  Class Students Year
#1     1    {A,D} 1993
#2     2    {B,C} 1992

colnames(A) # taking note of the case sensitive "students" and "Year"
#[1] "students"   "Test.Score" "Year"   

s <- strsplit(gsub("[{}]","",B$Students), ",")
B.long <- data.frame(students = unlist(s), 
                     Class = rep(B$Class, sapply(s, length)), 
                     Year = rep(B$Year, sapply(s, length)))
B.long
#  students Class Year
#1        A     1 1993
#2        D     1 1993
#3        B     2 1992
#4        C     2 1992

# join on both keys so a score only counts when student AND year match
Newdf <- merge(A, B.long, by = c("Year","students"))
Newdf
#  Year students Test.Score Class
#1 1992        B         81     2
#2 1992        C         92     2
#3 1993        A        100     1
#4 1993        D         88     1

aggregate(Test.Score ~ Year + Class, Newdf, mean)
#  Year Class Test.Score
#1 1993     1       94.0
#2 1992     2       86.5
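
If you would rather stay close to the sapply call from the question, one option is mapply, which walks over the student lists and the class years in parallel, so the year can serve as the second condition. This is only a minimal sketch, assuming the column names used in the data frames above (students, Test.Score, Year); the names members and yr are made up for illustration:

```r
# Sample data, same values as above
A <- data.frame(students = c("A", "B", "C", "D"),
                Test.Score = c(100, 81, 92, 88),
                Year = c(1993, 1992, 1992, 1993))
B <- data.frame(Class = c(1, 2),
                Students = c("{A,D}", "{B,C}"),
                Year = c(1993, 1992),
                stringsAsFactors = FALSE)

# Strip the braces and split each class's student list into a character vector
members <- strsplit(gsub("[{}]", "", B$Students), ",")

# For each class: average the scores of rows in A whose student is in the
# class list AND whose year equals the class year
B$MeanScore <- mapply(
  function(x, yr) mean(A$Test.Score[A$students %in% x & A$Year == yr]),
  members, B$Year
)
B
# MeanScore is 94 for class 1 and 86.5 for class 2 with this sample data
```

Note that a class with no matching rows in A would get NaN here, since mean() of an empty vector is NaN, whereas the merge approach simply drops such a class from the result.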