I have two tables in the formats shown below. The first one, table A, is:
students|Test Score|Year
A | 100 |1993
B | 81 |1992
C | 92 |1992
D | 88 |1993
The other one, table B, looks like this:
Class | Students | Year
1 | {A,D} |1993
2 | {B,C} |1992
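I would like to do some operation in R that looks up the students listed in the array in table B's Students column within table A, and averages their scores into the following format: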
Class | Students | Mean Score
1 | {A,D} | 94
2 | {B,C} | 86.5
Is there any function I can use to do this lookup and then combine the results through some operation in R?
I know the above can be done with:
B$MeanScore <- sapply(strsplit(gsub("[{}]","", B$Students), split=","),
function(x) mean(A$Test.Score[A$Students %in% x]))
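But is there a way to add a second condition that also matches on year, i.e. the class year against the test year?
Answer 0 (score: 0)
Fully agreeing with jogo here: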
A <- data.frame(students = c("A","B","C","D"), `Test Score` = c(100,81,92,88), Year = c(1993,1992,1992,1993))
A
# students Test.Score Year
#1 A 100 1993
#2 B 81 1992
#3 C 92 1992
#4 D 88 1993
B <- data.frame(Class = c(1,2), Students = c("{A,D}","{B,C}"), Year = c(1993,1992))
B
# Class Students Year
#1 1 {A,D} 1993
#2 2 {B,C} 1992
colnames(A) # taking note of the case sensitive "students" and "Year"
#[1] "students" "Test.Score" "Year"
s <- strsplit(gsub("[{}]","",B$Students), ",")
B.long <- data.frame(students = unlist(s),
Class = rep(B$Class, sapply(s, length)),
Year = rep(B$Year, sapply(s, length)))
B.long
# students Class Year
#1 A 1 1993
#2 D 1 1993
#3 B 2 1992
#4 C 2 1992
Newdf <- merge(A, B.long, by = c("Year", "students")) # join on both year and student
Newdf
# Year students Test.Score Class
#1 1992 B 81 2
#2 1992 C 92 2
#3 1993 A 100 1
#4 1993 D 88 1
aggregate(Test.Score ~ Year + Class, Newdf, mean)
#Year Class Test.Score
#1 1993 1 94.0
#2 1992 2 86.5
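For comparison, here is a minimal sketch of how the `sapply` one-liner from the question could be extended with the year condition directly, without reshaping B. It swaps `sapply` for `mapply` so that each class roster is paired with its own year. The sample data is copied from the question, the names `rosters`, `members` and `yr` are made up for illustration, and the student column is assumed to be the lowercase `students` as in table A.

```r
# Sample data copied from the question
A <- data.frame(students = c("A", "B", "C", "D"),
                Test.Score = c(100, 81, 92, 88),
                Year = c(1993, 1992, 1992, 1993))
B <- data.frame(Class = c(1, 2),
                Students = c("{A,D}", "{B,C}"),
                Year = c(1993, 1992),
                stringsAsFactors = FALSE)

# Strip the braces and split each roster into a character vector
rosters <- strsplit(gsub("[{}]", "", B$Students), split = ",")

# For each class, average only the rows of A that match both
# the roster and the class year (hypothetical helper arguments)
B$MeanScore <- mapply(
  function(members, yr) {
    mean(A$Test.Score[A$students %in% members & A$Year == yr])
  },
  rosters, B$Year
)
B
```

This keeps B in the Class | Students | Mean Score layout asked for in the question. The merge-and-aggregate route above is probably the better fit for large tables, since this version scans A once per class.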