I have some student exam points data:
  MAPPING PupilMatchingRefAnonymous POINTS
1    PHYS                         1     60
2    COMP                         1     40
3    ENGL                         1     20
4    MATH                         1     80
I would like to add each student's maths and English scores to every exam row, to make comparison easier:
  MAPPING PupilMatchingRefAnonymous POINTS MATH ENGL
1    PHYS                         1     60   80   20
2    COMP                         1     40   80   20
3    ENGL                         1     20   80   20
4    MATH                         1     80   80   20
I have tried the following code, without any luck:
comResults %>%
  select(MAPPING, PupilMatchingRefAnonymous, POINTS) %>%
  group_by(PupilMatchingRefAnonymous) %>%
  mutate(MATH=ifelse(MAPPING=="MATH", POINTS, NA))
Error: incompatible types, expecting a numeric vector
Any idea what I should try?
Answer 0 (score: 3)
Using base R, this looks quite simple:
df[as.character(df$MAPPING)] <- rep(df$POINTS, each = nrow(df))
df
# MAPPING PupilMatchingRefAnonymous POINTS PHYS COMP ENGL MATH
# 1 PHYS 1 60 60 40 20 80
# 2 COMP 1 40 60 40 20 80
# 3 ENGL 1 20 60 40 20 80
# 4 MATH 1 80 60 40 20 80
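The one-liner above appears to rely on the data containing a single pupil, since it repeats the same POINTS values down every row. A minimal base R sketch for several pupils, assuming each pupil has at most one row per subject, could look up each pupil's score with match(). The data frame below reuses the values of the two-student dput() dataset posted in this thread; the loop variable subj and the helper ref are names introduced only for illustration.

```r
# Two-pupil example data (same values as the dput() dataset in this thread)
df <- data.frame(
  MAPPING = rep(c("PHYS", "COMP", "ENGL", "MATH"), times = 2),
  PupilMatchingRefAnonymous = rep(1:2, each = 4),
  POINTS = c(60L, 40L, 20L, 80L, 20L, 40L, 0L, 80L),
  stringsAsFactors = FALSE
)

# For each subject of interest, look up that pupil's score by pupil id.
# Assumption: at most one row per pupil and subject; a pupil without the
# subject gets NA because match() returns NA for that pupil.
for (subj in c("MATH", "ENGL")) {
  ref <- df[df$MAPPING == subj, ]
  df[[subj]] <- ref$POINTS[match(df$PupilMatchingRefAnonymous,
                                 ref$PupilMatchingRefAnonymous)]
}
df
```

Only the MATH and ENGL columns are added here, which matches the output requested in the question rather than adding one column per subject.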
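Answer 1 (score: 2)
I'm not sure how dplyr handles merging, but this base R solution produces the result (the column names are less informative, but fixing that should be fairly simple):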
merge(merge(dat, dat[dat$MAPPING=="MATH", -1], by='PupilMatchingRefAnonymous'),
      dat[dat$MAPPING=="ENGL", -1], by='PupilMatchingRefAnonymous')
#--------
PupilMatchingRefAnonymous MAPPING POINTS.x POINTS.y POINTS
1 1 PHYS 60 80 20
2 1 COMP 40 80 20
3 1 ENGL 20 80 20
4 1 MATH 80 80 20
Here is a two-student dataset for further testing:
dput(dat)
structure(list(MAPPING = structure(c(4L, 1L, 2L, 3L, 4L, 1L,
2L, 3L), .Label = c("COMP", "ENGL", "MATH", "PHYS"), class = "factor"),
PupilMatchingRefAnonymous = c(1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L), POINTS = c(60L, 40L, 20L, 80L, 20L, 40L, 0L, 80L)), .Names = c("MAPPING",
"PupilMatchingRefAnonymous", "POINTS"), class = "data.frame", row.names = c(NA,
-8L))
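One possible way to get the MATH and ENGL column names directly from the nested merge() is to rename the POINTS column of each lookup table before merging. This is only a sketch under the assumption of one row per pupil and subject; the objects math, engl and out are illustrative names, and dat is rebuilt here with the same values as the dput() output so the snippet runs on its own.

```r
# Same values as the two-student dput() dataset above
dat <- data.frame(
  MAPPING = factor(rep(c("PHYS", "COMP", "ENGL", "MATH"), times = 2)),
  PupilMatchingRefAnonymous = rep(1:2, each = 4),
  POINTS = c(60L, 40L, 20L, 80L, 20L, 40L, 0L, 80L)
)

key <- "PupilMatchingRefAnonymous"

# One lookup table per subject, with POINTS renamed to the subject name
math <- setNames(dat[dat$MAPPING == "MATH", c(key, "POINTS")], c(key, "MATH"))
engl <- setNames(dat[dat$MAPPING == "ENGL", c(key, "POINTS")], c(key, "ENGL"))

# Merge both lookups back onto the full data by pupil id
out <- merge(merge(dat, math, by = key), engl, by = key)
out
```

Because the renamed columns no longer clash, merge() has no need to append the .x and .y suffixes seen in the output above. Note that merge() may reorder rows, so the result should be matched by pupil id rather than by position.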
Answer 2 (score: 1)
I think you are trying to convert this from long format to wide format, right?
If so, try this:
library(tidyr)
new.df <- comResults %>%
spread(MAPPING, POINTS)
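This gives one row per student, with all of their subject scores in that row. I know you only want maths and English, but perhaps this code can put you on the right track.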
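To get from that wide table back to the layout asked for in the question, one option is to keep only the MATH and ENGL columns and join them onto the original long data. The sketch below assumes dplyr and tidyr are installed and that each pupil has one row per subject; comResults is rebuilt with the two-student values from this thread, and wide and result are illustrative names. In newer tidyr versions pivot_wider() is the recommended replacement for spread(), which is kept here to match the answer.

```r
library(dplyr)
library(tidyr)

# Two-pupil example data (same values as the dput() dataset in this thread)
comResults <- data.frame(
  MAPPING = rep(c("PHYS", "COMP", "ENGL", "MATH"), times = 2),
  PupilMatchingRefAnonymous = rep(1:2, each = 4),
  POINTS = c(60L, 40L, 20L, 80L, 20L, 40L, 0L, 80L),
  stringsAsFactors = FALSE
)

# Long -> wide: one row per pupil, one column per subject,
# then keep only the two subjects wanted for comparison
wide <- comResults %>%
  spread(MAPPING, POINTS) %>%
  select(PupilMatchingRefAnonymous, MATH, ENGL)

# Attach each pupil's MATH and ENGL scores to every exam row
result <- left_join(comResults, wide, by = "PupilMatchingRefAnonymous")
result
```

left_join() keeps the rows of comResults in their original order, so every exam row ends up next to that pupil's maths and English points.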