I have the following data frame:
  days classtype scores
1    1         a     49
2    1         b     47
3    2         a     36
4    2         b     41
It is produced by this code:
days=c(1,1,2,2)
classtype=c("a","b","a","b")
scores=c(49,47,36,41)
myData=data.frame(days,classtype,scores)
print(myData)
Which lines do I need to add to the code to compute the difference in scores between the two classes for each day? I want this output:
  days difference_in_scores
1    1                    2
2    2                   -5
Answer 0 (score: 2)
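If your data is always laid out as shown (for each day, one row for class a followed by one row for class b), you can do this quite neatly with data.table: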
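library(data.table)
setDT(myData)
# diff() returns b - a within each day; use -diff(scores) to get a - b as in the question
myData[, diff(scores), by = days]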
   days V1
1:    1 -2
2:    2  5
Or using only base R:
aggregate(scores ~ days, myData, FUN = diff)
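Both calls above rely on diff(), which subtracts the earlier row from the later one, so they give b - a and depend on the rows being ordered a, b within each day. As a minimal base R sketch that does not depend on row order and matches the sign and column name requested in the question, one could split the data by class and merge on days. The names `a`, `b`, `wide` and `result` are illustrative only, not part of the original answer:

```r
days <- c(1, 1, 2, 2)
classtype <- c("a", "b", "a", "b")
scores <- c(49, 47, 36, 41)
myData <- data.frame(days, classtype, scores)

# One data frame per class, keeping only the key and the score
a <- myData[myData$classtype == "a", c("days", "scores")]
b <- myData[myData$classtype == "b", c("days", "scores")]

# Join on days; the clashing "scores" columns become scores_a and scores_b
wide <- merge(a, b, by = "days", suffixes = c("_a", "_b"))

# Difference defined explicitly as class a minus class b
result <- data.frame(
  days = wide$days,
  difference_in_scores = wide$scores_a - wide$scores_b
)
print(result)
```

Because the subtraction names the two classes explicitly, swapping the order of the rows in myData should not change the result.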
Answer 1 (score: 1)
Here is one approach you could take:
library(dplyr)
library(reshape2)
days=c(1,1,2,2)
classtype=c("a","b","a","b")
scores=c(49,47,36,41)
myData=data.frame(days,classtype,scores)
myData %>%
  # convert the data to wide format
  dcast(days ~ classtype,
        value.var = "scores") %>%
  # calculate differences
  mutate(difference_in_scores = a - b) %>%
  # remove columns (just to match your desired output)
  select(days, difference_in_scores)
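The same reshape-then-subtract idea can also be sketched with tidyr instead of reshape2. This is an alternative to the answer's code, not what its author wrote, and it assumes tidyr 1.0.0 or later is installed (for `pivot_wider`). The name `result` is illustrative:

```r
library(dplyr)
library(tidyr)

myData <- data.frame(
  days = c(1, 1, 2, 2),
  classtype = c("a", "b", "a", "b"),
  scores = c(49, 47, 36, 41)
)

result <- myData %>%
  # one row per day, one column per class (columns "a" and "b")
  pivot_wider(names_from = classtype, values_from = scores) %>%
  # difference defined as class a minus class b
  mutate(difference_in_scores = a - b) %>%
  # keep only the columns from the desired output
  select(days, difference_in_scores)
print(result)
```

Note that the result here is a tibble rather than a plain data frame, so it prints slightly differently from the output shown in the question.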