The dataset HAVE is an edge list of phone-call data between characters from Recess:
Student    Friend        nCalls
TJ         Spinelli      3
TJ         Gretchen      7
TJ         Gus           6
TJ         Vince         8
TJ         King Bob      1
TJ         Mikey         2
Spinelli   TJ            3
Spinelli   Vince         2
Randall    Ms. Finster   17
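The dataset NEED contains all of the original columns from HAVE plus a new variable, nCallsPerStudent, which holds the total number of calls made by each Student: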
Student    Friend        nCalls   nCallsPerStudent
TJ         Spinelli      3        27
TJ         Gretchen      7        27
TJ         Gus           6        27
TJ         Vince         8        27
TJ         King Bob      1        27
TJ         Mikey         2        27
Spinelli   TJ            3        5
Spinelli   Vince         2        5
Randall    Ms. Finster   17       17
How do I get from HAVE to NEED?
Answer 0 (score: 2)
We can group by Student and use mutate to create the new column:
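library(dplyr)
# df holds the HAVE data
df %>%
  group_by(Student) %>%
  # sum(nCalls) is computed within each Student group and repeated on every row of that group
  mutate(nCallsPerStudent = sum(nCalls))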
Or, using base R:
df$nCallsPerStudent <- with(df, ave(nCalls, Student, FUN = sum))
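
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the base R approach. The object names `have` and `need` are assumptions chosen to mirror the question, and the data frame is typed in by hand from the table above, so the way the real data is loaded may differ.

```r
# Rebuild the HAVE edge list from the question by hand.
# (`have` and `need` are illustrative names, not from the original answer.)
have <- data.frame(
  Student = c("TJ", "TJ", "TJ", "TJ", "TJ", "TJ",
              "Spinelli", "Spinelli", "Randall"),
  Friend  = c("Spinelli", "Gretchen", "Gus", "Vince", "King Bob", "Mikey",
              "TJ", "Vince", "Ms. Finster"),
  nCalls  = c(3, 7, 6, 8, 1, 2, 3, 2, 17),
  stringsAsFactors = FALSE
)

# ave() applies FUN within each level of the grouping variable and returns
# a vector of the same length as the input, so the group total is repeated
# on every row belonging to that Student.
need <- have
need$nCallsPerStudent <- ave(have$nCalls, have$Student, FUN = sum)

print(need)
```

Because ave() keeps the original row order and length, no merge step is needed to attach the totals to the rows. The dplyr version should give the same column, although it returns a grouped tibble unless ungroup() is called afterwards.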