I think the easiest way to explain what I want is to give you a simple example.
Here is my mock dataset:
example <- data.frame(
  location   = c(rep('Location A', 5), rep('Location B', 4), rep('Location C', 7)),
  factor_lvl = as.factor(c(paste('level', 1:5, sep = ' '),
                           paste('level', 1:4, sep = ' '),
                           paste('level', 1:7, sep = ' '))),
  no_answers = floor(runif(16, min = 0, max = 20))
)
So it looks like this:
     location factor_lvl no_answers
1  Location A    level 1          1
2  Location A    level 2         13
3  Location A    level 3          4
4  Location A    level 4          8
5  Location A    level 5          6
6  Location B    level 1         13
7  Location B    level 2         17
8  Location B    level 3         15
9  Location B    level 4          7
10 Location C    level 1          5
11 Location C    level 2          8
12 Location C    level 3          1
13 Location C    level 4         19
14 Location C    level 5         13
15 Location C    level 6         18
16 Location C    level 7          0
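What I want is to sum the number of answers for each location and repeat that total on every row until the next location begins. For example, Location A has 5 factor levels and 32 answers in total, so I want the new column to start with five consecutive 32s, and so on.
To make it clearer, the desired output is this: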
wanted_result <- cbind(example, total_answers = c(rep(32,5), rep(52, 4), rep(64, 7) ) )
     location factor_lvl no_answers total_answers
1  Location A    level 1          1            32
2  Location A    level 2         13            32
3  Location A    level 3          4            32
4  Location A    level 4          8            32
5  Location A    level 5          6            32
6  Location B    level 1         13            52
7  Location B    level 2         17            52
8  Location B    level 3         15            52
9  Location B    level 4          7            52
10 Location C    level 1          5            64
11 Location C    level 2          8            64
12 Location C    level 3          1            64
13 Location C    level 4         19            64
14 Location C    level 5         13            64
15 Location C    level 6         18            64
16 Location C    level 7          0            64
Answer 0: (score: 1)
You just need:
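library(dplyr)

example %>%
  group_by(location) %>%                       # one group per location
  mutate(total_answers = sum(no_answers)) %>%  # group sum, repeated on every row of the group
  ungroup()                                    # drop the grouping so later steps are not grouped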
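If you would rather not load dplyr, the same per-group total can probably be obtained in base R with ave(). The following is only a minimal sketch, not part of the original answer. It assumes the fixed no_answers values from the first printed table in the question (instead of calling runif() again) so that the result can be checked against the expected totals 32, 52 and 64.

```r
# Mock data copied from the question's first printed table
# (assumption: fixed values are used here instead of runif() for reproducibility)
example <- data.frame(
  location   = c(rep("Location A", 5), rep("Location B", 4), rep("Location C", 7)),
  factor_lvl = factor(paste("level", c(1:5, 1:4, 1:7))),
  no_answers = c(1, 13, 4, 8, 6, 13, 17, 15, 7, 5, 8, 1, 19, 13, 18, 0)
)

# ave() applies FUN within each group defined by location and returns a vector
# as long as the input, so every row receives the total of its own location
example$total_answers <- ave(example$no_answers, example$location, FUN = sum)

print(example)
```

Note that FUN has to be passed by name, because any unnamed arguments after the first are treated by ave() as additional grouping variables.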