How do I get a cumulative-sum (cumsum) table grouped by Gender and State?
Gender = sample(c('male', 'female'), 100, replace=TRUE)
State = sample(c('CA', 'WA', 'NV', 'OR', "AZ"), 100, replace=TRUE)
Number = sample(1:8, size=100, replace=TRUE)
df <- data.frame(Gender,State, Number)
Answer 0 (score: 1)
For a simpler approach, I suggest dplyr. It is attached along with several other useful packages when you load tidyverse.
library(tidyverse)
Gender = sample(c('male', 'female'), 100, replace=TRUE)
State = sample(c('CA', 'WA', 'NV', 'OR', "AZ"), 100, replace=TRUE)
Number = sample(1:8, size=100, replace=TRUE)
df <- data.frame(Gender,State, Number)
df <- df %>%
  group_by(Gender, State) %>%
  mutate(Number_CumSum = cumsum(Number)) %>%
  ungroup() %>%
  arrange(State, Gender)
head(df)
# A tibble: 6 x 4
  Gender State  Number Number_CumSum
  <fctr> <fctr>  <int>         <int>
1 female AZ          8             8
2 female AZ          3            11
3 female AZ          4            15
4 female AZ          5            20
5 female AZ          2            22
6 female AZ          7            29
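For comparison, the same grouped running total can be sketched in base R with ave(), without loading any packages. This is not part of the answer above; the seed and the name df2 are assumptions made only so the sketch is self-contained and reproducible.

```r
set.seed(1)  # assumed seed, only to make this sketch reproducible
Gender <- sample(c('male', 'female'), 100, replace = TRUE)
State <- sample(c('CA', 'WA', 'NV', 'OR', 'AZ'), 100, replace = TRUE)
Number <- sample(1:8, size = 100, replace = TRUE)
df2 <- data.frame(Gender, State, Number)  # df2 is a hypothetical name

# ave() applies FUN within each Gender/State combination and returns
# the results in the original row order
df2$Number_CumSum <- ave(df2$Number, df2$Gender, df2$State, FUN = cumsum)

# Sort so that the rows of each group sit next to each other
df2 <- df2[order(df2$State, df2$Gender), ]
head(df2)
```

Because order() leaves ties in their original order, the running total within each group still increases from top to bottom after sorting.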
Answer 1 (score: 1)
If we are looking for a cumsum table, then
library(data.table)
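# 1. Count rows per Gender/State/Number (.N)
# 2. Convert each count to a percentage of its Gender/State group (perc)
# 3. dcast to wide form: one row per Gender/State, one column per Number
# 4. Accumulate the Number columns left to right and append "%";
#    (3:10) targets the eight Number columns (values 1 to 8)
dcast(setDT(df)[, .N, .(Gender, State, Number)
                ][, perc := round(100*N/sum(N), 2), .(Gender, State)],
      Gender + State ~ Number, value.var = 'perc', fill = 0, drop = FALSE)[,
      (3:10) := lapply(Reduce(`+`, .SD, accumulate = TRUE),
                       function(x) paste0(x, "%")), .SDcols = -(1:2)][]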
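To see what the Reduce(`+`, .SD, accumulate = TRUE) step does on its own, here is a minimal sketch on a plain list. The list cols and its values are made up for illustration; they stand in for the percentage columns.

```r
# Three toy columns standing in for the percentage columns (hypothetical values)
cols <- list(a = c(10, 50), b = c(30, 25), c = c(60, 25))

# accumulate = TRUE keeps every intermediate result, so element k is the
# element-wise sum of columns 1..k
running <- Reduce(`+`, cols, accumulate = TRUE)

running[[1]]  # 10 50
running[[2]]  # 40 75
running[[3]]  # 100 100
```

Each row of percentages adds up to 100 by the last column, which is why the final column of the cumsum table is expected to read roughly 100% for every group (up to rounding).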