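I want to generate columns dynamically based on the width used in the cut statement.
How can I generate AGE1 through AGEn dynamically, as in the example below?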
Answer 0 (score: 1)
Create a function. This version uses a for loop:
library(dplyr)
library(tidyr)

cut_function <- function(df, num_cuts) {
  num_by <- num_cuts
  df_out <- df %>%
    mutate(AGEGROUP = cut(AGE, breaks = seq(10, 20, by = num_by), right = FALSE)) %>%
    group_by(AGEGROUP) %>%
    summarise(SUM.NUM = sum(NUM)) %>%
    # lower bound of each interval, e.g. "[10,12)" -> 10 (assumes two-digit bounds)
    mutate(AGELOW = as.numeric(substr(as.character(AGEGROUP), 2, 3)))
  # generate AGE1 to AGE(num_by - 1)
  for (i in seq_len(num_by - 1)) {
    # this is the core of the function:
    # it assigns a new column named after the index i,
    # and the range of i depends on num_by
    df_out[[paste0('AGE', i)]] <- df_out$AGELOW + i
  }
  # reshape to long format: every column except SUM.NUM becomes an AGE/value pair
  df_out %>%
    select(-AGEGROUP) %>%
    gather(AGE, value, -SUM.NUM)
}
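The `substr(..., 2, 3)` call in the function only works while every lower bound has exactly two digits, which holds for the 10 to 20 range used here. If the range could vary, one option is to parse the interval label with a regular expression instead. The following is a minimal base-R sketch; `lower_bound` is a hypothetical helper name that does not appear in the original answer, and the example breaks are made up for illustration.

```r
# Hypothetical helper (not part of the original answer):
# extract the lower bound from right-open labels such as "[10,12)".
lower_bound <- function(x) {
  # capture everything between the opening "[" and the first comma
  as.numeric(sub("^\\[([^,]+),.*$", "\\1", as.character(x)))
}

# Made-up example with one-, two- and three-digit lower bounds
g <- cut(c(5, 50, 150), breaks = c(0, 10, 100, 1000), right = FALSE)
bounds <- lower_bound(g)
print(bounds)
```

Inside `cut_function`, this would replace the `substr` expression, for example `mutate(AGELOW = lower_bound(AGEGROUP))`.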
Test:
cut_function(df,2)
# A tibble: 10 x 3
SUM.NUM AGE value
<dbl> <chr> <dbl>
1 0.311 AGELOW 10
2 -3.43 AGELOW 12
3 -0.237 AGELOW 14
4 1.82 AGELOW 16
5 0.332 AGELOW 18
6 0.311 AGE1 11
7 -3.43 AGE1 13
8 -0.237 AGE1 15
9 1.82 AGE1 17
10 0.332 AGE1 19
cut_function(df,3)
# A tibble: 12 x 3
SUM.NUM AGE value
<dbl> <chr> <dbl>
1 -2.56 AGELOW 10
2 -0.799 AGELOW 13
3 1.58 AGELOW 16
4 0.569 AGELOW NA
5 -2.56 AGE1 11
6 -0.799 AGE1 14
7 1.58 AGE1 17
8 0.569 AGE1 NA
9 -2.56 AGE2 12
10 -0.799 AGE2 15
11 1.58 AGE2 18
12 0.569 AGE2 NA
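The NA rows in the `cut_function(df, 3)` output come from the breaks not covering the whole age range: `seq(10, 20, by = 3)` stops at 19, and with `right = FALSE` the last interval is `[16,19)`, so age 19 falls outside every interval. A small base-R check, assuming the ages run from 10 to 19 as in the data shown in this answer:

```r
# Breaks generated for a width of 3 over the 10-20 range
breaks <- seq(10, 20, by = 3)
print(breaks)

# Right-open intervals built from those breaks
groups <- cut(10:19, breaks = breaks, right = FALSE)
print(levels(groups))

# Age 19 is the 10th value and does not fall into any interval
print(is.na(groups[10]))
```

Choosing a width that divides the range evenly, or extending the upper break past the maximum age, avoids the NA group.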
However, looking at the desired output in your data frame, I think there is an easier way to get what you want. Just replace summarise with mutate in the call:
num_by <- 2  # interval width; 2 matches the output shown below
df %>%
  mutate(AGEGROUP = cut(AGE, breaks = seq(10, 20, by = num_by), right = FALSE)) %>%
  group_by(AGEGROUP) %>%
  mutate(SUM.NUM = sum(NUM))
# gives essentially the same output as your df_out2
# A tibble: 10 x 4
# Groups: AGEGROUP [5]
AGE NUM AGEGROUP SUM.NUM
<int> <dbl> <fct> <dbl>
1 10 0.463 [10,12) 0.311
2 11 -0.151 [10,12) 0.311
3 12 -2.87 [12,14) -3.43
4 13 -0.562 [12,14) -3.43
5 14 -0.276 [14,16) -0.237
6 15 0.0392 [14,16) -0.237
7 16 1.99 [16,18) 1.82
8 17 -0.168 [16,18) 1.82
9 18 -0.236 [18,20) 0.332
10 19 0.569 [18,20) 0.332
You can wrap the code above in a function, with no need for a for loop.
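A minimal sketch of such a function, assuming dplyr is installed. The name `cut_function2` and the `lower` and `upper` parameters are hypothetical additions for illustration, and the example data is randomly generated rather than taken from the question.

```r
library(dplyr)

# Hypothetical wrapper around the mutate-based pipeline (no for loop).
# lower/upper are assumed parameters; they default to the 10-20 range used above.
cut_function2 <- function(df, num_by, lower = 10, upper = 20) {
  df %>%
    mutate(AGEGROUP = cut(AGE, breaks = seq(lower, upper, by = num_by), right = FALSE)) %>%
    group_by(AGEGROUP) %>%
    mutate(SUM.NUM = sum(NUM)) %>%
    ungroup()  # drop the grouping so later steps work on plain rows
}

# Made-up data with the same columns as in the question
set.seed(1)
df <- data.frame(AGE = 10:19, NUM = rnorm(10))
res <- cut_function2(df, 2)
print(res)
```

The trailing `ungroup()` is a design choice that differs from the pipeline above, which returns a grouped tibble; remove it if the grouping is needed downstream.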