Mutate dynamically named columns based on other column values

Time: 2020-04-15 10:54:30

Tags: r dataframe dplyr mutate

This is how I build my data frame:

df_test <- data.frame(ID = c("1100455", "1100455", "1100464", "1100464"),
                      CATEGORIE = c("10110", "90310", "10110","10140"),
                      RANK =  c( "1", "0" ,"1", "1"),
                      MD =  c("0", "0", "0", "3" ),
                      PROD3 = c(2345.00,1114.58,501.40,125.15))

newvars <- paste("aa",df_test[["CATEGORIE"]],df_test[["RANK"]],df_test[["MD"]],"ta", sep="_")
newvars <- unique(newvars)
newvars
df_test[newvars] <- NA

Actual data frame:

       ID CATEGORIE RANK MD   PROD3 aa_10110_1_0_ta aa_90310_0_0_ta aa_10140_1_3_ta
1 1100455     10110    1  0 2345.00              NA              NA              NA
2 1100455     90310    0  0 1114.58              NA              NA              NA
3 1100464     10110    1  0  501.40              NA              NA              NA
4 1100464     10140    1  3  125.15              NA              NA              NA

I tried to fill these columns based on the values of the other columns:

df_test <- df_test %>% mutate(get(paste0("aa_", CATEGORIE,"_",RANK,"_",MD,"_ta" ) = PROD3 )))

I can't get this to work. Is it possible with mutate alone, or do I have to use a separate function?

Expected result:

       ID CATEGORIE RANK MD   PROD3 aa_10110_1_0_ta aa_90310_0_0_ta aa_10140_1_3_ta
1 1100455     10110    1  0 2345.00          2345.0              NA              NA
2 1100455     90310    0  0 1114.58              NA         1114.58              NA
3 1100464     10110    1  0  501.40           501.4              NA              NA
4 1100464     10140    1  3  125.15              NA              NA          125.15

1 answer:

Answer 0: (score: 1)

Create the new column names with str_c (or paste), then call pivot_wider:

library(tidyverse)

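df_test %>%
  select(-starts_with("aa")) %>%  # reset your data: drop the empty aa_* placeholder columns
  mutate(name = str_c("aa", CATEGORIE, RANK, MD, "ta", sep = "_"),  # target column name for each row
         value = PROD3) %>%                                         # value to place in that column
  pivot_wider(names_from = name, values_from = value)  # one column per distinct name, NA elsewhere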

#        ID CATEGORIE RANK MD   PROD3 aa_10110_1_0_ta aa_90310_0_0_ta aa_10140_1_3_ta
# 1 1100455     10110    1  0 2345.00          2345.0              NA              NA
# 2 1100455     90310    0  0 1114.58              NA         1114.58              NA
# 3 1100464     10110    1  0  501.40           501.4              NA              NA
# 4 1100464     10140    1  3  125.15              NA              NA          125.15
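
On the "mutate alone" part of the question, the following is a minimal sketch rather than part of the original answer. It assumes dplyr 1.0.0 or later, where across() and cur_column() are available. The helper column key and the object name result are hypothetical names introduced only for this illustration. The idea is to compare each row's computed name with the name of the column currently being filled, and to keep PROD3 only where they match.

```r
library(dplyr)

# Rebuild the example data from the question
df_test <- data.frame(
  ID        = c("1100455", "1100455", "1100464", "1100464"),
  CATEGORIE = c("10110", "90310", "10110", "10140"),
  RANK      = c("1", "0", "1", "1"),
  MD        = c("0", "0", "0", "3"),
  PROD3     = c(2345.00, 1114.58, 501.40, 125.15)
)

# Create the empty target columns, as in the question (numeric NA so types match PROD3)
newvars <- unique(paste("aa", df_test$CATEGORIE, df_test$RANK, df_test$MD, "ta", sep = "_"))
df_test[newvars] <- NA_real_

result <- df_test %>%
  mutate(
    # `key` is a hypothetical helper column: the target column name for each row
    key = paste("aa", CATEGORIE, RANK, MD, "ta", sep = "_"),
    # For every aa_* column, keep PROD3 where the row's key equals that column's name
    across(all_of(newvars), ~ if_else(key == cur_column(), PROD3, NA_real_))
  ) %>%
  select(-key)  # drop the helper column again

print(result)
```

This keeps the pre-created columns and their order, at the cost of one comparison per target column; with many distinct names the pivot_wider approach is likely the simpler choice.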

spread() version

If you use spread(), you can omit the line that resets the data (the select step), because spread() overwrites duplicate column names.

df_test %>%
  mutate(name = str_c("aa", CATEGORIE, RANK, MD, "ta", sep = "_"),
         value = PROD3) %>%
  spread(name, value)
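
If a solution without the tidyverse is preferred, the same result can be sketched in base R with a short loop. This is an illustrative alternative, not something taken from the answer above; the names df_loop and key are hypothetical. Each distinct name becomes a column that holds PROD3 in the matching rows and NA elsewhere.

```r
# Rebuild the example data from the question (df_loop is a hypothetical name)
df_loop <- data.frame(
  ID        = c("1100455", "1100455", "1100464", "1100464"),
  CATEGORIE = c("10110", "90310", "10110", "10140"),
  RANK      = c("1", "0", "1", "1"),
  MD        = c("0", "0", "0", "3"),
  PROD3     = c(2345.00, 1114.58, 501.40, 125.15),
  stringsAsFactors = FALSE
)

# Target column name for each row
key <- paste("aa", df_loop$CATEGORIE, df_loop$RANK, df_loop$MD, "ta", sep = "_")

# One pass per distinct name: PROD3 where the row's key matches, NA otherwise
for (col in unique(key)) {
  df_loop[[col]] <- ifelse(key == col, df_loop$PROD3, NA_real_)
}

print(df_loop)
```

Because the columns are created inside the loop, there is no need to pre-fill them with NA first.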