Summing the values of columns selected with grep() in R

Time: 2019-06-19 14:53:52

Tags: r grep tidyverse

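I use grep() to select some specific variables for a further calculation, but the calculation fails. The new variables I create contain the variable names joined by "+" instead of the sum of their values.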

# create a df
test <- data.frame(I60_freq_t = 1,
                   I60_freq_man = 1,
                   I60_freq_woman = 1,
                   I60_freq_lo65 = 1,
                   I60_freq_hi65 = 1,
                   I61_freq_t = 1,
                   I61_freq_man = 1,
                   I61_freq_woman = 1,
                   I61_freq_lo65 = 1,
                   I61_freq_hi65 = 1,
                   I62_freq_t = 1,
                   I62_freq_man = 1,
                   I62_freq_woman = 1,
                   I62_freq_lo65 = 1,
                   I62_freq_hi65 = 1
                   )

# select variable names by their suffix and join them with "+"
end_with_t <- grep('t$', names(test), value = T) %>% paste(collapse = '+')
end_with_man <- grep('[^a-z]man$', names(test), value = T) %>% paste(collapse = '+')
end_with_woman <- grep('woman$', names(test), value = T) %>% paste(collapse = '+')
end_with_lo65 <- grep('lo65$', names(test), value = T) %>% paste(collapse = '+')
end_with_hi65 <- grep('hi65$', names(test), value = T) %>% paste(collapse = '+')

# sum the value 
test2 <- test %>% mutate(t = end_with_t,
                         man = end_with_man,
                         woman = end_with_woman,
                         lo65 = end_with_lo65,
                         hi65 = end_with_hi65) 
# **** What I want is the sum of the values, not the variable names joined together ****

My questions are:

1. How can I modify my code to get what I want?

2. Is there a better way?

Any help would be appreciated!

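1 Answer:

Answer 0: (score: 2)

Here is an idea that uses map_dfc to iterate over the variable-name suffixes and rowSums to add the matching columns together. ends_with selects variables based on the end of their names.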

library(tidyverse)

variables <- c("_t", "_man", "_woman", "_lo65", "_hi65")

test2 <- map_dfc(variables, ~test %>% 
          select(ends_with(.x)) %>%
          rowSums()) %>%
  setNames(str_remove(variables, fixed("_")))

test2
# A tibble: 1 x 5
      t   man woman  lo65  hi65
  <dbl> <dbl> <dbl> <dbl> <dbl>
1     3     3     3     3     3
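
On the first question (how to keep the original grep() approach): in the question's code, mutate() receives a character string such as "I60_freq_t+I61_freq_t", so it stores that string rather than evaluating it. One possible fix, sketched below, is to parse the string into an expression with rlang::parse_expr() and inject it into mutate() with !!. The reduced data frame and its values are assumptions chosen for illustration, not the asker's data.

```r
library(dplyr)
library(rlang)

# Illustrative stand-in for the question's data frame (assumed values)
test <- data.frame(I60_freq_t = 1, I60_freq_man = 2,
                   I61_freq_t = 3, I61_freq_man = 4)

# Build the "a+b" strings the same way the question does
end_with_t   <- paste(grep("_t$",   names(test), value = TRUE), collapse = "+")
end_with_man <- paste(grep("_man$", names(test), value = TRUE), collapse = "+")

# parse_expr() turns each string into an R expression;
# !! injects that expression into mutate(), so it is evaluated on the columns
test2 <- test %>%
  mutate(t   = !!parse_expr(end_with_t),
         man = !!parse_expr(end_with_man))

test2
```

With these values, t is 1 + 3 = 4 and man is 2 + 4 = 6. Building code from strings works, but it is fragile when column names contain unusual characters, which is one reason the ends_with() and rowSums() approach above may be preferable.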