How to split a string into two parts (the column name and its value)

Asked: 2019-04-22 22:01:50

Tags: r strip

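I am working with the output of the caret package, which lists the important variables. When a factor variable is present, the output matrix reports it as columnnameValue, that is, the column name with the factor level appended directly after it.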

I want to separate out the column-name part so that I can run some analysis on it.

df <- data.frame(col1 = c('life_stageAdult','books', 'bags', 'educationMasters'), col2 = c(100, 90, 80, 70))
original_column_names <- c('life_stage','books', 'bags', 'education', 'gender')

I would like my output to be:

factor_cols = c('life_stage','education')

1 Answer:

Answer 0: (score: 0)

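library(dplyr)    # %>%, select, mutate
library(stringr)  # str_replace

# Example data; stringsAsFactors = TRUE so that levels() works on R >= 4.0
dataset <- data.frame(life_stage = rep(c("Adult","Child"), times = 5),
                  books = c(1:10),
                  bags = rep(c(1,0), times = 5),
                  education = rep(c("Bachelors","Masters"), times = 5),
                  stringsAsFactors = TRUE)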
# List of variables you entered in the model
model_vars <- c("life_stage","books","bags","education")

# Number of factor levels per model variable (0 for non-factor columns)
levels_dataset <- dataset %>%
 select(all_of(model_vars)) %>% 
 summarise(across(everything(), nlevels)) %>% 
 unlist()

levels_dataset <- ifelse(levels_dataset==0,1,levels_dataset-1)

# Repeat each variable name once per dummy column it contributes
names_dataset <- rep(model_vars,levels_dataset)

#Your model output 'df' with columns

df <- data.frame(col1 = c('life_stageAdult','books', 'bags', 'educationMasters'), 
             col2 = c(100, 90, 80, 70))

df <- data.frame(df,names_dataset) %>% 
  mutate(level = str_replace(col1,
                         pattern = as.character(names_dataset),""))
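
If only the `factor_cols` vector from the question is needed, a shorter route may be to match each term against the known column names by prefix. The following is a minimal base-R sketch, not a caret feature: `match_column` is a hypothetical helper introduced here, and it assumes that every term in `col1` starts with exactly one of `original_column_names` (the longest matching name wins when several match).

```r
# Model output from the question: factor terms appear as <column><level>
df <- data.frame(col1 = c("life_stageAdult", "books", "bags", "educationMasters"),
                 col2 = c(100, 90, 80, 70),
                 stringsAsFactors = FALSE)
original_column_names <- c("life_stage", "books", "bags", "education", "gender")

# Hypothetical helper: return the original column name that `term` starts with,
# preferring the longest match; NA if no column name is a prefix of the term.
match_column <- function(term, columns) {
  hits <- columns[startsWith(term, columns)]
  if (length(hits) == 0) return(NA_character_)
  hits[which.max(nchar(hits))]
}

df$column <- vapply(df$col1, match_column, character(1),
                    columns = original_column_names, USE.NAMES = FALSE)

# Whatever follows the column name is the factor level ("" for numeric columns)
df$level <- substring(df$col1, nchar(df$column) + 1)

# Columns whose term carries a level suffix are treated as factor columns
factor_cols <- unique(df$column[!is.na(df$column) & df$level != ""])
factor_cols
```

Unlike the approach above, this does not need the original `dataset` or the row order of the model output, but it can misassign a term when one column name is a prefix of another column name followed by a level.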