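I am working with output from the caret package that lists the important variables. When a predictor is a factor, the output matrix names its row columnnameValue, that is, the column name followed directly by the factor level.
I want to separate out the column-name part so that I can do some analysis on it.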
df <- data.frame(col1 = c('life_stageAdult','books', 'bags', 'educationMasters'), col2 = c(100, 90, 80, 70))
original_column_names <- c('life_stage','books', 'bags', 'education', 'gender')
I would like my output to be:
factor_cols = c('life_stage','education')
Answer 0 (score: 0)
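library(dplyr)
library(stringr)

# Example data; stringsAsFactors = TRUE so the character columns become factors
dataset <- data.frame(life_stage = rep(c("Adult", "Child"), times = 5),
                      books = 1:10,
                      bags = rep(c(1, 0), times = 5),
                      education = rep(c("Bachelors", "Masters"), times = 5),
                      stringsAsFactors = TRUE)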
# List of variables you entered in the model
model_vars <- c("life_stage","books","bags","education")
# Count the factor levels of each model variable (0 for non-factors)
levels_dataset <- dataset %>%
  select(all_of(model_vars)) %>%
  summarise(across(everything(), nlevels)) %>%
  unlist()
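# A factor with k levels yields k - 1 dummy columns; a non-factor yields one column
levels_dataset <- ifelse(levels_dataset == 0, 1, levels_dataset - 1)
# Repeat each variable name once per model column it produces
names_dataset <- rep(model_vars, levels_dataset)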
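# Your model output 'df'; its rows are assumed to be in the same order as model_vars
df <- data.frame(col1 = c('life_stageAdult', 'books', 'bags', 'educationMasters'),
                 col2 = c(100, 90, 80, 70))
# Attach the originating variable to each row, then strip it to leave only the level
df <- data.frame(df, names_dataset) %>%
  mutate(level = str_replace(col1,
                             pattern = as.character(names_dataset), ""))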
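
To get from here to the vector the question asks for, one possible sketch in base R is shown below. It skips the level bookkeeping and instead matches each importance row name against the known column names as a prefix; a row is treated as coming from a factor when its name is longer than the matched column name. The helper name `find_factor_cols` is made up for this illustration and is not part of caret. The sketch assumes that each row name starts with exactly one original column name, apart from ties resolved by preferring the longest match.

```r
# Hypothetical helper (not part of caret): recover which original columns
# appear in the importance table with a factor level appended.
find_factor_cols <- function(importance_names, original_names) {
  # Try longer names first so that a longer column name wins over a shorter prefix
  candidates <- original_names[order(nchar(original_names), decreasing = TRUE)]
  matched <- vapply(importance_names, function(nm) {
    hit <- candidates[startsWith(nm, candidates)]
    if (length(hit) == 0) NA_character_ else hit[1]
  }, character(1), USE.NAMES = FALSE)
  # A factor row has extra characters (the level) after the column name
  is_factor_row <- !is.na(matched) & nchar(importance_names) > nchar(matched)
  unique(matched[is_factor_row])
}

# Data from the question; col1 is kept as character so startsWith() accepts it
df <- data.frame(col1 = c('life_stageAdult', 'books', 'bags', 'educationMasters'),
                 col2 = c(100, 90, 80, 70),
                 stringsAsFactors = FALSE)
original_column_names <- c('life_stage', 'books', 'bags', 'education', 'gender')

factor_cols <- find_factor_cols(df$col1, original_column_names)
print(factor_cols)
```

Unlike the approach above, this does not depend on the rows of `df` being in the same order as the model variables, but it can misclassify a numeric column whose name happens to extend another column's name when the longer name is missing from `original_column_names`.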