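I have a list of columns in my data that I want to convert from character to ordered factors. My current solution is this horribly ugly construct, which works but does make the eyes burn a little: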
library(dplyr)
library(magrittr)  # provides the %<>% assignment pipe

load_from_file <- function(filename) {
  d <- read.csv(filename)
  d <- d[, 2:37]
  d %<>% na_if("")
  for (column in alwaystonever_questions) {
    # Builds and evaluates one assignment per column name (the part I want to get rid of)
    eval(parse(text = paste('d$', column, ' <- factor(d$', column, ', ordered=TRUE,levels=c("Never","Rarely","Sometimes","Often","Always"))', sep = "")))
  }
  d$HowAreYouFeeling <- factor(d$HowAreYouFeeling, ordered = TRUE, levels = c("Bad", "NotSoGood", "Ok", "Good", "Great"))
  d %<>% mutate_if(is.character, as.factor)
  return(d)
}
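I would like to do this with a simple series of `%>%` pipes instead, which would hopefully improve readability. How can I do this in a more idiomatic way?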
Answer 0 (score: 4)
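Here is a tidyverse solution. It assumes alwaystonever_questions is a character vector of column names. I left out the as.factor part because factor should be enough (but if it doesn't work, try adding it back; I'm never quite sure with factors):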
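library(dplyr)

load_from_file <- function(filename) {
  read.csv(filename) %>%
    select(2:37) %>%
    # Blank strings become NA; restricted to character columns because
    # na_if() in dplyr >= 1.1.0 requires "" to be castable to the column type
    mutate(across(where(is.character), ~ na_if(.x, ""))) %>%
    mutate(
      # all_of() matches the listed names exactly; contains() would also match substrings
      across(all_of(alwaystonever_questions),
             ~ factor(.x, ordered = TRUE,
                      levels = c("Never", "Rarely", "Sometimes", "Often", "Always"))),
      HowAreYouFeeling = factor(HowAreYouFeeling,
                                ordered = TRUE,
                                levels = c("Bad", "NotSoGood", "Ok", "Good", "Great"))
    )
}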
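To see what the pipeline does without touching a file, here is a minimal sketch on a toy data frame. The data frame `survey`, the column names `Q1`, `Q2`, `Comment`, and the helper vector `freq_levels` are made up for illustration. It also adds back the question's final step, turning any remaining character columns into plain factors, in case that behaviour is still wanted.

```r
library(dplyr)

# Hypothetical stand-ins for the real question list and level order
alwaystonever_questions <- c("Q1", "Q2")
freq_levels <- c("Never", "Rarely", "Sometimes", "Often", "Always")

# Toy data: empty strings play the role of missing answers
survey <- data.frame(
  Q1 = c("Never", "Often", ""),
  Q2 = c("Always", "", "Rarely"),
  Comment = c("a", "", "b"),
  stringsAsFactors = FALSE
)

result <- survey %>%
  # Empty strings become NA in every character column
  mutate(across(where(is.character), ~ na_if(.x, ""))) %>%
  # The listed question columns become ordered factors
  mutate(across(all_of(alwaystonever_questions),
                ~ factor(.x, levels = freq_levels, ordered = TRUE))) %>%
  # Remaining character columns become plain (unordered) factors,
  # mirroring mutate_if(is.character, as.factor) from the question
  mutate(across(where(is.character), as.factor))

str(result)
```

Because `Q1` and `Q2` are ordered, comparisons such as `result$Q1 >= "Often"` follow the level order rather than alphabetical order.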
If you have to read a lot of files, you can do something like this:
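library(purrr)

# full.names = TRUE returns paths that read.csv() can open from any working directory
filenames <- list.files("path_to_directory", full.names = TRUE)

# Name each element after its file so the resulting list is easy to index
list_dfs <- set_names(filenames) %>%
  map(load_from_file)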
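If a single data frame is more convenient than a list, the pieces can be stacked with `dplyr::bind_rows()`. This is only a sketch: the temporary directory, the two tiny CSV files, the column `Q1`, and the simplified loader `load_demo()` are all made up so that the example runs on its own; in practice `load_from_file()` would take its place.

```r
library(dplyr)
library(purrr)

# Hypothetical demo setup: two tiny CSV files in a temporary directory
demo_dir <- file.path(tempdir(), "survey_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(Q1 = c("Never", "Often")),
          file.path(demo_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(Q1 = "Always"),
          file.path(demo_dir, "b.csv"), row.names = FALSE)

# Simplified stand-in for load_from_file()
load_demo <- function(filename) {
  read.csv(filename, stringsAsFactors = FALSE) %>%
    mutate(Q1 = factor(Q1, ordered = TRUE,
                       levels = c("Never", "Rarely", "Sometimes", "Often", "Always")))
}

filenames <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

combined <- set_names(filenames, basename(filenames)) %>%
  map(load_demo) %>%
  # .id stores each list name (here the file name) in a new column
  bind_rows(.id = "source_file")

print(combined)
```

Since every file is converted with the same levels, the ordered factor survives the row binding. Files whose ordered factors had different levels would not combine cleanly.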