我有工作代码,该代码根据参数排除列,并根据其他参数对某些列进行突变。这里有一个问题Can dplyr package be used for conditional mutating?,但没有解决条件选择
有没有if语句的纯dplyr代码的方法吗?
有效的R代码:
# Loading
diamonds_tbl <- diamonds
head(diamonds_tbl)
# parameters
initialColumnDrop <- c('x','y','z')
forceCategoricalColumns <- c('carat','cut', 'color')
forceNumericalColumns <- c('')
# Main Code
if(length(which(colnames(diamonds_tbl) %in% initialColumnDrop))>=1){
diamonds_tbl_clean <- diamonds_tbl %>%
select(-one_of(initialColumnDrop)) #Drop specific columns in columnDrop
}
if(length(which(colnames(diamonds_tbl_clean) %in% forceCategoricalColumns))>=1){
diamonds_tbl_clean <- diamonds_tbl_clean %>%
mutate_at(forceCategoricalColumns,funs(as.character)) #Force columns to be categorical
}
if(length(which(colnames(diamonds_tbl_clean) %in% forceNumericalColumns))>=1){
diamonds_tbl_clean <- diamonds_tbl_clean %>%
mutate_at(forceNumericalColumns,funs(as.numeric)) #Force columns to be numeric
}
答案 0 :(得分:2)
我不太了解“纯dplyr”解决方案的需求,但是可以使用辅助函数使任何问题变得更容易。例如,您可以编写一个仅在找到某些列的情况下才能运行转换的函数
run_if_cols_match <- function(data, cols, expr) {
if (any(names(data) %in% cols)) {
expr(data)
} else {
data
}
}
然后您可以在管道中使用它
diamonds_tbl_clean <- diamonds_tbl %>%
run_if_cols_match(initialColumnDrop,
. %>% select(-one_of(initialColumnDrop))) %>%
run_if_cols_match(forceCategoricalColumns,
. %>% mutate_at(forceCategoricalColumns,funs(as.character))) %>%
run_if_cols_match(forceNumericalColumns,
. %>% mutate_at(forceNumericalColumns,funs(as.numeric)))
将与您的代码执行相同的操作。这里只是有条件地运行不同的匿名管道。