Question

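I'm writing a Shiny app in which users enter the conditions for their samples, and the script then "automatically" matches those conditions to the sample names in a given file.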

For simplicity I'll leave out the Shiny code, since it's only the underlying R implementation I'm struggling with.

If I already know the possible conditions, I can do the following:

library(tidyverse)
x <- data.frame(Samples = c('Low1', 'Low2', 'High1', 'High2', 
                           'Ctrl1', 'Ctrl2'))

x <- x %>% mutate(Conditions = case_when(
           str_detect(Samples, fixed("low", ignore_case = T)) ~ "low",
           str_detect(Samples, fixed("high", ignore_case = T)) ~ "high",
           str_detect(Samples, fixed("ctrl", ignore_case = T)) ~ "ctrl"))

and I get what I want, a data frame like this:

Samples    Conditions
   Low1           low
   Low2           low
  High1          high
  High2          high
  Ctrl1          ctrl
  Ctrl2          ctrl

However, I'd like to iterate over a vector of possible conditions and do something like this:

library(tidyverse)
condition_options <- c('low', 'high', 'ctrl')

x <- data.frame(Samples = samplenames)
for (j in condition_options) {
   x <- x %>% mutate(Condition = case_when(
        str_detect(Samples, fixed(j, ignore_case = T)) ~ j)) 
    }

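When I do this, the Condition column is overwritten on every iteration, so I'm left with matches for only the last value in the vector. For example: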

Samples    Conditions
   Low1         <NA>
   Low2         <NA>
  High1         <NA>
  High2         <NA>
  Ctrl1         ctrl
  Ctrl2         ctrl

Answer 1

It may be easier to use metaprogramming, rather than a loop, to build all the pieces of the case_when statement. Try

library(tidyverse)
condition_options <- c('low', 'high', 'ctrl')

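# Build one quoted `test ~ value` formula per condition; `!!.x` injects the
# current condition string into the expression.
conditions <- purrr::map(condition_options, 
                         ~quo(str_detect(Samples, fixed(!!.x, ignore_case = TRUE)) ~ !!.x))

# Same sample names as in the question.
x <- data.frame(Samples = c('Low1', 'Low2', 'High1', 'High2', 
                            'Ctrl1', 'Ctrl2'))
x %>% mutate(Condition = case_when(!!!conditions))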

#   Samples Condition
# 1    Low1       low
# 2    Low2       low
# 3   High1      high
# 4   High2      high
# 5   Ctrl1      ctrl
# 6   Ctrl2      ctrl

Here map builds all the different formulas you want inside the case_when statement. We then use !!! to splice them into the mutate expression.
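
If you would rather keep the for loop from the question, one possible fix is to give case_when a fallback that returns the column's current value, so that each pass only fills in new matches instead of resetting everything else to NA. This is a minimal sketch, assuming the sample names from the question and an NA-initialised Condition column:

```r
library(dplyr)
library(stringr)

condition_options <- c("low", "high", "ctrl")
x <- data.frame(Samples = c("Low1", "Low2", "High1", "High2", "Ctrl1", "Ctrl2"))

# Start with an all-NA column so every pass has a value to fall back on.
x$Condition <- NA_character_

for (j in condition_options) {
  x <- x %>% mutate(Condition = case_when(
    str_detect(Samples, fixed(j, ignore_case = TRUE)) ~ j,
    TRUE ~ Condition  # no match on this pass: keep the earlier result
  ))
}

x
```

Note that with this loop a later condition overwrites an earlier one if a sample matches both, whereas a single case_when call keeps the first match.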

Answer 2

library(purrr)
library(stringr)  # provides str_detect() and fixed()
x <- data.frame(Samples = c('Low1', 'Low2', 'High1', 'High2', 
                            'Ctrl1', 'Ctrl2'))
condition_options <- c('low', 'high', 'ctrl')

# iterate through all provided `condition_options `, returns corresponding condition if a match is found, otherwise returns NA
matched_values <- map(condition_options,function(condition_name){
    ifelse(
        str_detect(x$Samples,fixed(condition_name,ignore_case = TRUE)),
        condition_name,
        NA_character_
    )
})

# If every value is NA, return NA; otherwise return the matched value.
# This throws an error if more than one condition matches a sample.
x["Conditions"] <- pmap_chr(matched_values, function(...){
    values <- unlist(list(...))
    if(all(is.na(values))){
        return(NA_character_)
    } else {
        return(values[!is.na(values)])
    }
})

> x
  Samples Conditions
1    Low1        low
2    Low2        low
3   High1       high
4   High2       high
5   Ctrl1       ctrl
6   Ctrl2       ctrl
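
As the comment above says, the pmap_chr step errors when a sample matches more than one condition. If taking the first match is acceptable instead, a possible variant is to coalesce the per-condition vectors. This is only a sketch, assuming dplyr is available; the sample names (including 'LowHigh3' and 'Blank1') are made up for illustration:

```r
library(dplyr)
library(stringr)

condition_options <- c("low", "high", "ctrl")
# Hypothetical samples: one matches two conditions, one matches none.
samples <- c("Low1", "High2", "LowHigh3", "Blank1")

# One character vector per condition: the condition name where it matches, NA elsewhere.
matched_values <- lapply(condition_options, function(condition_name) {
  ifelse(
    str_detect(samples, fixed(condition_name, ignore_case = TRUE)),
    condition_name,
    NA_character_
  )
})

# coalesce() takes the first non-NA value at each position, so the order of
# condition_options decides which condition wins for ambiguous samples.
conditions <- do.call(coalesce, matched_values)
conditions
```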

Answer 3

I don't think you need a loop for this. We can use str_extract to pull out whichever pattern from condition_options each value matches:

x$Conditions <- stringr::str_extract(tolower(x$Samples), 
                         paste0(condition_options, collapse = "|"))

x
#  Samples Conditions
#1    Low1        low
#2    Low2        low
#3   High1       high
#4   High2       high
#5   Ctrl1       ctrl
#6   Ctrl2       ctrl
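
One caveat with the tolower call: it lowercases only the sample names, so a condition typed with capitals (which can happen with free-text input in a Shiny app) would never match. A possible workaround is to make the pattern itself case-insensitive. This is a sketch with made-up mixed-case input, not part of the original question:

```r
library(stringr)

# Hypothetical user input with inconsistent capitalisation.
condition_options <- c("Low", "HIGH", "ctrl")
samples <- c("Low1", "High2", "Ctrl1", "Blank1")

# regex(..., ignore_case = TRUE) makes the alternation match regardless of case.
pattern <- regex(paste0(condition_options, collapse = "|"), ignore_case = TRUE)

# str_extract returns the text as it appears in the sample, so normalise it afterwards.
conditions <- tolower(str_extract(samples, pattern))
conditions
```

Samples that match none of the conditions come back as NA.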

In base R, we can also build the regular expression dynamically with paste0:

x$Conditions <- sub(paste0(".*(", paste0(condition_options, collapse = "|"), ").*"),
                "\\1", tolower(x$Samples))

where

paste0(".*(", paste0(condition_options, collapse = "|"), ").*") #gives
#[1] ".*(low|high|ctrl).*"
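
Keep in mind that sub returns its input unchanged when the pattern does not match, so a sample with no known condition would end up with its own lowercased name rather than NA. If NA is preferred, one base R option is regexpr together with regmatches. This is a sketch with a made-up 'Blank1' sample added for illustration:

```r
condition_options <- c("low", "high", "ctrl")
# 'Blank1' is a hypothetical sample that matches none of the conditions.
samples <- c("Low1", "High2", "Ctrl1", "Blank1")

pattern <- paste0(condition_options, collapse = "|")

# regexpr gives the position of the first match per element, or -1 when there is none.
hit <- regexpr(pattern, samples, ignore.case = TRUE)

# regmatches returns the matched text only for elements that matched,
# so assign it back into the positions where hit > 0.
conditions <- rep(NA_character_, length(samples))
conditions[hit > 0] <- tolower(regmatches(samples, hit))
conditions
```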

How do I use mutate() and case_when() in a for loop?
