Question

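I am using the lapply function over a list of multiple files. Is there a way to skip the function for the current file without returning anything and just move on to the next file in the list?

To be precise, I have an if statement that checks a condition, and I would like to skip to the next file if the statement returns FALSE.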

Answer 1

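lapply will always return a list the same length as the X it is given. You can simply set the skipped items to something that you can filter out later.

For example, if you have the function parsefile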

parsefile <-function(x) {
  if(x>=0) {
    x
  } else {
    NULL
  }
}

and then run it on the vector runif(10,-5,5)

result<-lapply(runif(10,-5,5), parsefile)

then your list will be filled with answers and NULLs.

You can then subset out the NULLs by doing...

result[!vapply(result, is.null, logical(1))]
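
As a self-contained sketch of the same idea: the input below is a fixed vector rather than runif() (my assumption, only so the outcome is reproducible), and base R's Filter() with Negate(is.null) is used as an alternative way to drop the NULL entries.

```r
# Hypothetical input: fixed values instead of runif() so the result is reproducible
x <- c(-2, 3, -1, 4, 0)

parsefile <- function(x) {
  if (x >= 0) x else NULL   # NULL marks an item to skip
}

result <- lapply(x, parsefile)             # same length as x, NULLs included
kept   <- Filter(Negate(is.null), result)  # drop the NULL entries

length(result)  # 5
length(kept)    # 3
```

Either form of filtering should give the same list here; which one reads better is mostly a matter of taste.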

Answer 2

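As others have already answered, I don't think you can move on to the next iteration without returning something when using the *apply family of functions.

In such cases I use Dean MacGregor's method with one small change: I use NA instead of NULL, which makes filtering the results easier.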

files <- list("file1.txt", "file2.txt", "file3.txt")

parse_file <- function(file) {
  if(file.exists(file)) {
    readLines(file)
  } else {
    NA
  }
}

results <- lapply(files, parse_file)
results <- results[!is.na(results)]
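
To try this without real data, here is a minimal sketch that uses temporary files (the paths are made up by tempfile(), not part of the original example). It also shows why the filter works: is.na() on a list is TRUE only for elements that are length-one atomic vectors holding NA, so a multi-line readLines() result is not flagged.

```r
# Hypothetical setup: one temporary file that exists and one path that does not
existing <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), existing)
missing_path <- tempfile(fileext = ".txt")  # name only, file is never created
files <- list(existing, missing_path)

parse_file <- function(file) {
  if (file.exists(file)) readLines(file) else NA
}

results <- lapply(files, parse_file)
flags   <- is.na(results)      # FALSE for the file that was read, TRUE for the NA
results <- results[!flags]     # keeps only the file that was read
length(results)                # 1
unlink(existing)               # clean up the temporary file
```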

A quick benchmark

res_na   <- list("a",   NA, "c")
res_null <- list("a", NULL, "c")
microbenchmark::microbenchmark(
  na = res_na[!is.na(res_na)],
  null = res_null[!vapply(res_null, is.null, logical(1))]
)

shows that the NA solution is considerably faster than the one using NULL (a median of 446 ns versus 3570 ns in this run):

Unit: nanoseconds
expr  min   lq    mean median   uq   max neval
  na    0    1  410.78    446  447  5355   100
null 3123 3570 5283.72   3570 4017 75861   100

Answer 3

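You can define a custom function to use in your call to lapply(). Here is some sample code that iterates over a list of files and processes a file only if its name does not contain the number 3 (a bit contrived, but hopefully it gets the point across):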

files <- as.list(c("file1.txt", "file2.txt", "file3.txt"))

fun <- function(x) {
    test <- grep("3", x)                     # check for files with "3" in their name
    if (length(test) == 0) {                 # replace with your statement here
        # process the file here
    }
    # otherwise do not process the file
}

result <- lapply(files, function(x) fun(x))  # call lapply with the custom function
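
For a version that can be run as is, here is a rough sketch in which toupper() stands in for the real processing step (that stand-in is my assumption, not part of the answer). An if without an else returns NULL when the condition is FALSE, so the skipped file still occupies a slot in the result and has to be filtered out afterwards.

```r
files <- as.list(c("file1.txt", "file2.txt", "file3.txt"))

fun <- function(x) {
  if (!grepl("3", x, fixed = TRUE)) {
    toupper(x)   # hypothetical stand-in for real file processing
  }              # no else branch: NULL is returned when the name contains "3"
}

result <- lapply(files, fun)                            # third element is NULL
result <- result[!vapply(result, is.null, logical(1))]  # drop skipped files
unlist(result)   # "FILE1.TXT" "FILE2.TXT"
```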

R: lapply function - skipping the current function loop
