Subsetting multiple data frames in a loop in R

Time: 2016-09-25 19:35:39

Tags: r file loops subset multiple-columns

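I am trying to remove columns from more than 20 data frames that I have imported. When I hard-code a single file name the removal works, but when I try to loop over all of the files I get an error. Here is the code: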

path <- "C://Home/Data/"
files <- list.files(path=path, pattern="^file.*\\.csv$")

for(i in 1:length(files))
{
  perpos <- which(strsplit(files[i], "")[[1]]==".")
  assign(
    gsub(" ","",substr(files[i], 1, perpos-1)), 
    read.csv(paste(path,files[i],sep="")))
}

mycols <- c("test", "trialruns", "practice")

`file01` = `file01`[,!(names(`file01`) %in% mycols)]

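So the above works and removes those three columns from file01. However, I cannot iterate over file02 through file20 and remove the columns from all of them. Any ideas? Thank you very much!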

1 answer:

Answer 0 (score: 0)

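As @zx8754 mentioned, consider lapply(), which keeps all of the data frames in a single list rather than as many separate objects in the environment (the code below also shows how to output the individual dfs from the list):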

path <- "C://Home/Data/"
files <- list.files(path=path, pattern="^file.*\\.csv$")
mycols <- c("test", "trialruns", "practice")

# READ IN ALL FILES AND DROP THE UNWANTED COLUMNS
dfList <- lapply(files, function(f) {
   df <- read.csv(paste0(path, f))
   df[, !(names(df) %in% mycols), drop = FALSE]
})

# SET NAMES TO EACH DF ELEMENT (FILE NAME WITHOUT THE .csv EXTENSION)
dfList <- setNames(dfList, sub("\\.csv$", "", files))

# IN CASE YOU REALLY NEED INDIVIDUAL DFs
list2env(dfList, envir=.GlobalEnv)

# IN CASE YOU NEED TO APPEND ALL DFs (THEY MUST SHARE THE SAME COLUMNS)
finaldf <- do.call(rbind, dfList)

# TO RETRIEVE FIRST DF
dfList[[1]]  # OR dfList$file01
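
Since the question says the data frames are already imported, one possible variation is to collect the existing objects with mget() instead of re-reading the files. The following is only a minimal sketch under assumptions: the toy data frames file01 and file02 and their id and score columns are invented for illustration, and the pattern "^file[0-9]+$" assumes the imported objects are named file01 through file20.

```r
# Hypothetical stand-ins for the imported data frames (columns are invented)
file01 <- data.frame(id = 1:3, test = 4:6, practice = 7:9)
file02 <- data.frame(id = 1:2, trialruns = 3:4, score = 5:6)

mycols <- c("test", "trialruns", "practice")

# Collect every object in the global environment named like fileNN (assumed naming)
dfNames <- ls(envir = .GlobalEnv, pattern = "^file[0-9]+$")
dfList <- mget(dfNames, envir = .GlobalEnv)

# Drop the unwanted columns; drop = FALSE keeps a data frame even if one column remains
dfList <- lapply(dfList, function(df) df[, !(names(df) %in% mycols), drop = FALSE])

# Optionally write the cleaned data frames back over the originals
invisible(list2env(dfList, envir = .GlobalEnv))

names(file01)  # "id"
names(file02)  # "id" "score"
```

Because %in% only tests membership, a data frame that lacks one of the names in mycols is simply left without it rather than raising an error, which may matter if the 20 files do not all share the same columns.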