Error when merging Excel files in R that contain blank sheets

Asked: 2019-02-07 06:34:10

Tags: r

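I am using the code below to merge multiple Excel files, each containing multiple worksheets. When it reaches a worksheet that has the same headers as the other files but contains no data, it fails with this error: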

Error in data.frame(sub.id, condition, s.frame, ss) : 
  arguments imply differing number of rows: 0, 2

How can I avoid this error? Here is the code I am using.

library(XLConnect)  # assumed: loadWorkbook(), getSheets() and readWorksheet() match XLConnect

file.names <- list.files(pattern='*.xls')
sheet.names <- getSheets(loadWorkbook('File.xls'))
sheet.names <-sheet.names[1:12]
e.names <- paste0(rep('v', 16), c(1:16))

data.1 <- data.frame(matrix(rep(NA,length(e.names)),
                            ncol = length(e.names)))
names(data.1) <- e.names

for (i in 1:length(file.names)) {
  wb <- loadWorkbook(file.names[i])
  for (j in 1:length(sheet.names)) {
    ss <- readWorksheet(wb, sheet.names[j], startCol = 2, header = TRUE)
    condition <- rep(sheet.names[j], nrow(ss))
    sub.id <- rep(file.names[i], nrow(ss))
    s.frame <- seq(1:nrow(ss))
    df.1 <- data.frame(sub.id, condition, s.frame, ss)
    names(df.1) <- e.names
    data.1 <- rbind(data.1, df.1)
    rm(ss, condition, s.frame, sub.id, df.1)
  }
  rm(wb)
}

2 Answers:

Answer 0: (score: 1)

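I think this solution will work for you. It loads every .xlsx file in the specified folder into a list of lists. Sheet names and headers should not be a problem.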

library(openxlsx)

# Define folder where your files are
path_folder <- "C:/path_to_files/"

# load the names of all .xlsx files into a list
f <- list.files(path_folder, pattern = "\\.xlsx$")
data_list <- as.list(f)

# get sheet-names
names(data_list) <- data_list
data_list <- lapply(data_list, function(x){getSheetNames(paste0(path_folder, x))})

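# load data into a list of lists
data_list <- lapply(data_list, function(x){as.list(x)})
data_list <- lapply(names(data_list),function(x){
  # simplify = FALSE keeps every sheet as its own list element,
  # even when all sheets have the same number of columns
  sapply(data_list[[x]],function(y){read.xlsx(paste0(path_folder, x),sheet=y)}, simplify = FALSE)
})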

# name the list elements
names(data_list) <- sub("\\.xlsx$", "", f)  # escape the dot so only the extension is removed

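You end up with a list (one element per file) of lists (one element per sheet in that file). From here you can drop the blank sheets, then merge and edit as needed.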
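As a rough sketch of that last step, the snippet below filters out blank sheets and stacks the rest into one data frame. The data_list here is a hypothetical stand-in built by hand so the example runs without any Excel files, and has_rows and stack_file are illustrative helper names, not part of openxlsx. It assumes a blank sheet shows up either as NULL or as a zero-row data frame, and that all non-blank sheets share the same columns.

```r
# Hypothetical stand-in for data_list: two files, each with named sheets.
# A blank sheet is represented as NULL or as a zero-row data frame.
data_list <- list(
  file_a = list(
    cond1 = data.frame(v1 = 1:2, v2 = c(0.5, 0.7)),
    cond2 = data.frame(v1 = integer(0), v2 = numeric(0))  # headers only
  ),
  file_b = list(
    cond1 = data.frame(v1 = 3L, v2 = 0.9),
    cond2 = NULL                                          # nothing was read
  )
)

# TRUE for sheets that hold at least one row of data
has_rows <- function(sheet) is.data.frame(sheet) && nrow(sheet) > 0

# Stack the non-blank sheets of one file, tagging each row with its origin
stack_file <- function(file_name) {
  sheets <- Filter(has_rows, data_list[[file_name]])
  tagged <- lapply(names(sheets), function(sheet_name) {
    cbind(sub.id = file_name, condition = sheet_name, sheets[[sheet_name]])
  })
  do.call(rbind, tagged)
}

merged <- do.call(rbind, lapply(names(data_list), stack_file))
rownames(merged) <- NULL
print(merged)
```

Filtering before binding means the row-count mismatch from the question never arises, because blank sheets are never turned into rows in the first place.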

Answer 1: (score: 0)

I added an if statement that checks whether the sheet has more than one row and skips it otherwise, which resolves the error.

for (i in 1:length(file.names)) {
  wb <- loadWorkbook(file.names[i])
  for (j in 1:length(sheet.names)) {
    ss <- readWorksheet(wb, sheet.names[j], startCol = 2, header = TRUE)
    if (nrow(ss) > 1) {
      condition <- rep(sheet.names[j], nrow(ss))
      sub.id <- rep(file.names[i], nrow(ss))
      s.frame <- seq(1:nrow(ss))
      df.1 <- data.frame(sub.id, condition, s.frame, ss)
      names(df.1) <- e.names
      data.1 <- rbind(data.1, df.1)
      rm(ss, condition, s.frame, sub.id, df.1)
    }
  }
  rm(wb)
}
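
For context on where the "0, 2" in the error message likely comes from: when a sheet has zero rows, 1:nrow(ss) is 1:0, which counts down and has two elements, so s.frame ends up with length 2 while the other columns have length 0. The sketch below illustrates this with base R only. build_rows is a hypothetical helper name, and the two small data frames are made-up stand-ins for what readWorksheet might return. Note also that the `> 1` check skips sheets with exactly one data row; using seq_len() is one possible alternative that handles zero-row and one-row sheets without a guard.

```r
n <- 0                     # row count of a header-only sheet

bad  <- seq(1:n)           # 1:0 is c(1, 0), so the result has length 2
good <- seq_len(n)         # integer(0), the same length as the other columns

length(bad)
length(good)

# Hypothetical helper: build the tagged rows for one sheet
build_rows <- function(ss, file_name, sheet_name) {
  n <- nrow(ss)
  data.frame(sub.id    = rep(file_name, n),
             condition = rep(sheet_name, n),
             s.frame   = seq_len(n),
             ss)
}

# Made-up stand-ins for sheets read from a workbook
empty_sheet <- data.frame(v4 = numeric(0), v5 = numeric(0))
one_row     <- data.frame(v4 = 1.5, v5 = 2.5)

nrow(build_rows(empty_sheet, "a.xls", "cond1"))  # zero rows, no error
nrow(build_rows(one_row, "a.xls", "cond2"))      # the single row is kept
```

A zero-row data frame can be passed to rbind() like any other, so with this approach blank sheets simply contribute nothing to the merged result.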