Question

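I want R to read multiple xlsx files, each with multiple sheets. In every file, the first sheet has a header row (column names); the remaining sheets have no header but exactly the same columns.

I found a solution in another post: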

dir_path <- "~/test_dir/"         # target directory path where the xlsx files are located. 
re_file <- "^test[0-9]\\.xlsx"    # regex pattern to match the file name format, in this case 'test1.xlsx', 'test2.xlsx' etc, but could simply be 'xlsx'.

read_sheets <- function(dir_path, file){
  xlsx_file <- paste0(dir_path, file)
  xlsx_file %>%
    excel_sheets() %>%
    set_names() %>%
    map_df(read_excel, path = xlsx_file, .id = 'sheet_name') %>% 
    mutate(file_name = file) %>% 
    select(file_name, sheet_name, everything())
}

df <- list.files(dir_path, re_file) %>% 
  map_df(~ read_sheets(dir_path, .))

But I don't know why it doesn't work. I get this error:

Error in set_names(.) : 1 argument passed to 'names<-' which requires 2

Thanks in advance.

Answer 1

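I put together this readxl solution and tested it with 2 Excel workbooks, each containing 2 sheets with identical columns. In your case, the second (and later) sheets have no column names, so they have to be set with an additional if statement. It may not be the fastest solution, but it works: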

library(readxl)

# Set path
inputFolder <- "test/"

# Get list of files (pattern is a regex, so anchor it on the .xlsx extension)
fileList <- list.files(path = inputFolder, recursive = TRUE, pattern = "\\.xlsx$")

# Read in each sheet from each workbook
for (f in seq_along(fileList)) {
  filePath <- paste0(inputFolder, fileList[f])
  # Find the sheets in this workbook
  sheetList <- excel_sheets(filePath)
  # Get the sheets of this workbook
  for (s in seq_along(sheetList)) {
    # Only the first sheet has a header row; with the default col_names = TRUE
    # the first data row of the other sheets would be consumed as column names
    tempSheet <- read_excel(filePath, sheet = sheetList[s], col_names = (s == 1))
    if (f == 1 && s == 1) {
      df <- tempSheet
    } else {
      if (s != 1) {
        names(tempSheet) <- names(df)
      }
      df <- rbind(df, tempSheet)
    }
  }
}
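
Since this answer mentions that the loop may not be the fastest approach, here is a minimal sketch of one common alternative: collect the sheets in a list, give them all the header of the first sheet, and bind them once at the end. The data frames, and the names `sheets`, `header`, `aligned` and `combined`, are hypothetical stand-ins for sheets already read with read_excel; they are not part of the original answer.

```r
# Hypothetical stand-ins for sheets that have already been read:
# the first has real column names, the second has placeholder names.
sheets <- list(
  data.frame(id = 1:2, value = c("a", "b")),
  data.frame(X1 = 3:4, X2 = c("c", "d"))
)

# Take the header from the first sheet and apply it to every sheet.
header <- names(sheets[[1]])
aligned <- lapply(sheets, function(sheet) {
  names(sheet) <- header
  sheet
})

# Bind all sheets in a single call instead of growing df inside a loop.
combined <- do.call(rbind, aligned)
```

Binding once avoids copying the accumulated data frame on every iteration, which tends to matter only when there are many sheets.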

Answer 2

This seems to work. Here is another way to achieve the same result.
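
A minimal sketch of a purrr-based variant, under stated assumptions. The error in the question is consistent with `set_names()` resolving to `magrittr::set_names` (an alias for `names<-` that needs two arguments) rather than `purrr::set_names`; whether that is what happened in the asker's session is an assumption. The sketch calls `purrr::set_names` explicitly, reuses the question's function name `read_sheets`, and assumes that the first sheet of each workbook carries the header and that column types are compatible across sheets.

```r
library(readxl)
library(purrr)
library(dplyr)

# magrittr::set_names is an alias for `names<-`, so it needs two arguments.
clash_msg <- tryCatch(
  magrittr::set_names(c("Sheet1", "Sheet2")),
  error = function(e) conditionMessage(e)
)
# purrr::set_names names a vector after itself when no names are given.
named <- purrr::set_names(c("Sheet1", "Sheet2"))

# Read every sheet of one workbook; only the first sheet has a header row.
read_sheets <- function(dir_path, file) {
  xlsx_file <- file.path(dir_path, file)
  sheet_names <- excel_sheets(xlsx_file)
  # Read only the header row of the first sheet to get the column names.
  header <- names(read_excel(xlsx_file, sheet = sheet_names[1], n_max = 0))

  purrr::set_names(sheet_names) %>%
    map(function(sheet) {
      # Skip the header row on the first sheet; other sheets have none.
      read_excel(xlsx_file, sheet = sheet, col_names = header,
                 skip = if (sheet == sheet_names[1]) 1 else 0)
    }) %>%
    bind_rows(.id = "sheet_name") %>%
    mutate(file_name = file) %>%
    select(file_name, sheet_name, everything())
}

# Usage, with dir_path and re_file defined as in the question:
# df <- list.files(dir_path, re_file) %>%
#   map(~ read_sheets(dir_path, .x)) %>%
#   bind_rows()
```

If read_excel guesses different types for the same column on different sheets, `bind_rows()` will stop with an error; passing an explicit `col_types` to `read_excel()` is one way around that.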


Reading multiple xlsx files, each with multiple sheets, into one R data frame - set_names function problem
