Question

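I use the following code to read a series of spreadsheets into R. However, I have found that even though the data in all the spreadsheets share the same headers and structure, some spreadsheets contain more than one sheet. For example, one spreadsheet has two sheets, each containing some data. My question is how to modify my code to read the data from all sheets without opening each spreadsheet to find out how many sheets it has. Thanks.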

library(readxl)

files <- Sys.glob("*.xlsx")
files

PL <- read_excel(files[1], col_names=TRUE)

for(i in 2:length(files)){

  x <- read_excel(files[i], col_names=TRUE)
  PL <- rbind(PL, x)
  print(i)

}

Answer 1

You can use the excel_sheets function from the readxl package:

> library(readxl)
> sheets <- excel_sheets("xlsx_datasets.xlsx")
> sheets
[1] "iris"     "mtcars"   "chickwts" "quakes"  
> x <- read_excel("xlsx_datasets.xlsx", sheet=sheets[1])

That is, to read all the files:

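PL <- NULL
for(i in seq_along(files)){
  # list the sheet names in this workbook, however many there are
  sheets <- excel_sheets(files[i])
  for(sheet in sheets){
    x <- read_excel(files[i], col_names=TRUE, sheet=sheet)
    PL <- rbind(PL, x)  # the initial NULL is ignored on the first pass
  }
}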
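Growing PL with rbind inside the loop copies the accumulated data on every pass, and the result no longer says which file or sheet a row came from. A minimal sketch of one alternative follows, assuming every sheet has identical columns, as the question states. The helper name read_all_sheets and the columns source_file and source_sheet are made up for illustration; they are not part of readxl.

```r
library(readxl)

# Hypothetical helper (not part of readxl): read every sheet of every
# workbook and tag each row with the file and sheet it came from.
read_all_sheets <- function(files) {
  pieces <- list()
  for (f in files) {
    for (s in excel_sheets(f)) {
      x <- as.data.frame(read_excel(f, sheet = s, col_names = TRUE))
      if (nrow(x) == 0) next            # skip empty sheets
      x$source_file  <- basename(f)     # assumed column name
      x$source_sheet <- s               # assumed column name
      pieces[[length(pieces) + 1]] <- x
    }
  }
  # One rbind at the end; this assumes all sheets share the same columns.
  do.call(rbind, pieces)
}

files <- Sys.glob("*.xlsx")
PL <- read_all_sheets(files)
```

If no files match the pattern, PL is simply NULL. If the sheets do not all share the same columns, the final rbind stops with an error rather than silently misaligning data.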

Answer 2

With the readxl package, you can iterate using purrr.

excel_sheets gives you the names of the sheets in a file, so you don't have to know how many there are. You then name that vector of sheet names with set_names. Iterating over each sheet, you use read_excel to read it. Since excel_sheets works on a single file, you iterate over the files to repeat the same process for each one.

# library(tidyverse) attaches purrr; readxl is installed with it but must be loaded separately
library(readxl)
library(purrr) # for the functions map and set_names below

list_xl <- map(files, ~.x %>% excel_sheets() %>% set_names() %>% map(read_excel, path = .x))

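In the end, you get a list of lists: one element per file, each holding one data frame per sheet. You can use the purrr package again to put the result into whatever form you want to process afterwards.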
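As one possible way to turn that list of lists into a single data frame, the sketch below binds the sheets within each file and then binds the files together. It brings in dplyr::bind_rows, which the answer does not mention (dplyr is part of the tidyverse). The helper name read_workbooks and the id column names "file" and "sheet" are assumptions chosen for illustration.

```r
library(readxl)
library(purrr)
library(dplyr)

# Hypothetical helper: returns one data frame with "file" and "sheet"
# id columns recording where each row came from.
read_workbooks <- function(files) {
  # Name the outer list by file path so the names can become an id column.
  list_xl <- files %>%
    set_names() %>%
    map(~ .x %>% excel_sheets() %>% set_names() %>% map(read_excel, path = .x))

  # Bind the sheets within each file, then bind the files together.
  list_xl %>%
    map(bind_rows, .id = "sheet") %>%
    bind_rows(.id = "file")
}

files <- Sys.glob("*.xlsx")
PL <- read_workbooks(files)
```

Unlike rbind, bind_rows matches columns by name and fills missing ones with NA, so a sheet with an extra or missing column does not stop the whole run. It does still raise an error if the same column has incompatible types in different sheets.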

You can find good examples on the workflow page of the readxl website.

Reading a varying number of sheets from each spreadsheet
