Question

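I am trying to read data from multiple workbooks, each of which uses multiple worksheets. There are 10 workbooks, and each workbook has data in two worksheets.

The code below works for pulling the data from the first sheet. However, I also want to pull the data from the other sheet in the same workbook. I am not sure how to specify the sheet name in the code below.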

library(purrr)
library(readxl)
library(dplyr)
library(tidyr)

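data_path <- "C:/Desktop/Test"

# `pattern` is a regular expression, so match the .xlsx extension explicitly
files <- dir(data_path, pattern = "\\.xlsx$")

# One row per workbook; read_excel() reads only the first sheet by default
weights_data <- data.frame(filename = files) %>%
  mutate(file_contents = map(filename, ~ read_excel(file.path(data_path, .))))

View(unnest(weights_data))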

Answer 1

read_excel takes another argument that lets you specify a particular worksheet:

sheet: Sheet to read. Either a string (the name of a sheet), or an
       integer (the position of the sheet). Ignored if the sheet is
       specified via 'range'. If neither argument specifies the
       sheet, defaults to the first sheet.
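
As a small illustration of the sheet argument described above, the sketch below reads the same worksheet once by name and once by position. It assumes the datasets.xlsx example workbook that ships with readxl (located with readxl_example()), which is not part of the question's data.

```r
library(readxl)

# Assumption: use the example workbook bundled with readxl, not the asker's files
path <- readxl_example("datasets.xlsx")

# List the worksheet names contained in the workbook
sheet_names <- excel_sheets(path)

# Read one worksheet by name ...
by_name <- read_excel(path, sheet = "mtcars")

# ... and the same worksheet by its integer position
by_position <- read_excel(path, sheet = match("mtcars", sheet_names))

# Compare the two results
all.equal(by_name, by_position)
```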

So we need to expand your frame of paths to include the worksheets, which is easily done with readxl::excel_sheets; for a single path, it returns a vector of sheet names.

Walking through this step by step, though only the last block is actually needed:

library(tibble)
library(dplyr)
library(tidyr)
library(purrr)
library(readxl)

tibble(
  path = list.files(path = "~/StackOverflow/Prah/", pattern = "\\.xlsx$", full.names = TRUE)
) %>%
  mutate(sheets = map(path, excel_sheets))
# # A tibble: 3 x 2
#   path                                        sheets
#   <chr>                                       <list>
# 1 "C:\\Users\\r2/StackOverflow/Prah/mt1.xlsx" <chr [2]>
# 2 "C:\\Users\\r2/StackOverflow/Prah/mt2.xlsx" <chr [2]>
# 3 "C:\\Users\\r2/StackOverflow/Prah/mt3.xlsx" <chr [2]>

That alone is not immediately usable, because sheets is a list column, but we can unnest it:

tibble(
  path = list.files(path = "~/StackOverflow/Prah/", pattern = "\\.xlsx$", full.names = TRUE)
) %>%
  mutate(sheets = map(path, excel_sheets)) %>%
  unnest(sheets)
# # A tibble: 6 x 2
#   path                                        sheets
#   <chr>                                       <chr>
# 1 "C:\\Users\\r2/StackOverflow/Prah/mt1.xlsx" Sheet1
# 2 "C:\\Users\\r2/StackOverflow/Prah/mt1.xlsx" Sheet2
# 3 "C:\\Users\\r2/StackOverflow/Prah/mt2.xlsx" Sheet1
# 4 "C:\\Users\\r2/StackOverflow/Prah/mt2.xlsx" Sheet2
# 5 "C:\\Users\\r2/StackOverflow/Prah/mt3.xlsx" Sheet1
# 6 "C:\\Users\\r2/StackOverflow/Prah/mt3.xlsx" Sheet2

It should now be clear that we just need to use map2 (or similar) to iterate over each row, which gives us a nicely nested frame:

tibble(
  path = list.files(path = "~/StackOverflow/Prah/", pattern = "\\.xlsx$", full.names = TRUE)
) %>%
  mutate(sheets = map(path, excel_sheets)) %>%
  unnest(sheets) %>%
  mutate(data = map2(path, sheets, ~ read_excel(path = .x, sheet = .y)))
# # A tibble: 6 x 3
#   path                                        sheets data
#   <chr>                                       <chr>  <list>
# 1 "C:\\Users\\r2/StackOverflow/Prah/mt1.xlsx" Sheet1 <tibble [32 x 11]>
# 2 "C:\\Users\\r2/StackOverflow/Prah/mt1.xlsx" Sheet2 <tibble [32 x 11]>
# 3 "C:\\Users\\r2/StackOverflow/Prah/mt2.xlsx" Sheet1 <tibble [32 x 11]>
# 4 "C:\\Users\\r2/StackOverflow/Prah/mt2.xlsx" Sheet2 <tibble [32 x 11]>
# 5 "C:\\Users\\r2/StackOverflow/Prah/mt3.xlsx" Sheet1 <tibble [32 x 11]>
# 6 "C:\\Users\\r2/StackOverflow/Prah/mt3.xlsx" Sheet2 <tibble [32 x 11]>

(I wrote a few Excel workbooks, each with two sheets, each sheet containing mtcars. Nothing fancy.)
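
For anyone who wants to try this without real data, here is a rough, self-contained sketch. It assumes the writexl package for creating the sample workbooks (the answer does not say how its test files were made), and the demo_dir scratch folder and file names are hypothetical. It also adds a final unnest of the data column, which yields one flat data frame similar to what the question passes to View().

```r
library(dplyr)
library(tidyr)
library(purrr)
library(readxl)
library(writexl)  # assumption: used here only to create sample workbooks

# Hypothetical scratch directory so the sketch does not touch real files
demo_dir <- file.path(tempdir(), "workbook_demo")
dir.create(demo_dir, showWarnings = FALSE)

# Write two workbooks, each with two sheets that both hold mtcars
for (name in c("mt1.xlsx", "mt2.xlsx")) {
  write_xlsx(list(Sheet1 = mtcars, Sheet2 = mtcars), file.path(demo_dir, name))
}

all_rows <- tibble(
  path = list.files(demo_dir, pattern = "\\.xlsx$", full.names = TRUE)
) %>%
  mutate(sheets = map(path, excel_sheets)) %>%   # sheet names per workbook
  unnest(sheets) %>%                             # one row per workbook/sheet pair
  mutate(data = map2(path, sheets, ~ read_excel(path = .x, sheet = .y))) %>%
  unnest(data)                                   # flatten into a single data frame

dim(all_rows)
```

Keeping the path and sheets columns after the final unnest means every row still records which workbook and worksheet it came from.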

r - How do I read data from multiple workbooks with multiple worksheets into R?