Reading multiple Excel files into R using the map function

Time: 2018-09-13 15:04:17

Tags: r excel dataframe purrr

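I know there are similar questions, but I haven't found one that covers the map function from the purrr package. I'm having trouble reading several Excel files (.xlsx) with purrr::map(). I want each file to be its own data frame. I tried the approach from a similar question, "How can I read multiple (excel) files into R?".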

However, I keep getting this error:

  

Error: `path` does not exist: 'tab3_DOfinal_HUClevel_assesssment.xlsx'

I'm sure I have the correct path, so I don't know why I'm getting this error. I have about 9 Excel spreadsheets to read in.

The code I tried:

# load necessary package
library(purrr)

file.list <- list.files(path="2016_Data_Tables",pattern='*.xlsx')
file.list <- setNames(file.list, file.list)

# store all .xlsx files as individual data frames inside of one list
df <- map(file.list, read_xlsx)

The file names follow this pattern:

tab3_DOfinal_HUClevel_assessment.xlsx

The only part that changes is DOfinal.

Some sample data:

structure(list(ID = 1, WMA = 15, Number = "02040302020030-01", 
    HUC14 = "HUC02040302020030", Name = "Absecon Creek (AC Reserviors) (gage to SB)", 
    Region = "Atlantic Coast", NumofStations = "2", ListofStations = "01410455, R32", 
    ListofAssessment = "2, 2", HUCTier = "2", swqs = "PL, SE1", 
    TotalNumSamples5yrs = "NA", flgusgsprelim = "NA, 0", auassess = 2, 
    auassesstrout = -999, finalauassess = 2, finalauassesstrout = -999, 
    Changefrom2014 = "No Change-2", Changetroutfrom2014 = "No Change", 
    listHUC14assess5 = "NA", listHUC14assess3 = "NA", listHUC14assess2 = "01410455, R32", 
    His2014 = "Attaining", His2014trout = "-999", Notes = NA_character_, 
    OldStations2014 = "01410455", OldStationsAssess2014 = "2", 
    Error = NA_character_), .Names = c("ID", "WMA", "Number", 
"HUC14", "Name", "Region", "NumofStations", "ListofStations", 
"ListofAssessment", "HUCTier", "swqs", "TotalNumSamples5yrs", 
"flgusgsprelim", "auassess", "auassesstrout", "finalauassess", 
"finalauassesstrout", "Changefrom2014", "Changetroutfrom2014", 
"listHUC14assess5", "listHUC14assess3", "listHUC14assess2", "His2014", 
"His2014trout", "Notes", "OldStations2014", "OldStationsAssess2014", 
"Error"), row.names = c(NA, -1L), class = c("tbl_df", "tbl", 
"data.frame"))


structure(list(WMA = 15, Number = "02040302020030-01", HUC14 = "HUC02040302020030", 
    Name = "Absecon Creek (AC Reserviors) (gage to SB)", Region = "Atlantic Coast", 
    NumofStations = "1", ListofStations = "01410455", ListofAssessment = "2", 
    MaxStaAssessment = "2", MinStaAssessment = "2", TotalNumSamples5yrs = "NA", 
    auassess = "2", ChangeFrom2014 = "No Change-2", liststaassess2 = "01410455", 
    liststaassess3 = "NA", liststaassess5 = "NA", Assessment2014 = "Attaining", 
    Comments = NA_character_), .Names = c("WMA", "Number", "HUC14", 
"Name", "Region", "NumofStations", "ListofStations", "ListofAssessment", 
"MaxStaAssessment", "MinStaAssessment", "TotalNumSamples5yrs", 
"auassess", "ChangeFrom2014", "liststaassess2", "liststaassess3", 
"liststaassess5", "Assessment2014", "Comments"), row.names = c(NA, 
-1L), class = c("tbl_df", "tbl", "data.frame"))

structure(list(WMA = 15, Number = "02040302020030-01", HUC14 = "HUC02040302020030", 
    Name = "Absecon Creek (AC Reserviors) (gage to SB)", Region = "Atlantic Coast", 
    NumofStations = "1", ListofStations = "R32", ListofAssessment = "3", 
    MaxStaAssessment = "3", MinStaAssessment = "3", TotalNumSamples5yrs = "9", 
    auassess = "3", ChangeFrom2014 = "No Change-3", liststaassess2 = "NA", 
    liststaassess3 = "R32", liststaassess5 = "NA", Assessment2014 = "N/A", 
    Comments = NA_character_), .Names = c("WMA", "Number", "HUC14", 
"Name", "Region", "NumofStations", "ListofStations", "ListofAssessment", 
"MaxStaAssessment", "MinStaAssessment", "TotalNumSamples5yrs", 
"auassess", "ChangeFrom2014", "liststaassess2", "liststaassess3", 
"liststaassess5", "Assessment2014", "Comments"), row.names = c(NA, 
-1L), class = c("tbl_df", "tbl", "data.frame"))

1 answer:

Answer 0: (score: 0)

Aurèle makes a good point about your file path.
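Presumably the point is that list.files() returns bare file names by default, without the folder in front, so read_xlsx looks for each file in the working directory and cannot find it. A minimal base-R sketch of that behaviour follows; the temporary folder is a hypothetical stand-in for 2016_Data_Tables, and the empty file exists only to have something to list:

```r
# Hypothetical temporary folder standing in for 2016_Data_Tables
data_dir <- file.path(tempdir(), "2016_Data_Tables")
dir.create(data_dir, showWarnings = FALSE)
file.create(file.path(data_dir, "tab3_DOfinal_HUClevel_assessment.xlsx"))

# Default: bare file names, which only resolve when the working
# directory happens to be data_dir
bare <- list.files(path = data_dir, pattern = "\\.xlsx$")

# full.names = TRUE prepends the folder, so the paths resolve from anywhere
full <- list.files(path = data_dir, pattern = "\\.xlsx$", full.names = TRUE)

print(bare)
print(file.exists(full))
```

Note also that pattern is a regular expression rather than a glob, so "\\.xlsx$" is a safer way to match the extension than '*.xlsx'.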

"I want each to be its own data frame"

If that is the goal, a combination of purrr::iwalk and assign will get you there easily. The process is as follows:

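  1. Get a list of all the .xlsx files located in 2016_Data_Tables/, with their full paths.
  2. Use purrr::set_names to name each element of that list with its file name minus the .xlsx extension.
  3. Use purrr::iwalk to apply the assign function to each element of the list. Specifically, read_xlsx reads each .xlsx file from disk into a data frame, and that data frame is then assigned as a named object in R's global environment.

library(purrr)  # provides %>%, set_names() and iwalk()

list.files('2016_Data_Tables', pattern = '\\.xlsx$', full.names = TRUE) %>%
  set_names(stringr::str_remove(basename(.), '\\.xlsx$')) %>%
  # x is the file path, i is its name; assign() creates one object per file
  iwalk(function(x, i) assign(i, readxl::read_xlsx(x), envir = .GlobalEnv))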
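
If separate global objects are not strictly required, keeping the data frames in one named list is a possible alternative, and it is closer to the map call in the question. The sketch below is only an illustration: read_xlsx_folder and its reader argument are hypothetical names, not part of purrr or readxl.

```r
library(purrr)

# Hypothetical helper: read every .xlsx file in `dir` into a named list.
# `reader` defaults to readxl::read_xlsx; it is a parameter only so that
# another reading function can be swapped in.
read_xlsx_folder <- function(dir, reader = readxl::read_xlsx) {
  paths <- list.files(dir, pattern = "\\.xlsx$", full.names = TRUE)
  # Name each path by its file name without the extension
  paths <- set_names(paths, tools::file_path_sans_ext(basename(paths)))
  # map() keeps those names, so the result is a named list of data frames
  map(paths, reader)
}

# Usage, assuming the folder from the question exists:
# tables <- read_xlsx_folder("2016_Data_Tables")
# tables[["tab3_DOfinal_HUClevel_assessment"]]
```

Each table can then be reached by name, and the whole set can be passed on to further purrr calls.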