I currently have several files in one folder. They contain daily inventory updates, and the file names look like this:
Onhand Harian 12 Juli 2019.xlsx
Onhand Harian 13 Juli 2019.xlsx
Onhand Harian 14 Juli 2019.xlsx... and so on.
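I only want to read the most recent Excel file, using the date in the file name to decide which one that is. How can I do this? Thanks in advance.
Answer 0 (score: 1)
I would do something like this: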
library(stringr)
library(tidyverse)
x <- c("Onhand Harian 12 Juli 2019.xlsx",
"Onhand Harian 13 Juli 2019.xlsx",
"Onhand Harian 14 Juli 2019.xlsx")
lookup <- set_names(seq_len(12),
c("Januar", "Februar", "März", "April", "Mai", "Juni", "Juli",
"August", "September", "Oktober", "November", "Dezember"))
enframe(x, name = NULL, value = "txt") %>%
mutate(txt_extract = str_extract(txt, "\\d{1,2} \\D{3,9} \\d{4}")) %>% # September is longest ..
separate(txt_extract, c("d", "m", "y"), remove = FALSE) %>%
mutate(m = sprintf("%02d", lookup[m]),
d = sprintf("%02d", as.integer(d))) %>%
mutate(date = as.Date(str_c(y, m, d), format = "%Y%m%d")) %>%
filter(date == max(date)) %>%
pull(txt)
# "Onhand Harian 14 Juli 2019.xlsx"
Answer 1 (score: 0)
If all the files follow the same naming pattern, you can do this:
#List all the file names in the folder
file_names <- list.files("/path/to/folder/", full.names = TRUE)
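#Remove the fixed prefix and the extension so that only the date remains
#Convert the date string to an actual Date object
#(%B matches month names in the current locale, so "Juli" parses only
# in a locale that uses that month name)
#Sort in decreasing order and take the latest file
file_to_read <- file_names[order(as.Date(sub("Onhand Harian ", "",
    sub("\\.xlsx$", "", basename(file_names))), "%d %B %Y"), decreasing = TRUE)[1]]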
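Because `%B` depends on the session locale, here is a rough base R sketch that maps month names through an explicit vector instead. The helper name `latest_by_name` and the vector `bulan` are hypothetical, and the Indonesian month names are an assumption based on how the file names look; swap in whatever names your files actually use.

```r
# Hypothetical helper: return the file whose name carries the latest date.
# month_names must list the 12 month names in calendar order.
latest_by_name <- function(files, month_names) {
  pattern <- "^.*?([0-9]{1,2}) ([^0-9 ]+) ([0-9]{4})\\.xlsx$"
  base <- basename(files)
  ok <- grepl(pattern, base, perl = TRUE)   # ignore names without a date
  day   <- as.integer(sub(pattern, "\\1", base[ok], perl = TRUE))
  month <- match(sub(pattern, "\\2", base[ok], perl = TRUE), month_names)
  year  <- as.integer(sub(pattern, "\\3", base[ok], perl = TRUE))
  # Unknown month names become NA dates and are skipped by which.max()
  dates <- as.Date(sprintf("%04d-%02d-%02d", year, month, day),
                   format = "%Y-%m-%d")
  files[ok][which.max(as.numeric(dates))]
}

# Assumption: Indonesian month names
bulan <- c("Januari", "Februari", "Maret", "April", "Mei", "Juni", "Juli",
           "Agustus", "September", "Oktober", "November", "Desember")
x <- c("Onhand Harian 12 Juli 2019.xlsx",
       "Onhand Harian 13 Juli 2019.xlsx",
       "Onhand Harian 14 Juli 2019.xlsx")
latest_by_name(x, bulan)
# "Onhand Harian 14 Juli 2019.xlsx"
```

Passing the result of `list.files(..., full.names = TRUE)` works the same way, since only the base name is parsed and the full path is returned.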
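Also, if you generate a file every day, you could apparently use file.info to select files
by their creation or modification time instead. See this post for details.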
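A minimal sketch of the file.info idea, assuming the newest file is simply the one that was modified last. The helper name `latest_by_mtime` is hypothetical. Keep in mind that the modification time reflects when the file was last written, not the date in its name, so copying or re-saving old files can change the result.

```r
# Hypothetical helper: newest .xlsx file in a folder by modification time
latest_by_mtime <- function(folder) {
  files <- list.files(folder, pattern = "\\.xlsx$", full.names = TRUE)
  if (length(files) == 0) return(character(0))
  # file.info() returns one row per file; mtime is a POSIXct column
  files[which.max(as.numeric(file.info(files)$mtime))]
}

# Usage, with the placeholder path from the answer above:
# file_to_read <- latest_by_mtime("/path/to/folder/")
```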