Question

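I have 100 years of monthly data, with one file per month. Each file name ends with the year and month of the data it contains.

e.g. "cru_ts_3_10.1901.2009.pet_1901_1.asc" is the file for the first month (January) of 1901.

The problem is that when I list my files, they come back in the wrong order: months 10, 11 and 12 appear right after month 1: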

files <- list.files(pattern = "\\.asc$")  # match only names that end in ".asc"
head(files)

[1] "cru_ts_3_10.1901.2009.pet_1901_1.asc"  "cru_ts_3_10.1901.2009.pet_1901_10.asc" "cru_ts_3_10.1901.2009.pet_1901_11.asc"
[4] "cru_ts_3_10.1901.2009.pet_1901_12.asc" "cru_ts_3_10.1901.2009.pet_1901_2.asc"  "cru_ts_3_10.1901.2009.pet_1901_3.asc"

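I can see why this happens (the names are sorted as text, so "10" comes before "2"), but how can I import my data in the correct month order?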

Answer 1

files <- c("cru_ts_3_10.1901.2009.pet_1901_1.asc", 
           "cru_ts_3_10.1901.2009.pet_1901_10.asc", 
           "cru_ts_3_10.1901.2009.pet_1901_11.asc", 
           "cru_ts_3_10.1901.2009.pet_1901_12.asc", 
           "cru_ts_3_10.1901.2009.pet_1901_2.asc", 
           "cru_ts_3_10.1901.2009.pet_1901_3.asc",
           "cru_ts_3_10.1901.2009.pet_1902_1.asc",
           "cru_ts_3_10.1901.2009.pet_1902_10.asc", 
           "cru_ts_3_10.1901.2009.pet_1902_11.asc")

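This splits each name on underscores and takes the last part (e.g. "1.asc"), then uses sub to remove ".asc". It converts what is left to a number and uses sprintf to pad it to a two-character (two-digit) string. Finally, it pastes the year and the padded month together into a single number and orders the files by that number.

files[order(sapply(strsplit(files, "_"), function(x) {
    # x[length(x)] is the last piece, e.g. "1.asc"; pad the month to two digits ("01")
    m <- sprintf("%02d", as.numeric(sub(".asc", "", x[length(x)], fixed = TRUE)))
    # paste year and padded month into one sortable number, e.g. 190101
    as.numeric(paste(x[length(x) - 1], m, sep = ""))
}))]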

Returns:

[1] "cru_ts_3_10.1901.2009.pet_1901_1.asc" 
[2] "cru_ts_3_10.1901.2009.pet_1901_2.asc" 
[3] "cru_ts_3_10.1901.2009.pet_1901_3.asc" 
[4] "cru_ts_3_10.1901.2009.pet_1901_10.asc"
[5] "cru_ts_3_10.1901.2009.pet_1901_11.asc"
[6] "cru_ts_3_10.1901.2009.pet_1901_12.asc"
[7] "cru_ts_3_10.1901.2009.pet_1902_1.asc" 
[8] "cru_ts_3_10.1901.2009.pet_1902_10.asc"
[9] "cru_ts_3_10.1901.2009.pet_1902_11.asc"
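To make the sort key concrete, here is a minimal base-R trace of the steps described above for a single file name. The variable names (parts, month_raw, key) are only illustrative and do not appear in the answer itself.

```r
# Trace the sort key for one file name, using base R only.
f <- "cru_ts_3_10.1901.2009.pet_1901_2.asc"

# Split on underscores; the last two pieces hold the year and "month.asc".
parts <- strsplit(f, "_")[[1]]

# Drop the literal ".asc" suffix from the last piece to get the month.
month_raw <- sub(".asc", "", parts[length(parts)], fixed = TRUE)

# Pad the month to two digits so that 2 sorts before 10.
month <- sprintf("%02d", as.integer(month_raw))

# The year is the second-to-last piece.
year <- parts[length(parts) - 1]

# Combine year and padded month into one number that sorts chronologically.
key <- as.numeric(paste0(year, month))
print(key)
```

Because February becomes 190102 and October becomes 190110, ordering by this key puts the months in calendar order within each year.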

Answer 2

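Another solution, based on a regex. It works by extracting the year and month from each file name to build an actual date, then uses the order of those dates to index the file list.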

pat <- "^.*pet_([0-9]{1,})_([0-9]{1,})\\.asc$"
ord_files <- as.Date(gsub(pat, sprintf("%s-%s-01", "\\1", "\\2"), files))
files[order(ord_files)]

Explanation

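We use a regular expression to match the year and the month in the file name: in the replacement string, \\1 refers to the captured year and \\2 to the captured month. The result still has to be converted to a date. The call sprintf("%s-%s-01", "\\1", "\\2") substitutes the two backreferences for the %s placeholders, producing a string such as "1901-1-01", and as.Date is needed to convert that string to a date.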
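The intermediate values can be inspected with a small sketch like the one below. It assumes a two-file sample vector, writes the replacement string directly instead of building it with sprintf, and passes an explicit format to as.Date; those choices are for illustration and are not part of the answer.

```r
# Two sample names, deliberately given in the "wrong" (text-sorted) order.
files <- c("cru_ts_3_10.1901.2009.pet_1901_10.asc",
           "cru_ts_3_10.1901.2009.pet_1901_2.asc")

# Capture the year (group 1) and the month (group 2) from the end of the name.
pat <- "^.*pet_([0-9]+)_([0-9]+)\\.asc$"

# Rewrite each name as "year-month-01"; each name matches the pattern once.
date_strings <- sub(pat, "\\1-\\2-01", files)
print(date_strings)

# Convert the strings to Date objects so they compare chronologically.
dates <- as.Date(date_strings, format = "%Y-%m-%d")

# Index the original names by the order of their dates.
sorted <- files[order(dates)]
print(sorted)
```

Sorting on Date values rather than on strings is what lets month 2 come before month 10 without any zero padding.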

Answer 3

Look at the mixedsort function in the gtools package.
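A minimal usage sketch, assuming the gtools package is installed (the requireNamespace guard and the sample vector are additions for illustration). mixedsort orders strings that mix text and numbers by comparing the embedded numbers numerically, which is the behaviour the question asks for.

```r
# Sample names in plain text-sorted order (month 10 before month 2).
files <- c("cru_ts_3_10.1901.2009.pet_1901_1.asc",
           "cru_ts_3_10.1901.2009.pet_1901_10.asc",
           "cru_ts_3_10.1901.2009.pet_1901_2.asc",
           "cru_ts_3_10.1901.2009.pet_1902_1.asc")

# Only run if gtools is available; install.packages("gtools") otherwise.
if (requireNamespace("gtools", quietly = TRUE)) {
  # Numeric runs inside the names are compared as numbers, not as text.
  sorted <- gtools::mixedsort(files)
  print(sorted)
}
```

It is worth checking the result on a few real file names first, since the prefix of these names also contains digits and dots that mixedsort will treat as numbers.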
