I have a simple question about the list.files() function. I have a folder containing a list of files named like this:
DF2.txt
DF3.txt
DF4.txt
DF5.txt
.......
.......
When I run the following line,
files <- list.files(pattern = ".txt")
the returned vector is in this order:
"DF10.txt"
"DF11.txt"
"DF12.txt"
........
........
"DF2.txt"
"DF20.txt"
"DF21.txt"
.........
.........
"DF3.txt"
"DF30.txt"
"DF31.txt"
..........
..........
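and so on. I would like the files listed in increasing numeric order, as they appear in the folder. Why does R change the order of the files in the folder after list.files(), and how can I rearrange them to match the original order?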
Answer 0 (score: 19)
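As far as the computer is concerned, the files are sorted correctly: the names are compared as character strings, not as numbers. However, you can use mixedsort from the "gtools" package to get the ordering you want: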
> myFiles <- paste("file", 1:20, ".txt", sep = "")
> sort(myFiles)
[1] "file10.txt" "file11.txt" "file12.txt" "file13.txt" "file14.txt" "file15.txt"
[7] "file16.txt" "file17.txt" "file18.txt" "file19.txt" "file1.txt" "file20.txt"
[13] "file2.txt" "file3.txt" "file4.txt" "file5.txt" "file6.txt" "file7.txt"
[19] "file8.txt" "file9.txt"
> library(gtools)
> mixedsort(sort(myFiles))
[1] "file1.txt" "file2.txt" "file3.txt" "file4.txt" "file5.txt" "file6.txt"
[7] "file7.txt" "file8.txt" "file9.txt" "file10.txt" "file11.txt" "file12.txt"
[13] "file13.txt" "file14.txt" "file15.txt" "file16.txt" "file17.txt" "file18.txt"
[19] "file19.txt" "file20.txt"
Applied to your example, this means you can do:
files <- list.files(pattern = ".txt")
library(gtools)
files <- mixedsort(files)
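Since small utility functions are easy to write, you could also wrap this in one:
ListFiles <- function(pattern = ".txt") {
  # Fail with a clear error if gtools is not installed
  stopifnot(requireNamespace("gtools", quietly = TRUE))
  myFiles <- list.files(pattern = pattern)
  # Sort so that embedded numbers are compared numerically
  gtools::mixedsort(myFiles)
}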
Then compare:
list.files(pattern = ".txt")
ListFiles(pattern = ".txt")
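To see why the default order counts as "correct" for the computer, here is a minimal sketch with made-up file names. It assumes plain character-by-character comparison, which is forced here with method = "radix" (C-locale ordering). Other locales can place some names differently, as the two sort() outputs in this thread show.

```r
# Hypothetical file names, for illustration only
x <- c("DF2.txt", "DF10.txt", "DF1.txt", "DF20.txt", "DF3.txt")

# method = "radix" sorts character vectors in the C locale,
# i.e. byte by byte, independent of the session's locale
by_string <- sort(x, method = "radix")
print(by_string)

# "DF10.txt" comes before "DF2.txt" because the comparison stops at the
# first differing character, and "1" sorts before "2"
first_diff <- c(substr("DF10.txt", 3, 3), substr("DF2.txt", 3, 3))
print(first_diff)
```

The strings are never interpreted as numbers, so "10" lands between "1" and "2".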
Answer 1 (score: 8)
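The names are sorted alphabetically, as character strings, so the digits are not compared as numbers. For a base R approach, you can do the following: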
dat = sort(paste("DF", 1:100, ".txt", sep = ""))
numbers = as.numeric(regmatches(dat, regexpr("[0-9]+", dat)))
dat[order(numbers)]
[1] "DF1.txt" "DF2.txt" "DF3.txt" "DF4.txt" "DF5.txt" "DF6.txt"
[7] "DF7.txt" "DF8.txt" "DF9.txt" "DF10.txt" "DF11.txt" "DF12.txt"
[13] "DF13.txt" "DF14.txt" "DF15.txt" "DF16.txt" "DF17.txt" "DF18.txt"
[19] "DF19.txt" "DF20.txt" "DF21.txt" "DF22.txt" "DF23.txt" "DF24.txt"
[25] "DF25.txt" "DF26.txt" "DF27.txt" "DF28.txt" "DF29.txt" "DF30.txt"
[31] "DF31.txt" "DF32.txt" "DF33.txt" "DF34.txt" "DF35.txt" "DF36.txt"
[37] "DF37.txt" "DF38.txt" "DF39.txt" "DF40.txt" "DF41.txt" "DF42.txt"
[43] "DF43.txt" "DF44.txt" "DF45.txt" "DF46.txt" "DF47.txt" "DF48.txt"
[49] "DF49.txt" "DF50.txt" "DF51.txt" "DF52.txt" "DF53.txt" "DF54.txt"
[55] "DF55.txt" "DF56.txt" "DF57.txt" "DF58.txt" "DF59.txt" "DF60.txt"
[61] "DF61.txt" "DF62.txt" "DF63.txt" "DF64.txt" "DF65.txt" "DF66.txt"
[67] "DF67.txt" "DF68.txt" "DF69.txt" "DF70.txt" "DF71.txt" "DF72.txt"
[73] "DF73.txt" "DF74.txt" "DF75.txt" "DF76.txt" "DF77.txt" "DF78.txt"
[79] "DF79.txt" "DF80.txt" "DF81.txt" "DF82.txt" "DF83.txt" "DF84.txt"
[85] "DF85.txt" "DF86.txt" "DF87.txt" "DF88.txt" "DF89.txt" "DF90.txt"
[91] "DF91.txt" "DF92.txt" "DF93.txt" "DF94.txt" "DF95.txt" "DF96.txt"
[97] "DF97.txt" "DF98.txt" "DF99.txt" "DF100.txt"
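The same idea can be wrapped in a small base R helper. This is a sketch, not part of the original answer: the name sort_by_number is hypothetical, it sorts by the first run of digits in each file name, and it assumes that names without any digits should go last.

```r
# Hypothetical helper: order file names by the first run of digits they contain
sort_by_number <- function(files) {
  base <- basename(files)                  # ignore directory parts, if any
  m <- regexpr("[0-9]+", base)             # start of first digit run, -1 if none
  has_num <- as.integer(m) != -1L
  numbers <- rep(NA_real_, length(files))
  # regmatches() returns matches only for the elements that matched
  numbers[has_num] <- as.numeric(regmatches(base, m))
  # NA (no digits) goes last; ties keep their original order
  files[order(numbers, na.last = TRUE)]
}

# Made-up input, for illustration only
dat <- c("DF10.txt", "DF2.txt", "notes.txt", "DF1.txt", "DF100.txt")
result <- sort_by_number(dat)
print(result)
```

With real files this might be called as sort_by_number(list.files(pattern = "\\.txt$")). The anchored pattern is a suggestion: pattern is a regular expression, so an unescaped "." matches any character.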
Answer 2 (score: 5)
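Or, if you want to stay within non-heretical bounds (that is, base R without extra packages), you can use a plain regular expression.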
> x <- paste("file", 1:20, ".txt", sep = "")
> sort(x)
[1] "file1.txt" "file10.txt" "file11.txt" "file12.txt" "file13.txt" "file14.txt" "file15.txt" "file16.txt" "file17.txt"
[10] "file18.txt" "file19.txt" "file2.txt" "file20.txt" "file3.txt" "file4.txt" "file5.txt" "file6.txt" "file7.txt"
[19] "file8.txt" "file9.txt"
> num.sort <- as.numeric(gsub("\\D+", "", x))  # strip non-digits, keep the number
> x[order(num.sort)]  # order() returns the permutation needed for indexing
[1] "file1.txt" "file2.txt" "file3.txt" "file4.txt" "file5.txt" "file6.txt" "file7.txt" "file8.txt" "file9.txt"
[10] "file10.txt" "file11.txt" "file12.txt" "file13.txt" "file14.txt" "file15.txt" "file16.txt" "file17.txt" "file18.txt"
[19] "file19.txt" "file20.txt"
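One caveat about the indexing step, shown as a sketch on a deliberately shuffled vector (the input is made up). Indexing with sort() of the extracted numbers only works when the names are already in numeric order and numbered 1 to n without gaps. order() returns the permutation that actually rearranges them.

```r
# Made-up input, deliberately not in numeric order
x <- c("file10.txt", "file2.txt", "file33.txt", "file1.txt")
num <- as.numeric(gsub("\\D+", "", x))   # 10, 2, 33, 1

# order() gives positions: which element comes 1st, 2nd, ...
by_order <- x[order(num)]
print(by_order)

# sort() gives the sorted values (1, 2, 10, 33); used as indices they
# pick the wrong elements and run past the end of x (yielding NA)
by_sort <- x[sort(num)]
print(by_sort)
```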
Answer 3 (score: 0)
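This is a natural sort versus alphabetical sort problem. For me, the package named naturalsort works best for showing file names in a human-readable order.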
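A minimal sketch of that approach, assuming the naturalsort package from CRAN is installed and using made-up file names. So that the snippet still runs without the package, it falls back to a base R ordering on the extracted number. That fallback is an addition for illustration, not part of this answer.

```r
# Made-up file names, for illustration only
files <- c("DF10.txt", "DF2.txt", "DF1.txt", "DF20.txt", "DF3.txt")

if (requireNamespace("naturalsort", quietly = TRUE)) {
  # naturalsort() compares embedded numbers numerically
  sorted <- naturalsort::naturalsort(files)
} else {
  # Fallback (assumption): order by the digits in each name
  sorted <- files[order(as.numeric(gsub("\\D+", "", files)))]
}
print(sorted)
```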