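Sorting files in R: how do I make the results consistent?

Time: 2015-08-29 22:28:00

Tags: r file sorting

When I use list.files, I get different sort orders on different computers. How can I make sure I always get the second behavior, which sorts the padded numbers the way a human would?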

Computer 1 (Debian)

$ uname -a
Linux work 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt9-2 (2015-04-13) x86_64 GNU/Linux
$ touch 02 10 _2
$ R -e "list.files()"

R version 3.1.1 (2014-07-10) -- "Sock it to Me"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> list.files()
[1] "02" "10" "_2"

Computer 2 (SUSE)

> uname -a
Linux efrc3 3.16.7-21-desktop #1 SMP PREEMPT Tue Apr 14 07:11:37 UTC 2015 (93c1539) x86_64 x86_64 x86_64 GNU/Linux
tug04419@efrc3:~/temp> touch 02 10 _2
tug04419@efrc3:~/temp> R -e "list.files()"
R version 3.1.1 (2014-07-10) -- "Sock it to Me"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-suse-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> list.files()
[1] "_2" "02" "10"

2 answers:

Answer 0: (score: 3)

I have a few different Linux and OS X systems. Some have various locale settings, and some have ICU capabilities in R while others do not. To get consistent results (I'll show two systems here), I had to disable locale-specific collation with Sys.setlocale("LC_COLLATE", "C"):

# osx01
Rscript -e 'system("touch 02 10 _2") ; list.files() ; Sys.setlocale("LC_COLLATE", "C") ; list.files()'
[1] "02" "10" "_2"
[1] "C"
[1] "02" "10" "_2"

# linux02
Rscript -e 'system("touch 02 10 _2") ; list.files() ; Sys.setlocale("LC_COLLATE", "C") ; list.files()'
[1] "_2" "02" "10"
[1] "C"
[1] "02" "10" "_2"

The [1] "C" in the output is just the return value of Sys.setlocale() being printed; it is not a result of list.files().

You can also set LC_COLLATE system-wide, or in the shell script that calls Rscript.
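If changing the collation for the whole session is undesirable, one option is to switch it only around the call. The following is a minimal sketch, not something from the answer itself: list_files_collated is a hypothetical helper name, and the scratch directory exists only for the demo. It assumes that the previous LC_COLLATE value reported by Sys.getlocale() can be set again on the same machine.

```r
# Hypothetical helper (not part of base R): list files under a chosen
# collation, then restore the previous LC_COLLATE setting on exit.
list_files_collated <- function(path = ".", collate = "C", ...) {
  old <- Sys.getlocale("LC_COLLATE")
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
  Sys.setlocale("LC_COLLATE", collate)
  list.files(path, ...)
}

# Demo in a scratch directory, using the same file names as the question.
demo_dir <- tempfile("collate-demo-")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, c("02", "10", "_2"))))

before  <- Sys.getlocale("LC_COLLATE")
files_c <- list_files_collated(demo_dir)
after   <- Sys.getlocale("LC_COLLATE")

print(files_c)                 # C collation, as in the outputs above
print(identical(before, after)) # the session collation is left unchanged
```

Note that this yields the C ordering ("02" "10" "_2") on every machine, which is consistent but is the first behavior from the question, not the second.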

Answer 1: (score: 1)

It looks like this depends on the system. The documentation says:

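  The files are sorted in alphabetical order, on the full path if full.names = TRUE.

  list.dirs implicitly has all.files = TRUE, and if recursive = TRUE, the answer includes path itself (provided it is a readable directory).

  Note

  File naming conventions are platform dependent. The pattern matching works with the case of file names as returned by the OS.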
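Since "alphabetical order" here depends on the platform's collation, another way to get the same order everywhere is to re-sort the result yourself with a method that ignores the locale. This is a sketch under one assumption: it needs a newer R than the 3.1.1 shown in the question, because sort(method = "radix") was added in R 3.3.0. For character vectors that method uses C-locale ordering regardless of LC_COLLATE.

```r
# Names as list.files() might return them on either machine.
files <- c("_2", "02", "10")

# Re-sort with the radix method, which does not respect the locale
# for character vectors (requires R >= 3.3.0).
files_sorted <- sort(files, method = "radix")

print(files_sorted)
```

In practice this would be written as sort(list.files(), method = "radix"). Like the LC_COLLATE approach, it gives the C ordering, with "_2" after the digits.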