En_US和en_US之间的区别?

时间:2012-12-30 05:03:38

标签: linux sorting lang

今天我发现了一个用linux sort命令排序文件的问题。当我设置env LANG = En_US时,结果就是我所期望的。但是当LANG = en_US时,结果很奇怪。 我运行的一些命令和输出如下:

[work@xx:/data1/muce_temp/datamarts/reduce_result_file/302/1d/201212260000]$ cat dd.dat                 
23 340_guard    16                                                                                                        
23 340_guard    17                                                                                                        
23 340_guard    18                                                                                                        
23 360_guard... 16                                                                                                      
23 360_guard    16                                                                                                        
23 360_guard... 17                                                                                                      
23 360_guard... 18              

[work@xx:/data1/muce_temp/datamarts/reduce_result_file/302/1d/201212260000]$ LANG=En_US sort dd.dat     
23 340_guard    16                                                                                                        
23 340_guard    17                                                                                                        
23 340_guard    18                                                                                                        
23 360_guard    16                                                                                                        
23 360_guard... 16                                                                                                      
23 360_guard... 17                                                                                                      
23 360_guard... 18                                 

[work@xx:/data1/muce_temp/datamarts/reduce_result_file/302/1d/201212260000]$ LANG=en_US sort dd.dat     
23 340_guard    16                                                                                                        
23 340_guard    17                                                                                                        
23 340_guard    18                                                                                                        
23 360_guard... 16                                                                                                      
23 360_guard    16          (why this line appear here ? )                                                                                      
23 360_guard... 17                                                                                                      
23 360_guard... 18      

此文件中行的格式详细信息如下:

2^E3^F360_guard^E...^I16^Ee^E17/18^I63776769$
2^E3^F360_guard^E^I16^Ee^E17/18^I63776769$
2^E3^F360_guard^E...^I17^Ei^E0^I63776771$
2^E3^F360_guard^E...^I18^Ei^E1^I63776773$

^ E是'\ x05',^ F是'\ x06',^我是标签,$是'\ n'。

提前致谢。

2 个答案:

答案 0 :(得分:0)

en_US调用更智能的排序算法,忽略那些在排序时通常会被忽略的点串。它显然区分大小写,因此En_US正在回归默认语言(可能是C)。

答案 1 :(得分:0)

“en_US”是“Language = English,locale = United States”的“正确”值。其他语言包括“en_GB”(英国),“en_CA”(加拿大)和en_AU(澳大利亚):

我得到了这些结果:

echo $LANG;sort tmp.txt
en_US.UTF-8
23 340_guard    16
23 340_guard    17
23 340_guard    18
23 360_guard    16
23 360_guard... 16
23 360_guard... 17
23 360_guard... 18

export LANG=en_US;echo $LANG;sort tmp.txt
en_US
23 340_guard    16
23 340_guard    17
23 340_guard    18
23 360_guard    16
23 360_guard... 16
23 360_guard... 17
23 360_guard... 18

export LANG=En_US;echo $LANG;sort tmp.txt
En_US
23 340_guard    16
23 340_guard    17
23 340_guard    18
23 360_guard    16
23 360_guard... 16
23 360_guard... 17
23 360_guard... 18

 export LANG=abc-silly;echo $LANG;sort tmp.txt
abc-silly
23 340_guard    16
23 340_guard    17
23 340_guard    18
23 360_guard    16
23 360_guard... 16
23 360_guard... 17
23 360_guard... 18