Bash locale-sensitive globbing on OS X

Date: 2015-08-04 16:30:29

Tags: macos bash locale

The scenario, on both Linux (CentOS) and OS X (Yosemite):

$ touch A B C X Y Z a b c x y z

$ locale
LC_ALL=en_GB.UTF-8

On OS X I am using a case-sensitive file system.

Bash 4.1.2 on Linux:

$ echo [A-Z]
A b B c C x X y Y z Z

This is the expected output: it follows the LC_COLLATE ordering of this locale.

Bash 4.3.39 on OS X:

$ echo [A-Z]
A B C X Y Z

This looks the same as the result for the ANSI, C, or POSIX locale. So on OS X it appears that globbing ignores the locale.
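
One way to check that impression is to run the same glob twice in a scratch directory, once under the current locale and once with LC_ALL forced to C. The following is only a minimal sketch: compare_ranges is a hypothetical helper name, and it assumes mktemp -d is available and that the scratch directory is on a case-sensitive file system.

```shell
#!/usr/bin/env bash
# Sketch: compare what [A-Z] matches under the current locale and under C.
# compare_ranges is a hypothetical helper name, not part of bash.
compare_ranges() {
    local dir
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        touch A B C X Y Z a b c x y z
        echo "current: $(echo [A-Z])"
        # Assigning LC_ALL inside this subshell changes the locale used for
        # later pattern matching without affecting the calling shell.
        LC_ALL=C
        echo "C: $(echo [A-Z])"
    )
    rm -rf "$dir"
}

compare_ranges
```

If the two lines are identical, bracket ranges are effectively being evaluated in C-locale order on that system.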

Why the inconsistency? Is there a way to get locale-sensitive results on OS X?

Edit: on OS X, LC_ALL is set explicitly in .bash_profile with:

export LC_ALL=en_GB.UTF-8

Locale on OS X:

bash-4.3$ locale
LANG="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_CTYPE="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_ALL="en_GB.UTF-8"

Locale on Linux:

$ locale
LANG=en_US.utf8
LC_CTYPE="en_GB.utf8"
LC_NUMERIC="en_GB.utf8"
LC_TIME="en_GB.utf8"
LC_COLLATE="en_GB.utf8"
LC_MONETARY="en_GB.utf8"
LC_MESSAGES="en_GB.utf8"
LC_PAPER="en_GB.utf8"
LC_NAME="en_GB.utf8"
LC_ADDRESS="en_GB.utf8"
LC_TELEPHONE="en_GB.utf8"
LC_MEASUREMENT="en_GB.utf8"
LC_IDENTIFICATION="en_GB.utf8"
LC_ALL=en_GB.utf8

Edit

The file system format is Mac OS Extended (Case-sensitive, Journaled).

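@chepner suggested that I might not actually have separate files on OS X. Two things show that I do: the inodes are different, and the glob construct ? returns all of the files, both uppercase and lowercase: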

bash-4.3$ ls -li ?
559 -rw-r--r--  1 clivedarke  staff  0 Aug  4 17:23 A
560 -rw-r--r--  1 clivedarke  staff  0 Aug  4 17:23 B
561 -rw-r--r--  1 clivedarke  staff  0 Aug  4 17:23 C
562 -rw-r--r--  1 clivedarke  staff  0 Aug  4 17:23 X
563 -rw-r--r--  1 clivedarke  staff  0 Aug  4 17:23 Y
564 -rw-r--r--  1 clivedarke  staff  0 Aug  4 17:23 Z
565 -rw-r--r--  1 clivedarke  staff  0 Aug  4 17:23 a
566 -rw-r--r--  1 clivedarke  staff  0 Aug  4 17:23 b
567 -rw-r--r--  1 clivedarke  staff  0 Aug  4 17:23 c
568 -rw-r--r--  1 clivedarke  staff  0 Aug  4 17:23 x
569 -rw-r--r--  1 clivedarke  staff  0 Aug  4 17:23 y
570 -rw-r--r--  1 clivedarke  staff  0 Aug  4 17:23 z

In addition:

bash-4.3$ echo 'lower' > a
bash-4.3$ echo 'upper' > A
bash-4.3$ cat a
lower
bash-4.3$ cat A
upper

bash-4.3$ diff a A
1c1
< lower
---
> upper

1 answer:

Answer 0 (score: 2)

Digging through the source code, I found the following comment in bash-4.3/lib/glob/smatch.c:

/* We use strcoll(3) for range comparisons in bracket expressions,
   even though it can have unwanted side effects in locales
   other than POSIX or US.  For instance, in the de locale, [A-Z] matches
   all characters.  If GLOB_ASCIIRANGE is non-zero, and we're not forcing
   the use of strcoll (e.g., for explicit collating symbols), we use
   straight ordering as if in the C locale. */

And in configure I found:

--enable-glob-asciiranges-default
                      force bracket range expressions in pattern matching
                      to use the C locale by default

So my guess is that this is set at compile time. More digging is needed.
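
Bash 4.3 also has a runtime shell option, globasciiranges, which appears to be the switch whose default the configure flag controls. The sketch below shows one way to inspect and clear it. It assumes bash 4.3 or later, and report_range_mode is a hypothetical helper name used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: inspect and toggle the runtime switch for ASCII-only ranges.
# Assumes bash >= 4.3, where the globasciiranges shell option exists.

# report_range_mode is a hypothetical helper name, not part of bash.
report_range_mode() {
    if shopt -q globasciiranges 2>/dev/null; then
        echo "ascii"     # ranges such as [A-Z] compare as if in the C locale
    else
        echo "locale"    # ranges follow the locale's collation via strcoll(3)
    fi
}

report_range_mode

# Ask for locale-sensitive ranges; ignore the error on shells without the option.
shopt -u globasciiranges 2>/dev/null || true
report_range_mode
```

After the option is cleared, echo [A-Z] should go through strcoll(3) again. Whether the output then matches the Linux result still depends on the collation data the platform ships for en_GB.UTF-8, which I have not verified on OS X.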