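I have a string:
$string = 'AbCdEf';
I'd like to use tr to convert all uppercase letters to lowercase and all lowercase letters to uppercase, both at the same time. Basically, I just want to flip the case so that it becomes: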
aBcDeF
I came up with this line, but I'm not sure how to modify it to do what I want. Any help?
$string =~ tr/A-Z/a-z/;
Thanks!
Answer 0 (score: 14)
As Tom requested, a Unicode-clean (or locale-clean) version:
s/([[:upper:]])|([[:lower:]])/defined $1 ? lc $1 : uc $2/eg
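For illustration, here is a minimal runnable sketch of that substitution applied to a string that mixes ASCII and Greek letters. The sample string is an assumption chosen for the demo, not part of the original answer; it contains code points above U+00FF so that Perl applies Unicode rules to the POSIX classes.

```perl
use strict;
use warnings;

# Hypothetical sample: "Ab" followed by GREEK CAPITAL GAMMA and GREEK SMALL DELTA.
my $string = "Ab\x{0393}\x{03B4}";

# Capture group 1 holds an uppercase letter, group 2 a lowercase one;
# whichever group matched decides which case function is applied.
$string =~ s/([[:upper:]])|([[:lower:]])/defined $1 ? lc $1 : uc $2/eg;

binmode STDOUT, ':encoding(UTF-8)';    # avoid "Wide character" warnings
print "$string\n";                     # "aB" + small gamma + capital delta
```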
Answer 1 (score: 11)
$string =~ tr/A-Za-z/a-zA-Z/;
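As a quick check, the following sketch runs that transliteration on the string from the question. The variable name $swapped is only illustrative. Note that this approach covers ASCII letters only.

```perl
use strict;
use warnings;

my $string = 'AbCdEf';

# Both ranges appear on each side, so every ASCII letter is mapped to its
# opposite-case partner in a single pass over the string.
(my $swapped = $string) =~ tr/A-Za-z/a-zA-Z/;

print "$swapped\n";    # aBcDeF
```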
Answer 2 (score: 7)
You can do a full Unicode solution this way:
s/ (\p{CWU}) | (\p{CWL}) /defined $1 ? uc $1 : lc $2/gex;
or this way:
s/ (\p{CWL}) | (\p{CWU}) /defined $1 ? lc $1 : uc $2/gex;
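depending on what you want to happen to code points that change case in both directions, such as ǲ (U+01F2), whose uppercase is Ǳ (U+01F1) and whose lowercase is ǳ (U+01F3).
If you run the second of those two substitutions on this input: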
@ 0040 COMMERCIAL AT
© 00A9 COPYRIGHT SIGN
Å 212B ANGSTROM SIGN
⒜ 249C PARENTHESIZED LATIN SMALL LETTER A
Ⓐ 24B6 CIRCLED LATIN CAPITAL LETTER A
ⓐ 24D0 CIRCLED LATIN SMALL LETTER A
Ａ FF21 FULLWIDTH LATIN CAPITAL LETTER A
ａ FF41 FULLWIDTH LATIN SMALL LETTER A
Ⓒ 24B8 CIRCLED LATIN CAPITAL LETTER C
ⓒ 24D2 CIRCLED LATIN SMALL LETTER C
Ǳ 01F1 LATIN CAPITAL LETTER DZ
ǲ 01F2 LATIN CAPITAL LETTER D WITH SMALL LETTER Z
ǳ 01F3 LATIN SMALL LETTER DZ
ⅲ 2172 SMALL ROMAN NUMERAL THREE
S 0053 LATIN CAPITAL LETTER S
s 0073 LATIN SMALL LETTER S
ſ 017F LATIN SMALL LETTER LONG S
⒮ 24AE PARENTHESIZED LATIN SMALL LETTER S
Ⓢ 24C8 CIRCLED LATIN CAPITAL LETTER S
ⓢ 24E2 CIRCLED LATIN SMALL LETTER S
Ꞅ A784 LATIN CAPITAL LETTER INSULAR S
ꞅ A785 LATIN SMALL LETTER INSULAR S
ß 00DF LATIN SMALL LETTER SHARP S
ẞ 1E9E LATIN CAPITAL LETTER SHARP S
Ⅶ 2166 ROMAN NUMERAL SEVEN
ⅻ 217B SMALL ROMAN NUMERAL TWELVE
it produces these results:
@ 0040 commercial at
© 00a9 copyright sign
å 212b angstrom sign
⒜ 249c parenthesized latin small letter a
ⓐ 24b6 circled latin capital letter a
Ⓐ 24d0 circled latin small letter a
ａ ff21 fullwidth latin capital letter a
Ａ ff41 fullwidth latin small letter a
ⓒ 24b8 circled latin capital letter c
Ⓒ 24d2 circled latin small letter c
ǳ 01f1 latin capital letter dz
ǳ 01f2 latin capital letter d with small letter z
Ǳ 01f3 latin small letter dz
Ⅲ 2172 small roman numeral three
s 0053 latin capital letter s
S 0073 latin small letter s
S 017f latin small letter long s
⒮ 24ae parenthesized latin small letter s
ⓢ 24c8 circled latin capital letter s
Ⓢ 24e2 circled latin small letter s
ꞅ a784 latin capital letter insular s
Ꞅ a785 latin small letter insular s
SS 00df latin small letter sharp s
ß 1e9e latin capital letter sharp s
ⅶ 2166 roman numeral seven
Ⅻ 217b small roman numeral twelve
With the first substitution, the only part that differs (within that set) is the DZ digraph sequence, which would look like this:
ǳ 01f1 latin capital letter dz
Ǳ 01f2 latin capital letter d with small letter z
Ǳ 01f3 latin small letter dz
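The difference between the two orderings can be reproduced with a short script. This is only a sketch: the helper codepoints() and the variable names are made up for the demo, and it assumes a Perl recent enough to know the CWU and CWL properties.

```perl
use strict;
use warnings;

# The three DZ digraphs: uppercase, titlecase, lowercase.
my $input = "\x{01F1}\x{01F2}\x{01F3}";

# Hypothetical helper: render a string as a list of U+XXXX code points.
sub codepoints { join ' ', map { sprintf 'U+%04X', ord } split //, shift }

# CWU tested first: the titlecase digraph U+01F2 is uppercased.
(my $cwu_first = $input)
    =~ s/ (\p{CWU}) | (\p{CWL}) /defined $1 ? uc $1 : lc $2/gex;

# CWL tested first: the titlecase digraph U+01F2 is lowercased.
(my $cwl_first = $input)
    =~ s/ (\p{CWL}) | (\p{CWU}) /defined $1 ? lc $1 : uc $2/gex;

print 'CWU first: ', codepoints($cwu_first), "\n";   # U+01F3 U+01F1 U+01F1
print 'CWL first: ', codepoints($cwl_first), "\n";   # U+01F3 U+01F3 U+01F1
```

Only the middle code point differs, because U+01F2 is the one character here that matches both properties.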
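The reason you don't want to test merely for upper or lower is that you would be doing unnecessary work: there are plenty of cased code points that do not change case when casemapped. For example, all of these are cased code points, yet none of them changes when uppercased or lowercased: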
ª 00AA FEMININE ORDINAL INDICATOR
ᴬ 1D2C MODIFIER LETTER CAPITAL A
ᴀ 1D00 LATIN LETTER SMALL CAPITAL A
ℂ 2102 DOUBLE-STRUCK CAPITAL C
ᴰ 1D30 MODIFIER LETTER CAPITAL D
ʣ 02A3 LATIN SMALL LETTER DZ DIGRAPH
ʤ 02A4 LATIN SMALL LETTER DEZH DIGRAPH
ℇ 2107 EULER CONSTANT
ɘ 0258 LATIN SMALL LETTER REVERSED E
ɞ 025E LATIN SMALL LETTER CLOSED REVERSED OPEN E
ℊ 210A SCRIPT SMALL G
ɡ 0261 LATIN SMALL LETTER SCRIPT G
ɢ 0262 LATIN LETTER SMALL CAPITAL G
ʰ 02B0 MODIFIER LETTER SMALL H
ℋ 210B SCRIPT CAPITAL H
ℎ 210E PLANCK CONSTANT
ℹ 2139 INFORMATION SOURCE
ʲ 02B2 MODIFIER LETTER SMALL J
ℳ 2133 SCRIPT CAPITAL M
º 00BA MASCULINE ORDINAL INDICATOR
ɸ 0278 LATIN SMALL LETTER PHI
ĸ 0138 LATIN SMALL LETTER KRA
ʏ 028F LATIN LETTER SMALL CAPITAL Y
ℼ 213C DOUBLE-STRUCK SMALL PI
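So you would detect that they are uppercase or lowercase, call the opposite case-mapping function, and then find that nothing changed. Why bother, I figure?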
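To make that concrete, here is a small sketch that checks four code points from the list above. The choice of sample and the report format are assumptions made for the demo. Each one matches \p{Cased}, yet neither \p{CWU} nor \p{CWL} matches, and uc and lc leave it untouched.

```perl
use strict;
use warnings;

# Four letterlike symbols taken from the list above:
# U+2102, U+210B (uppercase) and U+210E, U+213C (lowercase).
my @sample = map { chr hex } qw(2102 210B 210E 213C);

my @report;
for my $ch (@sample) {
    my $cased     = $ch =~ /\p{Cased}/        ? 1 : 0;   # has a case at all
    my $changes   = $ch =~ /[\p{CWU}\p{CWL}]/ ? 1 : 0;   # a case mapping would alter it
    my $unchanged = (uc($ch) eq $ch && lc($ch) eq $ch) ? 1 : 0;
    push @report, sprintf 'U+%04X cased=%d changes=%d unchanged=%d',
        ord($ch), $cased, $changes, $unchanged;
}

print "$_\n" for @report;
```

A test based on [[:upper:]] or [[:lower:]] would match all four and call a case function for nothing, whereas the CWU/CWL patterns skip them.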