I found this line in a jQuery plugin:
node.data.substr(0, pos).toUpperCase().length - node.data.substr(0, pos).length
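As far as I can tell, this should always be zero, because the only difference between the two operands is toUpperCase(), which should not change the length of a string.
What is going on here?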
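Answer 0: (score: 1)
Some Unicode characters (ligatures in particular) cause trouble when converted to uppercase: they have no single uppercase counterpart, so they may be converted into two or more characters instead.
Code example: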
var lowerChar = '\uFB00';
console.log("lowercase: ", lowerChar, "length: ", lowerChar.length);
var upperChar = lowerChar.toUpperCase();
console.log("uppercase: ", upperChar, "length: ", upperChar.length);
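The plugin's surrounding code is not shown in the question, so the following is only a guess at why such an expression might be useful: an index found in an uppercased copy of a string can drift away from the matching index in the original. The sketch below uses a hypothetical helper, findIgnoreCase, which is not taken from the plugin, to make that drift visible.

```javascript
// Hypothetical helper (not from the plugin): case-insensitive search that
// reports the match index in the uppercased copy and in the original string.
function findIgnoreCase(text, needle) {
  const upperIndex = text.toUpperCase().indexOf(needle.toUpperCase());
  if (upperIndex === -1) {
    return null;
  }
  // Grow the original prefix until its uppercase form reaches upperIndex.
  let originalIndex = 0;
  while (text.slice(0, originalIndex).toUpperCase().length < upperIndex) {
    originalIndex++;
  }
  return { upperIndex, originalIndex, drift: upperIndex - originalIndex };
}

// "straße x": the ß before the match becomes "SS", shifting the match by one.
const result = findIgnoreCase("stra\u00DFe x", "x");
console.log(result); // { upperIndex: 8, originalIndex: 7, drift: 1 }
```

The drift value here is the same quantity as the expression in the question: the number of extra characters that uppercasing adds to the prefix before the position of interest.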
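Answer 1: (score: 0)
Believe it or not, some characters turn into more than one character when converted to uppercase.
For example, there is a group of special characters that each look like two letters. One of them is Latin Small Ligature FI, which exists because of a typographic feature called "ligatures". There are also special characters for "FF", "FL", "FFI" and "FFL". Calling .toUpperCase() turns each of them into 2 or 3 characters.
Another example is the German letter eszett. Although an uppercase version exists and was officially approved by the Council for German Orthography in 2017, all browsers still convert the single letter "ß" into the two letters "SS".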
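As a small illustration of the eszett behaviour, the sketch below compares both directions of the conversion. The variable names are only illustrative.

```javascript
const smallEszett = "\u00DF";   // ß
const capitalEszett = "\u1E9E"; // ẞ, the capital form

const upper = smallEszett.toUpperCase();   // "SS", not "ẞ"
const lower = capitalEszett.toLowerCase(); // "ß"
const roundTrip = upper.toLowerCase();     // "ss", the original ß is lost

console.log(upper, upper.length); // SS 2
console.log(lower, lower.length); // ß 1
console.log(roundTrip);           // ss
```

So the capital letter lowercases to ß, but ß never uppercases to it, and a round trip through uppercase does not restore the original string.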
Test program:
function countUpperDiff(input) {
  console.log((input.toUpperCase().length - input.length) + ": " + input + " → " + input.toUpperCase());
}
console.log("This program prints how many characters longer the uppercase version becomes, followed by a before-and-after view.");
console.log("Some typographic ligatures exist in lowercase, but not uppercase.")
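countUpperDiff("fix");
countUpperDiff("\uFB01x"); // "ﬁx" written with U+FB01 LATIN SMALL LIGATURE FI
countUpperDiff("fly");
countUpperDiff("\uFB02y"); // "ﬂy" written with U+FB02 LATIN SMALL LIGATURE FL
countUpperDiff("off");
countUpperDiff("o\uFB00"); // "oﬀ" written with U+FB00 LATIN SMALL LIGATURE FF
countUpperDiff("affix");
countUpperDiff("a\uFB03x"); // "aﬃx" written with U+FB03 LATIN SMALL LIGATURE FFI
countUpperDiff("raffle");
countUpperDiff("ra\uFB04e"); // "raﬄe" written with U+FB04 LATIN SMALL LIGATURE FFL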
console.log("\nThe German letter ß is NOT converted into the capital letter ẞ, but into SS. This all-caps conversion is a problem for some words:")
countUpperDiff("in massen");
countUpperDiff("in maßen");
console.log("The first one means \"in massive amounts\", while the second one means \"in moderate amounts\".")
Output:
This program prints how many characters longer the uppercase version becomes, followed by a before-and-after view.
Some typographic ligatures exist in lowercase, but not uppercase.
0: fix → FIX
1: ﬁx → FIX
0: fly → FLY
1: ﬂy → FLY
0: off → OFF
1: oﬀ → OFF
0: affix → AFFIX
2: aﬃx → AFFIX
0: raffle → RAFFLE
2: raﬄe → RAFFLE

The German letter ß is NOT converted into the capital letter ẞ, but into SS. This all-caps conversion is a problem for some words:
0: in massen → IN MASSEN
1: in maßen → IN MASSEN
The first one means "in massive amounts", while the second one means "in moderate amounts".
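To see which other characters behave this way, one option is simply to scan a range of code points and keep those whose uppercase form is longer. This is a rough sketch: the function name findExpandingChars is made up for illustration, and the exact set of results depends on the Unicode version built into the JavaScript engine.

```javascript
// Hypothetical helper: lists characters in [from, to] whose uppercase form
// contains more UTF-16 code units than the character itself.
function findExpandingChars(from, to) {
  const found = [];
  for (let cp = from; cp <= to; cp++) {
    const ch = String.fromCodePoint(cp);
    const upper = ch.toUpperCase();
    if (upper.length > ch.length) {
      found.push({ codePoint: cp, ch: ch, upper: upper });
    }
  }
  return found;
}

// Scan the Basic Multilingual Plane.
const expanding = findExpandingChars(0x0000, 0xFFFF);
for (const entry of expanding) {
  console.log("U+" + entry.codePoint.toString(16).toUpperCase().padStart(4, "0") +
    ": " + entry.ch + " → " + entry.upper);
}
```

The list includes ß and the f-ligatures discussed above, along with other characters that have no single-character uppercase form.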