Does toUpperCase change .length?

Date: 2018-04-18 09:17:05

Tags: javascript string substring string-length

I found this line in a jQuery plugin:

node.data.substr(0, pos).toUpperCase().length - node.data.substr(0, pos).length

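As far as I can tell, this should always be zero, because the only difference between the two operands is toUpperCase(), which should not change the length of a string.

What is going on here?

2 answers:

Answer 0 (score: 1)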

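Some Unicode characters (ligatures in particular) behave unexpectedly when converted to uppercase: they have no single-character uppercase equivalent, so they may be converted into 2 characters instead.

Code example: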

var lowerChar = '\uFB00';
console.log("lowercase: ", lowerChar, "length: ", lowerChar.length);
var upperChar = lowerChar.toUpperCase();
console.log("uppercase: ", upperChar, "length: ", upperChar.length);
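
To get a feel for how many characters behave this way, here is a minimal sketch that scans the Basic Multilingual Plane for single code units whose uppercase form has a different length. The helper name findExpandingChars is made up for illustration, and the exact list depends on the Unicode version your JavaScript engine implements, so no counts are claimed here.

```javascript
// Hypothetical helper (not part of any library): lists BMP characters
// whose uppercase form differs in length from the single original code unit.
function findExpandingChars() {
  var results = [];
  for (var code = 0; code <= 0xFFFF; code++) {
    // Skip surrogate halves; they are not characters on their own.
    if (code >= 0xD800 && code <= 0xDFFF) continue;
    var ch = String.fromCharCode(code);
    var upper = ch.toUpperCase();
    if (upper.length !== ch.length) {
      results.push({ char: ch, upper: upper });
    }
  }
  return results;
}

var expanding = findExpandingChars();
console.log(expanding.length + " characters change length when uppercased");
// Print a small sample, escaped so that invisible differences are easy to spot.
expanding.slice(0, 10).forEach(function (entry) {
  console.log(JSON.stringify(entry.char) + " -> " + JSON.stringify(entry.upper));
});
```

The list should include both '\uFB00' from the example above and the German 'ß'.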

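Answer 1 (score: 0)

Believe it or not, some characters turn into multiple characters when converted to uppercase.

For example, there is a set of special characters that each look like two letters. One of them is Latin Small Ligature FI, which exists because of a typographic practice called a "ligature". There are also special characters for "FF", "FL", "FFI" and "FFL". Calling .toUpperCase() turns each of them into 2 or 3 characters.

Another example is the German letter eszett. Although an uppercase version exists and was officially adopted by the Council for German Orthography in 2017, all browsers still convert the single letter "ß" into the two letters "SS".

Test program: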

function countUpperDiff(input) {
  console.log((input.toUpperCase().length - input.length) + ": " + input + " → " + input.toUpperCase());
}

console.log("This program prints how many characters longer the uppercase version becomes, followed by a before-and-after view.");

console.log("Some typographic ligatures exist in lowercase, but not uppercase.");
countUpperDiff("fix");
countUpperDiff("ﬁx");
countUpperDiff("fly");
countUpperDiff("ﬂy");
countUpperDiff("off");
countUpperDiff("oﬀ");
countUpperDiff("affix");
countUpperDiff("aﬃx");
countUpperDiff("raffle");
countUpperDiff("raﬄe");

console.log("\nThe German letter ß is NOT converted into the capital letter ẞ, but into SS. This all-caps conversion is a problem for some words:");
countUpperDiff("in massen");
countUpperDiff("in maßen");
console.log("The first one means \"in massive amounts\", while the second one means \"in moderate amounts\".");

Output:

This program prints how many characters longer the uppercase version becomes, followed by a before-and-after view.
Some typographic ligatures exist in lowercase, but not uppercase.
0: fix → FIX
1: ﬁx → FIX
0: fly → FLY
1: ﬂy → FLY
0: off → OFF
1: oﬀ → OFF
0: affix → AFFIX
2: aﬃx → AFFIX
0: raffle → RAFFLE
2: raﬄe → RAFFLE

The German letter ß is NOT converted into the capital letter ẞ, but into SS. This all-caps conversion is a problem for some words:
0: in massen → IN MASSEN
1: in maßen → IN MASSEN
The first one means "in massive amounts", while the second one means "in moderate amounts".
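
As for why a plugin would compute such a difference at all: one plausible reason (an assumption, since the question does not show the surrounding plugin code) is a case-insensitive search performed on an uppercased copy of the text. A match position found in the uppercased copy can be shifted relative to the original string, and the shift has to be undone. A minimal sketch of that idea follows; it is not the plugin's code, and originalIndex is a hypothetical helper that walks the original string instead of using the one-line subtraction from the question.

```javascript
// Hypothetical helper: maps an index in text.toUpperCase() back to the
// corresponding index in text. Assumes uppercasing each UTF-16 code unit
// separately yields the same total length as uppercasing the whole string.
function originalIndex(text, upperIndex) {
  var consumed = 0; // uppercased code units accounted for so far
  for (var i = 0; i < text.length; i++) {
    if (consumed >= upperIndex) return i;
    consumed += text.charAt(i).toUpperCase().length;
  }
  // An index past the end (or inside a trailing expansion) maps to the end.
  return text.length;
}

var text = "\uFB01x abc";                          // starts with the "fi" ligature
var upperPos = text.toUpperCase().indexOf("ABC");  // position in "FIX ABC"
var pos = originalIndex(text, upperPos);           // position in the original
console.log(upperPos, pos, text.substr(pos, 3));
```

Without the mapping, using upperPos directly on the original string would select text one character too far to the right in this example.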