我试图使用Swift 3 CharacterSet
来过滤String
中的字符,但我很早就陷入困境。 CharacterSet有一个名为contains
func包含(_ member:UnicodeScalar) - >布尔
测试CharacterSet中特定UnicodeScalar的成员资格。
但测试这并不能产生预期的行为。
let characterSet = CharacterSet.capitalizedLetters
let capitalAString = "A"
if let capitalA = capitalAString.unicodeScalars.first {
print("Capital A is \(characterSet.contains(capitalA) ? "" : "not ")in the group of capital letters")
} else {
print("Couldn't get the first element of capitalAString's unicode scalars")
}
我得到了Capital A is not in the group of capital letters
但我却期待相反的事情。
非常感谢。
答案 0 :(得分:7)
CharacterSet.capitalizedLetters
返回一个字符集,其中包含Unicode General Category Lt aka“Letter,titlecase”中的字符。那是
“包含大写字母后跟小写字母的连字符(例如,Dž,Lj,Nj和Dz)”(比较Wikipedia: Unicode character property或
Unicode® Standard Annex #44 – Table 12. General_Category Values)。
您可以在此处找到一个列表:Unicode Characters in the 'Letter, Titlecase' Category。
您也可以使用来自的代码 NSArray from NSCharacterset转储角色的内容 设置:
extension CharacterSet {
func allCharacters() -> [Character] {
var result: [Character] = []
for plane: UInt8 in 0...16 where self.hasMember(inPlane: plane) {
for unicode in UInt32(plane) << 16 ..< UInt32(plane + 1) << 16 {
if let uniChar = UnicodeScalar(unicode), self.contains(uniChar) {
result.append(Character(uniChar))
}
}
}
return result
}
}
let characterSet = CharacterSet.capitalizedLetters
print(characterSet.allCharacters())
// ["Dž", "Lj", "Nj", "Dz", "ᾈ", "ᾉ", "ᾊ", "ᾋ", "ᾌ", "ᾍ", "ᾎ", "ᾏ", "ᾘ", "ᾙ", "ᾚ", "ᾛ", "ᾜ", "ᾝ", "ᾞ", "ᾟ", "ᾨ", "ᾩ", "ᾪ", "ᾫ", "ᾬ", "ᾭ", "ᾮ", "ᾯ", "ᾼ", "ῌ", "ῼ"]
你可能想要的是CharacterSet.uppercaseLetters
返回包含Unicode General Category Lu和Lt。
中字符的字符集