Converting scanLocation from UTF-16 units to a character index in NSScanner (Swift)

Time: 2016-07-28 00:32:33

Tags: swift string unicode swift3 nsscanner

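In Swift 3.0's Scanner (NSScanner), the string property returns the string being parsed, and scanLocation returns the current scan position. I am trying to extract the text that has already been parsed: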

var parsedText: String {
    return string.substring(to: string.index(string.startIndex, offsetBy: scanLocation))
}

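This code crashes when string contains multi-byte characters. It turns out that scanLocation returns the number of UTF-16 code units, not the number of characters parsed.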
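A minimal sketch of the mismatch, assuming Swift 4 or later (in Swift 3 the character count would be sample.characters.count). The sample string is my own illustration: "Hello" followed by a single emoji, which needs two UTF-16 code units.

```swift
// Hypothetical sample: "Hello" followed by one emoji (U+1F600).
let sample = "Hello\u{1F600}"

// Number of user-visible characters (extended grapheme clusters).
let characterCount = sample.count

// Number of UTF-16 code units; the emoji is stored as a surrogate pair.
let utf16Count = sample.utf16.count

print(characterCount, utf16Count) // 6 7
```

An offset counted in one of these units cannot be used directly as an offset in the other.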

How can I convert scanLocation (in code units) to a character index?

A playground to experiment with:

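// Sample input: "Hello" plus one emoji (U+1F600), i.e. 6 characters but 7 UTF-16 code units
let scanner = Scanner(string: "Hello\u{1F600}")
scanner.scanString("Hello\u{1F600}", into: nil)
print(scanner.scanLocation) // Prints 7 instead of 6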

1 answer:

Answer 0 (score: 1)

To get the character index:

import Foundation

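extension Scanner {
    /// The scan location expressed in characters rather than UTF-16 code units.
    var scanLocationInCharacters: Int {
        let utf16 = string.utf16
        // Advance scanLocation code units in the UTF-16 view, then map that
        // position back to a character index of the string.
        guard let to16 = utf16.index(utf16.startIndex, offsetBy: scanLocation, limitedBy: utf16.endIndex),
            let to = String.Index(to16, within: string) else {
                return 0
        }
        return string.distance(from: string.startIndex, to: to)
    }
}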

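// Sample input: "Hello" plus one emoji (U+1F600), i.e. 6 characters but 7 UTF-16 code units
let scanner = Scanner(string: "Hello\u{1F600}")
scanner.scanString("Hello\u{1F600}", into: nil)

print(scanner.scanLocation) // 7
print(scanner.scanLocationInCharacters) // 6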
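The same conversion can be sketched without a scanner, as a free function over a plain String. This is only an illustration, assuming Swift 5; the name characterOffset(forUTF16Offset:in:) is hypothetical and not part of Foundation. Unlike the extension above, it returns nil instead of 0 when the offset cannot be converted, so a failed conversion is not mistaken for position 0.

```swift
/// Converts an offset counted in UTF-16 code units into an offset counted
/// in characters. Hypothetical helper, not a Foundation API.
/// Returns nil when the offset is negative, lies past the end of the text,
/// or cannot be mapped to a valid index of the string.
func characterOffset(forUTF16Offset offset: Int, in text: String) -> Int? {
    let utf16 = text.utf16
    guard offset >= 0,
          // Walk `offset` code units forward, stopping at the end of the view.
          let utf16Index = utf16.index(utf16.startIndex, offsetBy: offset, limitedBy: utf16.endIndex),
          // Map the UTF-16 position to the corresponding String index.
          let index = String.Index(utf16Index, within: text) else {
        return nil
    }
    return text.distance(from: text.startIndex, to: index)
}

// Usage with a sample string of 6 characters and 7 UTF-16 code units.
let text = "Hello\u{1F600}"
print(characterOffset(forUTF16Offset: 7, in: text) as Any)   // Optional(6)
print(characterOffset(forUTF16Offset: 100, in: text) as Any) // nil
```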

To retrieve the parsed text:

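extension Scanner {
    /// The part of the string that has been scanned so far.
    var parsedText: String {
        let utf16 = string.utf16
        // Convert the UTF-16 based scanLocation to a String index first.
        guard let to16 = utf16.index(utf16.startIndex, offsetBy: scanLocation, limitedBy: utf16.endIndex),
            let to = String.Index(to16, within: string) else {
                return ""
        }
        return string.substring(to: to)
    }
}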

Bonus: when reporting errors, you may also want to print the current line and column:

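extension Scanner {
    /// 1-based number of the line that contains the scan location.
    var currentLine: Int {
        var lineCount = 1
        for character in parsedText.characters {
            if character == "\n" { lineCount += 1 }
        }
        return lineCount
    }

    /// 1-based column of the scan location within the current line.
    var currentColumn: Int {
        let text = parsedText
        // Count the characters after the last newline, if there is one.
        if let range = text.range(of: "\n", options: .backwards) {
            return text.distance(from: range.upperBound, to: text.endIndex) + 1
        }
        // No newline yet: the column is the position within the first line.
        return text.characters.count + 1
    }
}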
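As a rough sketch, line and column can also be computed in a single pass over the parsed text. The function name lineAndColumn(after:) is hypothetical, and the code assumes Swift 5, where Character has an isNewline property. In Swift, "\r\n" is a single Character that does not compare equal to "\n", so this variant also counts Windows-style line endings.

```swift
/// Returns the 1-based line and column of the position just after `parsedText`.
/// Hypothetical helper for illustration only.
func lineAndColumn(after parsedText: String) -> (line: Int, column: Int) {
    var line = 1
    var column = 1
    for character in parsedText {
        if character.isNewline {
            // A newline starts the next line at column 1.
            line += 1
            column = 1
        } else {
            column += 1
        }
    }
    return (line, column)
}

// Usage: two characters have been parsed on the second line.
let position = lineAndColumn(after: "ab\ncd")
print(position.line, position.column) // 2 3
```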