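Say I want to process a large document in Swift. The whole file does not fit in memory, so we have to stream it. In Swift we have access to NSInputStream, but performance degrades rapidly once the input is converted to a Swift String and consumed. Suppose I want to read one buffer (1024 bytes) at a time, create a Swift String from those bytes, and then iterate over the string to get the individual characters. I will show step by step how performance drops as additional statements are added.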
var stream = NSInputStream(URL: input)!
stream.open()
var buffer = [UInt8](count: 1024, repeatedValue: 0)
Just reading the stream proceeds at ~200 MB/s:
while stream.hasBytesAvailable {
let readSize = stream.read(&buffer, maxLength: buffer.count)
}
Creating an NSString from the buffer proceeds at ~130 MB/s:
while stream.hasBytesAvailable {
let readSize = stream.read(&buffer, maxLength: buffer.count)
let nsString = NSString(bytes: &buffer, length: readSize, encoding: NSUTF8StringEncoding)
}
Converting the NSString to a String proceeds at ~90 MB/s:
while stream.hasBytesAvailable {
let readSize = stream.read(&buffer, maxLength: buffer.count)
let nsString = NSString(bytes: &buffer, length: readSize, encoding: NSUTF8StringEncoding)
let swiftString = nsString as String
}
Iterating over the String proceeds at ~1 MB/s:
while stream.hasBytesAvailable {
let readSize = stream.read(&buffer, maxLength: buffer.count)
let nsString = NSString(bytes: &buffer, length: readSize, encoding: NSUTF8StringEncoding)
let swiftString = nsString as String
for character in swiftString { }
}
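Why does this code run so slowly? I understand that, in the current state of affairs, mixing Objective-C and Swift causes a lot of (implicit) conversions. However, the biggest performance hit by far seems to come from iterating over the Swift String itself.
Note that the code was compiled with the Fastest [-O] optimization setting.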
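For anyone who wants to reproduce the comparison, here is a minimal sketch in current Swift syntax. It is not the code I measured above: it assumes an in-memory ASCII buffer instead of a file stream, decodes with `String(decoding:as:)` rather than going through NSString, and the names `timed`, `countCharacters`, and `countUTF8Units` are hypothetical helpers introduced only for illustration. It times Character iteration against iteration over the string's UTF-8 view, which may help isolate how much of the cost is in the Character iteration itself.

```swift
import Foundation

// Hypothetical helper: runs `body` once and returns its result together
// with the elapsed wall-clock time in seconds.
func timed<T>(_ body: () -> T) -> (result: T, seconds: Double) {
    let start = Date()
    let result = body()
    // Guard against a zero interval so later divisions stay finite.
    let seconds = max(Date().timeIntervalSince(start), 1e-9)
    return (result, seconds)
}

// Decodes the input chunk by chunk and iterates over every Character,
// mirroring the slowest loop above.
// Assumption: chunk boundaries never split a multi-byte UTF-8 sequence
// (true for the ASCII test data used here, not for arbitrary files).
func countCharacters(in bytes: [UInt8], chunkSize: Int) -> Int {
    var count = 0
    var offset = 0
    while offset < bytes.count {
        let end = min(offset + chunkSize, bytes.count)
        // String(decoding:as:) replaces invalid sequences with U+FFFD
        // instead of failing.
        let chunk = String(decoding: bytes[offset..<end], as: UTF8.self)
        for _ in chunk { count += 1 }
        offset = end
    }
    return count
}

// Same chunked decoding, but iterates over the UTF-8 code units only,
// which skips grapheme-cluster segmentation.
func countUTF8Units(in bytes: [UInt8], chunkSize: Int) -> Int {
    var count = 0
    var offset = 0
    while offset < bytes.count {
        let end = min(offset + chunkSize, bytes.count)
        let chunk = String(decoding: bytes[offset..<end], as: UTF8.self)
        for _ in chunk.utf8 { count += 1 }
        offset = end
    }
    return count
}

// Assumption: 1 MiB of ASCII stands in for the file contents.
let sample = [UInt8](repeating: UInt8(ascii: "a"), count: 1 << 20)
let megabytes = Double(sample.count) / 1_000_000

let characters = timed { countCharacters(in: sample, chunkSize: 1024) }
let codeUnits = timed { countUTF8Units(in: sample, chunkSize: 1024) }

print("Character iteration: \(megabytes / characters.seconds) MB/s")
print("UTF-8 view iteration: \(megabytes / codeUnits.seconds) MB/s")
```

The absolute numbers will depend on the compiler version, optimization level, and input, so they should not be compared directly with the figures I quoted above; only the ratio between the two loops on the same machine is meaningful.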