Here is the code:
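package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	fi, err := os.Open("/tmp/messages")
	if err != nil {
		panic(err)
	}
	// Close the file on exit and surface any close error.
	defer func() {
		if err := fi.Close(); err != nil {
			panic(err)
		}
	}()

	// Count lines by scanning the file one line at a time.
	scanner := bufio.NewScanner(fi)
	lines := 0
	for scanner.Scan() {
		scanner.Bytes() // the line content itself is not used
		lines++
	}
	// Check for a read error before reporting the count.
	if err := scanner.Err(); err != nil {
		panic(err)
	}
	fmt.Println(lines)
}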
Here is the elapsed time:
5675143
real 0m0.548s
user 0m0.437s
sys 0m0.119s
That is twice the time "wc" takes on the same 588 MB log file:
wc -l /tmp/messages
5675142 /tmp/messages
real 0m0.231s
user 0m0.150s
sys 0m0.081s
The pprof output is:
540ms of 540ms total (100%)
      flat  flat%   sum%        cum   cum%
     200ms 37.04% 37.04%      200ms 37.04%  syscall.Syscall
     190ms 35.19% 72.22%      190ms 35.19%  runtime.indexbytebody
     100ms 18.52% 90.74%      520ms 96.30%  bufio.(*Scanner).Scan
      20ms  3.70% 94.44%      210ms 38.89%  bufio.ScanLines
      10ms  1.85% 96.30%      530ms 98.15%  main.main
      10ms  1.85% 98.15%      210ms 38.89%  os.(*File).Read
      10ms  1.85%   100%       10ms  1.85%  runtime.usleep
         0     0%   100%      200ms 37.04%  os.(*File).read
         0     0%   100%      530ms 98.15%  runtime.goexit
         0     0%   100%      530ms 98.15%  runtime.main
         0     0%   100%       10ms  1.85%  runtime.mstart
         0     0%   100%       10ms  1.85%  runtime.mstart1
         0     0%   100%       10ms  1.85%  runtime.sysmon
         0     0%   100%      200ms 37.04%  syscall.Read
         0     0%   100%      200ms 37.04%  syscall.read
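Is there a way to optimize this code so that it runs faster? My guess is that the system calls contribute about 100 ms of overhead to the total time. Most of the time is probably spent copying data inside the Scanner.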