bufio.Reader vs. bufio.Scanner: functionality and performance

Time: 2017-11-22 19:31:52

Tags: go

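I have seen several questions online that loosely discuss why one should use bufio.Scanner rather than bufio.Reader.

I don't know whether my test case is relevant, but I decided to test the two against each other by reading 1,000,000 lines from a text file: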

package main

import (
    "bufio"
    "fmt"
    "log"
    "os"
    "strconv"
    "time"
)

func main() {
    fileName := "testfile.txt"

    // Create 1,000,000 integers as strings.
    const numItems = 1000000
    startInitStringArray := time.Now()

    var input [numItems]string
    for i := 0; i < numItems; i++ {
        input[i] = strconv.Itoa(i)
    }

    elapsedInitStringArray := time.Since(startInitStringArray)
    fmt.Printf("Took %s to populate string array.\n", elapsedInitStringArray)

    // Write one integer per line to the file.
    fo, err := os.Create(fileName)
    if err != nil {
        log.Fatal(err)
    }
    for i := 0; i < numItems; i++ {
        if _, err := fo.WriteString(input[i] + "\n"); err != nil {
            log.Fatal(err)
        }
    }
    if err := fo.Close(); err != nil {
        log.Fatal(err)
    }

    // Read the file back with bufio.Reader.
    openedFile, err := os.Open(fileName)
    if err != nil {
        log.Fatal(err)
    }

    startReader := time.Now()
    reader := bufio.NewReader(openedFile)

    for i := 0; i < numItems; i++ {
        if _, _, err := reader.ReadLine(); err != nil {
            log.Fatal(err)
        }
    }
    elapsedReader := time.Since(startReader)
    fmt.Printf("Took %s to read file using reader.\n", elapsedReader)
    openedFile.Close()

    // Read the file back with bufio.Scanner.
    openedFile, err = os.Open(fileName)
    if err != nil {
        log.Fatal(err)
    }

    startScanner := time.Now()
    scanner := bufio.NewScanner(openedFile)

    for i := 0; i < numItems; i++ {
        scanner.Scan()
        scanner.Text()
    }
    if err := scanner.Err(); err != nil {
        log.Fatal(err)
    }

    elapsedScanner := time.Since(startScanner)
    fmt.Printf("Took %s to read file using scanner.\n", elapsedScanner)
    openedFile.Close()
}

The timings I get are fairly consistent and typically look like this:

Took 44.1165ms to populate string array.
Took 17.0465ms to read file using reader.
Took 23.0613ms to read file using scanner.

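I'm curious: when is it better to use a Reader rather than a Scanner, and should the choice be based on performance or on functionality?

1 answer:

Answer 0: (score: 4)

This is a flawed benchmark. The two loops are not doing the same thing.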

func (b *Reader) ReadLine() (line []byte, isPrefix bool, err error)

returns a []byte.

func (s *Scanner) Text() string

returns string([]byte), that is, a newly allocated string converted from the underlying bytes.

To make the comparison fair, use

func (s *Scanner) Bytes() []byte
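As a minimal sketch of what a like-for-like comparison could look like, the two loops below both work on []byte and count the same things. The names stats, countWithReader, countWithScanner, and countBoth are hypothetical helpers introduced here for illustration; they are not part of the original benchmark.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// stats holds what each loop observed (hypothetical helper type).
type stats struct {
	lines int // number of complete lines
	size  int // total bytes, excluding line terminators
}

// countWithReader uses Reader.ReadLine, which returns a []byte slice.
func countWithReader(r io.Reader) (stats, error) {
	var s stats
	reader := bufio.NewReader(r)
	for {
		line, isPrefix, err := reader.ReadLine()
		if err == io.EOF {
			return s, nil
		}
		if err != nil {
			return s, err
		}
		s.size += len(line)
		if !isPrefix { // a line longer than the buffer arrives in chunks
			s.lines++
		}
	}
}

// countWithScanner uses Scanner.Bytes, which also returns a []byte
// and therefore avoids the string conversion done by Text.
func countWithScanner(r io.Reader) (stats, error) {
	var s stats
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		s.size += len(scanner.Bytes())
		s.lines++
	}
	return s, scanner.Err()
}

// countBoth runs both loops over the same in-memory input.
func countBoth(input string) (stats, stats, error) {
	rs, err := countWithReader(strings.NewReader(input))
	if err != nil {
		return rs, stats{}, err
	}
	ss, err := countWithScanner(strings.NewReader(input))
	return rs, ss, err
}

func main() {
	rs, ss, err := countBoth("0\n1\n22\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("reader:  %+v\nscanner: %+v\n", rs, ss)
}
```

Because both functions return the same stats for the same input, any timing difference measured around them reflects the two APIs rather than an extra string allocation in one of the loops.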

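This is a flawed benchmark. It reads short strings: the integers from "0\n" to "999999\n". What do real-world data sets look like?

In the real world, we read Shakespeare: http://www.gutenberg.org/ebooks/100: Plain Text UTF-8: pg100.txt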

Took 2.973307ms to read file using reader.   size: 5340315 lines: 124787
Took 2.940388ms to read file using scanner.  size: 5340315 lines: 124787
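One way to repeat this kind of measurement without disk I/O or one-shot timing noise is to run both loops over the same in-memory data with testing.Benchmark from the standard library. The sketch below is an assumption about how such a harness might look: makeLines generates hypothetical stand-in text with lines of varying length (it is not the Shakespeare file), and readAllReader, readAllScanner, and bench are illustrative names.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
	"testing"
)

// makeLines builds n lines of synthetic text with varying lengths.
// Hypothetical stand-in data; substitute a real file's contents if available.
func makeLines(n int) []byte {
	var buf bytes.Buffer
	for i := 0; i < n; i++ {
		buf.WriteString(strings.Repeat("word ", i%12))
		buf.WriteByte('\n')
	}
	return buf.Bytes()
}

// readAllReader reads every line with Reader.ReadLine and returns the line count.
func readAllReader(data []byte) int {
	lines := 0
	r := bufio.NewReader(bytes.NewReader(data))
	for {
		_, isPrefix, err := r.ReadLine()
		if err != nil { // io.EOF or a real error: stop either way
			return lines
		}
		if !isPrefix {
			lines++
		}
	}
}

// readAllScanner reads every line with Scanner.Bytes and returns the line count.
func readAllScanner(data []byte) int {
	lines := 0
	s := bufio.NewScanner(bytes.NewReader(data))
	for s.Scan() {
		_ = s.Bytes()
		lines++
	}
	return lines
}

// bench times repeated calls of read over data using the testing package.
func bench(read func([]byte) int, data []byte) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.SetBytes(int64(len(data)))
		for i := 0; i < b.N; i++ {
			read(data)
		}
	})
}

func main() {
	data := makeLines(100000)
	fmt.Println("reader: ", bench(readAllReader, data))
	fmt.Println("scanner:", bench(readAllScanner, data))
}
```

The printed results depend on the machine and on the data, so no expected numbers are given here; the point is only that both loops see identical input and are timed over many iterations.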