Question

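Please correct me if I am wrong.

I think read is efficient because:

a) read fetches the whole file content into memory in one go, similar to Python.

b) readline and readlines bring one line at a time into memory.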

Answer 1

To expand on the comment, here are some example benchmarks (which also show how you can run such tests yourself).

First, create some test data (one million lines of 100 characters each):

open("testdata.txt", "w") do f
    for i in 1:10^6
        println(f, "a"^100)
    end
end
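If you want to try the approach on a smaller file first, the same loop can be wrapped in a function. This is only a sketch: `write_testdata` and its parameters are names made up for illustration, not part of the original benchmark. It returns the file size so you can confirm the file was written as expected (each line is `width` bytes plus one newline byte).

```julia
# Hypothetical helper: writes `nlines` lines, each `width` copies of 'a', to `path`.
function write_testdata(path, nlines, width)
    open(path, "w") do f
        for _ in 1:nlines
            println(f, "a"^width)
        end
    end
    return filesize(path)  # size in bytes of the file just written
end

path = tempname()                           # temporary path, not the benchmark file
nbytes = write_testdata(path, 1_000, 100)   # 1_000 lines of 100 characters
rm(path)
```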

We will read the data in four ways (computing the total length of the lines each time):

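# f1: readlines loads every line into a Vector{String} before summing
f1() = sum(length(l) for l in readlines("testdata.txt"))

# f2: eachline iterates over the file lazily, one line at a time
f2() = sum(length(l) for l in eachline("testdata.txt"))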

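# f3: explicit loop calling readline until the end of the file
function f3()
    s = 0
    open("testdata.txt") do f
        while !eof(f)
            s += length(readline(f))
        end
    end
    s
end

# f4: read the whole file into one String and count the non-newline characters
function f4()
    s = 0
    for c in read("testdata.txt", String)
        s += c != '\n' # assumes '\n' line endings (e.g. Linux) for simplicity
    end
    s
end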
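Before timing anything, it is worth checking that the four approaches actually compute the same number. A minimal sketch, assuming path-parameterised variants of the functions above (the `count_*` names and `check_agreement` are invented for this example) and a small temporary file instead of testdata.txt:

```julia
# Path-parameterised variants of the four approaches (illustrative names).
count_readlines(path) = sum(length(l) for l in readlines(path))
count_eachline(path)  = sum(length(l) for l in eachline(path))

function count_readline(path)
    s = 0
    open(path) do f
        while !eof(f)
            s += length(readline(f))
        end
    end
    return s
end

# Counts every character except '\n' in the whole file.
count_read(path) = count(c -> c != '\n', read(path, String))

# Writes a small temporary file and returns the four results as a tuple.
function check_agreement(nlines = 1_000, width = 100)
    path, io = mktemp()
    for _ in 1:nlines
        println(io, "a"^width)
    end
    close(io)
    results = (count_readlines(path), count_eachline(path),
               count_readline(path), count_read(path))
    rm(path)
    return results
end
```

Calling `check_agreement()` should return four identical values, equal to `nlines * width`.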

Now we compare the performance and memory usage of the four options:

julia> using BenchmarkTools

julia> @btime f1()
  239.857 ms (2001558 allocations: 146.59 MiB)
100000000

julia> @btime f2()
  179.480 ms (2001539 allocations: 137.59 MiB)
100000000

julia> @btime f3()
  189.643 ms (2001533 allocations: 137.59 MiB)
100000000

julia> @btime f4()
  158.055 ms (13 allocations: 96.32 MiB)
100000000

You should get similar results if you run this on your own machine.
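As for point (b) in the question: readlines does not work line by line. It returns a Vector{String} holding all lines at once, which fits f1 allocating the most memory above. readline returns a single line, and eachline yields lines lazily as you iterate. A small sketch showing what each call returns, using a throwaway temporary file (`describe_readers` is a name made up for this example):

```julia
function describe_readers()
    path, io = mktemp()
    write(io, "first\nsecond\nthird\n")
    close(io)

    whole = read(path, String)     # whole file as a single String
    lines = readlines(path)        # Vector{String} holding every line at once
    first_line = readline(path)    # only the first line, newline stripped
    streamed = String[]
    for l in eachline(path)        # lazy: yields one line per iteration
        push!(streamed, l)
    end

    rm(path)
    return (whole = whole, lines = lines,
            first_line = first_line, streamed = streamed)
end

r = describe_readers()
```

Here `r.lines` and `r.streamed` hold the same strings; the difference is that readlines builds the whole vector up front, while eachline hands over one line at a time.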

Regarding file I/O in Julia, which is faster: read(), readline(), or readlines()?
