Incremental allocation of a memory-mapped buffer fails at 1 GB size

Time: 2015-12-04 20:46:09

Tags: scala jvm memory-mapped-files

I created the following demo to try out memory-mapped files (MMF), which I want to use as a very large array of long values.

import java.nio._, java.io._, java.nio.channels.FileChannel

object Index extends App {

    val formatter = java.text.NumberFormat.getIntegerInstance
    def format(l: Number) = formatter.format(l)

    val raf = new RandomAccessFile("""C:\Users\...\Temp\96837624\mmf""", "rw")
    raf.setLength(20)
    def newBuf(capacity: Int) = {
      val bytes = 8.toLong * capacity // 8 bytes per long word; Long arithmetic avoids Int overflow
      println("new buf " + format(capacity) + " words = " + format(bytes) + " bytes")

      // java.io.IOException: "Map failed" at the following line
      raf.getChannel.map(FileChannel.MapMode.READ_WRITE, 0, bytes).asLongBuffer()
    }

    (1 to 100 * 1000 * 1000).foldLeft(newBuf(2): LongBuffer){ case(buf, i) =>
        if (Math.random < 0.000009) println(format(buf.get(buf.position()/2)))
        (if (buf.position == buf.capacity) {
            val p = buf.position
            val b = newBuf(buf.capacity * 2)
            b.position(p) ; b
        } else buf).put(i)

    }

    raf.close
}

The run fails with this output:

16,692,145
16,741,940
new buf 67,108,864
[error] (run-main-1) java.io.IOException: Map failed
java.io.IOException: Map failed
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:907)

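I can see that a 512 MB file has been created, and the system seems unable to grow it to 1 GB.

However, if instead of an initial size of 2 long words, foldLeft(newBuf(2)), I start with 64M long words, newBuf(64*1024*1027), the runtime successfully creates a 1 GB file and then fails when trying to create a 2 GB file with: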

new buf 268 435 458 words = 2 147 483 664 bytes
java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
        at sun.nio.ch.FileChannelImpl.map(Unknown Source)

I am running this on a 64-bit JVM.
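The second failure follows from simple arithmetic: FileChannel.map documents that the size of a single mapping must not exceed Integer.MAX_VALUE bytes, and 268,435,458 long words take 2,147,483,664 bytes. A minimal sketch of that check (the object and member names are made up for illustration):

```scala
object MapLimit {
  // FileChannel.map rejects any size above Integer.MAX_VALUE bytes.
  val maxBytesPerMapping: Long = Int.MaxValue.toLong

  // Largest number of 8-byte long words that fits into one mapping.
  val maxLongsPerMapping: Long = maxBytesPerMapping / 8

  // True if `words` long words can be mapped by a single map() call.
  def fitsInOneMapping(words: Long): Boolean =
    words * 8 <= maxBytesPerMapping
}
```

With these definitions, MapLimit.fitsInOneMapping(268435458L) is false, which matches the IllegalArgumentException above.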

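I am also unsure how to close the buffer so that it is released for later runs under sbt, and how to make sure the data eventually ends up in the file. The mechanism looks completely unreliable.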
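On the "data ends up in the file" part, one option that probably helps is MappedByteBuffer.force(), which asks the OS to write the mapped region back to the storage device. The sketch below is only an illustration: ForceDemo, roundTrip and the temporary file are assumptions, not part of the original program. Note that it keeps a reference to the MappedByteBuffer itself, because the LongBuffer view returned by asLongBuffer() has no force() method. As far as I know there is no public API to unmap a buffer explicitly; the Javadoc says a mapping stays valid until the buffer is garbage-collected.

```scala
import java.io.RandomAccessFile
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.Files

object ForceDemo {
  // Writes `values` through a mapping, forces them to disk,
  // then reads them back with ordinary file I/O.
  def roundTrip(values: Seq[Long]): Seq[Long] = {
    // Hypothetical scratch file, used only for this illustration.
    val path = Files.createTempFile("mmf-force-demo", ".bin")
    val raf = new RandomAccessFile(path.toFile, "rw")
    try {
      val bytes = values.length.toLong * 8
      raf.setLength(bytes) // size the file explicitly before mapping it
      val mapped: MappedByteBuffer =
        raf.getChannel.map(FileChannel.MapMode.READ_WRITE, 0, bytes)
      val longs = mapped.asLongBuffer()
      values.foreach(v => longs.put(v))
      mapped.force() // request write-back of the mapped region
      raf.seek(0)
      // Both the buffer and readLong use big-endian order by default.
      Vector.fill(values.length)(raf.readLong())
    } finally raf.close()
  }
}
```

Calling ForceDemo.roundTrip(Seq(1L, 2L, 3L)) should return the same three values, read back through the RandomAccessFile rather than through the mapping.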

1 Answer:

Answer 0 (score: 0)

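OK, a day of experiments has shown that a 32-bit JVM fails with IOException: Map failed at 1 GB no matter what. To get around Size exceeds Integer.MAX_VALUE when mapping on a 64-bit machine, you should use multiple buffers of a reasonable size, e.g. 100 MB each works fine. That is because buffers are addressed by integer.

As for this question, you can keep all of these buffers open in memory at the same time, i.e. there is no need to close a buffer (set it to null) before allocating the next one, which effectively grows the file, as the following demo shows: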
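To make the multi-buffer idea concrete, here is a minimal sketch of the index arithmetic, assuming 100 MB chunks (a decimal 100,000,000 bytes; the names ChunkAddress, locate and chunkStartByte are invented for illustration). A global word index is split into a buffer number and an index inside that buffer:

```scala
object ChunkAddress {
  // Assumed chunk size: 100 MB per mapped buffer.
  val ChunkBytes: Int = 100 * 1000 * 1000
  val WordsPerChunk: Int = ChunkBytes / 8 // 12,500,000 long words

  // Splits a global word index into (buffer number, index within that buffer).
  def locate(index: Long): (Int, Int) =
    ((index / WordsPerChunk).toInt, (index % WordsPerChunk).toInt)

  // Byte offset in the file at which buffer number `chunk` starts.
  def chunkStartByte(chunk: Int): Long = chunk.toLong * ChunkBytes
}
```

Division and remainder work for any chunk size. If the chunk size is a power of two, the same split can be done with a shift and a bit mask instead.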

import Utils._, java.nio._, java.io._, java.nio.channels.FileChannel

object MmfDemo extends App {

    val bufAddrWidth = 25 /*in bits*/ // Every element of the buff addresses a long
    val BUF_SIZE_WORDS = 1 << bufAddrWidth ; val BUF_SIZE_BYTES = BUF_SIZE_WORDS << 3
    val bufBitMask = BUF_SIZE_WORDS - 1
    var buffers = Vector[LongBuffer]()
    var capacity = 0 ; var pos = 0
    def select(pos: Int) = {
        val bufn = pos >> bufAddrWidth // higher bits of address denote the buffer number
        //println(s"accessing $pos = " + (pos - buf * wordsPerBuf) + " in " + buf)
        while (buffers.length <= bufn) expand
        pass(buffers(bufn)){_.position(pos & bufBitMask)}
    }
    def get(address: Int = pos) = {
        pos = address +1
        select(address).get
    }
    def put(value: Long) {
        //println("writing " + value + " to " + pos)
        select(pos).put(value) ; pos += 1
    }
    def expand = {
        val fromByte = buffers.length.toLong  * BUF_SIZE_BYTES
        println("adding " + buffers.length + "th buffer, total size expected " + format(fromByte + BUF_SIZE_BYTES) + " bytes")

        // 32bit JVM: java.io.IOException: "Map failed" at the following line if buf size requested is larger than 512 mb
        // 64bit JVM: IllegalArgumentException: Size exceeds Integer.MAX_VALUE
        buffers :+= fc.map(FileChannel.MapMode.READ_WRITE, fromByte, BUF_SIZE_BYTES).asLongBuffer()
        capacity += BUF_SIZE_WORDS
    }

    def rdAll(get: Int => Long) {
        var firstMismatch = -1
        val failures = (0 until parse(args(1))).foldLeft(0) { case(failures, i) =>
            val got = get(i)
            if (got != i && firstMismatch == -1) {firstMismatch = i; println("first mismatch at " +format(i) + ", value = " + format(got))}
            failures + ?(got != i, 1, 0)
        } ; println(format(failures) + " mismatches")
    }

    val raf = new RandomAccessFile("""C:\Temp\mmf""", "rw")
    val fc = raf.getChannel
    try {

        if (args.length < 2) { // both the command and the length are required
            println("usage1: buf_gen <len in long words>")
            println("usage2: raf_gen <len in long words>")
            println("example: buf_gen 30m")
            println("usage3: rd_raf <size in words>")
            println("usage4: rd_buf <size in words>")
        } else {
            val t1 = System.currentTimeMillis
            args(0) match {
                case "buf_gen" => raf.setLength(0)
                    (0 until parse(args(1))) foreach {i => put(i.toLong)}
                case "raf_gen" => raf.setLength(0)
                    (0 until parse(args(1))) foreach {i =>raf.writeLong(i.toLong)}
                        //fc.force(true)
                case "rd_raf" => rdAll{i => raf.seek(i.toLong * 8) ; raf.readLong()}
                case "rd_buf" => rdAll(get)
                case u =>println("unknown command " + u)
            } ; println("finished in " + (System.currentTimeMillis - t1) + " ms")
        }
    } finally {
        raf.close ; fc.close

        buffers = null ; System.gc /*GC needs to close the buffer*/}

}

object Utils {
    val formatter = java.text.NumberFormat.getIntegerInstance
    def format(l: Number) = formatter.format(l)

    def ?[T](sel: Boolean, a: => T, b: => T) = if (sel) a else b
    def parse(s: String) = {
        val lc = s.toLowerCase()
        lc.filter(_.isDigit).toInt *
            ?(lc.contains("k"), 1000, 1) *
            ?(lc.contains("m"), 1000*1000, 1)
    }
    def eqa[T](a: T, b: T) = assert(a == b, s"$a != $b")
    def pass[T](a: T)(code: T => Unit) = {code(a) ; a}
}

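At least on Windows. With this program I managed to create an mmf file larger than my machine's memory (not to mention the JVM's -Xmx, which plays no role at all in these matters). Just slow down the file generation by selecting some text with the mouse in the Windows console (the program pauses until you release the selection). Otherwise Windows will evict everything else that is performance-critical to the page file, and your PC will die thrashing.

By the way, the PC thrashes even though I only write to the end of the file and Windows could evict my unused gigabyte-sized blocks. I also noticed that the block I am writing is actually being read.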

The following output

adding 38th buffer, total size expected 12,480,000,000 bytes
adding 39th buffer, total size expected 12,800,000,000 bytes

is accompanied by the following system requests

5:24,java,"QueryStandardInformationFile",mmf,"SUCCESS","AllocationSize: 12 480 000 000, EndOfFile: 12 480 000 000, NumberOfLinks: 1, DeletePending: False, Directory: False"
5:24,java,"SetEndOfFileInformationFile",mmf,"SUCCESS","EndOfFile: 12 800 000 000"
5:24,java,"SetAllocationInformationFile",mmf,"SUCCESS","AllocationSize: 12 800 000 000"
5:24,java,"CreateFileMapping",mmf,"FILE LOCKED WITH WRITERS","SyncType: SyncTypeCreateSection, PageProtection: "
5:24,java,"QueryStandardInformationFile",mmf,"SUCCESS","AllocationSize: 12 800 000 000, EndOfFile: 12 800 000 000, NumberOfLinks: 1, DeletePending: False, Directory: False"
5:24,java,"CreateFileMapping",mmf,"SUCCESS","SyncType: SyncTypeOther"
5:24,java,"ReadFile",mmf,"SUCCESS","Offset: 12 480 000 000, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:24,java,"ReadFile",mmf,"SUCCESS","Offset: 12 480 032 768, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:24,java,"ReadFile",mmf,"SUCCESS","Offset: 12 480 065 536, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:24,java,"ReadFile",mmf,"SUCCESS","Offset: 12 480 098 304, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:24,java,"ReadFile",mmf,"SUCCESS","Offset: 12 480 131 072, Length: 20 480, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"

skipped 9000 reads

5:25,java,"ReadFile",mmf,"SUCCESS","Offset: 12 799 836 160, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:25,java,"ReadFile",mmf,"SUCCESS","Offset: 12 799 868 928, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:25,java,"ReadFile",mmf,"SUCCESS","Offset: 12 799 901 696, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:25,java,"ReadFile",mmf,"SUCCESS","Offset: 12 799 934 464, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:25,java,"ReadFile",mmf,"SUCCESS","Offset: 12 799 967 232, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"

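But that is another story.

It turns out this answer duplicates Peter Lawrey's, except that my question is specifically about 'Map failed' and 'Size exceeds Integer.MAX_VALUE' when mapping large buffers, whereas the original question is about OutOfMem in the JVM, which has nothing to do with I/O.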