Buffered RandomAccessFile java

Time: 2011-04-10 19:38:16

Tags: java file-io io buffering random-access

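RandomAccessFile is quite slow for random access to a file. You often read about implementing a buffered layer over it, but code that does this is impossible to find online.

So my question is: could those of you who know of an open-source implementation of this class share a pointer to it, or share your own implementation?

It would be nice if this question turned into a collection of useful links and code for this problem, which I'm sure is shared by many and was never properly addressed by Sun.

Please, no references to memory mapping, as the files can be much bigger than Integer.MAX_VALUE.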

6 answers:

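Answer 0 (score: 12)

Well, I do not see a reason not to use java.nio.MappedByteBuffer, even if the files are bigger than Integer.MAX_VALUE.

Evidently, you will not be allowed to define a single MappedByteBuffer for the whole file, but you can have several MappedByteBuffers accessing different regions of the file.

The position and size parameters of FileChannel.map are of type long, so the starting position can exceed Integer.MAX_VALUE. The only thing you have to take care of is that the size of each mapping is not bigger than Integer.MAX_VALUE.

Therefore, you could define several maps like this: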

buffer[0] = fileChannel.map(FileChannel.MapMode.READ_WRITE,0,2147483647L);
buffer[1] = fileChannel.map(FileChannel.MapMode.READ_WRITE,2147483647L, Integer.MAX_VALUE);
buffer[2] = fileChannel.map(FileChannel.MapMode.READ_WRITE, 4294967294L, Integer.MAX_VALUE);
...

In short, the size of a mapping cannot be bigger than Integer.MAX_VALUE, but its start position can be anywhere in the file.
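
To make the idea concrete, here is a minimal sketch of a read-only wrapper that splits a file into fixed-size mapped regions and translates a long position into a region index plus an offset. The class name ChunkedMappedFile and the regionSize parameter are made up for illustration; they are not part of any library, and the sketch assumes the file does not change size while mapped.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: read-only access to a large file through several mapped regions. */
public class ChunkedMappedFile {
    private final MappedByteBuffer[] regions;
    private final long regionSize;
    private final long length;

    public ChunkedMappedFile(Path path, long regionSize) throws IOException {
        if (regionSize <= 0 || regionSize > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("regionSize must be in 1..Integer.MAX_VALUE");
        }
        this.regionSize = regionSize;
        // A mapping stays valid after its channel is closed, so the channel is only needed here.
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            this.length = channel.size();
            int count = (int) ((length + regionSize - 1) / regionSize);
            this.regions = new MappedByteBuffer[count];
            for (int i = 0; i < count; i++) {
                long start = i * regionSize;                        // may exceed Integer.MAX_VALUE
                long size = Math.min(regionSize, length - start);   // the last region may be shorter
                regions[i] = channel.map(FileChannel.MapMode.READ_ONLY, start, size);
            }
        }
    }

    /** Returns the byte at an absolute file position. */
    public byte get(long position) {
        if (position < 0 || position >= length) {
            throw new IndexOutOfBoundsException("position " + position);
        }
        int region = (int) (position / regionSize);
        int offset = (int) (position % regionSize);
        return regions[region].get(offset);
    }

    public long length() {
        return length;
    }
}
```

With a region size of, say, 1 GiB, `new ChunkedMappedFile(path, 1L << 30).get(5_000_000_000L)` would read from the fifth region. Multi-byte values that straddle a region boundary would need extra handling, which this sketch leaves out.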

In the book Java NIO, the author Ron Hitchens states:

  

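"Accessing a file through the memory-mapping mechanism can be far more efficient than reading or writing data by conventional means, even when using channels. No explicit system calls need to be made, which can be time-consuming. More importantly, the virtual memory system of the operating system automatically caches memory pages. These pages will be cached using system memory and will not consume space from the JVM's memory heap.

Once a memory page has been made valid (brought in from disk), it can be accessed again at full hardware speed without the need to make another system call to get the data. Large, structured files that contain indexes or other sections that are referenced or updated frequently can benefit tremendously from memory mapping. When combined with file locking to protect critical sections and control transactional atomicity, you begin to see how memory mapped buffers can be put to good use."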

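I really doubt that you will find a third-party API doing anything better than that. Perhaps you can find an API written on top of this architecture to simplify the work.

Don't you think this approach ought to work for you?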

Answer 1 (score: 11)

You can make a BufferedInputStream from a RandomAccessFile with code like:
 RandomAccessFile raf = ...
 FileInputStream fis = new FileInputStream(raf.getFD());
 BufferedInputStream bis = new BufferedInputStream(fis);

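Some things to note:

  1. Closing the FileInputStream will close the RandomAccessFile, and vice versa.
  2. The RandomAccessFile and the FileInputStream point to the same position, so reading from the FileInputStream advances the file pointer of the RandomAccessFile, and vice versa.
  3. You probably want to use it like this: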

    RandomAccessFile raf = ...
    FileInputStream fis = new FileInputStream(raf.getFD());
    BufferedInputStream bis = new BufferedInputStream(fis);
    
    //do some reads with buffer
    bis.read(...);
    bis.read(...);
    
    //seek to a different section of the file, so discard the previous buffer
    raf.seek(...);
    bis = new BufferedInputStream(fis);
    bis.read(...);
    bis.read(...);
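
The seek-then-rebuffer pattern above could be wrapped in a small class so that callers cannot forget to discard the stale buffer. This is only a sketch of that idea; the name SeekableBufferedReader is hypothetical, and it relies on the shared file descriptor behaviour described in notes 1 and 2.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical wrapper: buffered reads plus seek over one shared file descriptor. */
public class SeekableBufferedReader implements AutoCloseable {
    private final RandomAccessFile raf;
    private final FileInputStream fis;
    private BufferedInputStream bis;

    public SeekableBufferedReader(RandomAccessFile raf) throws IOException {
        this.raf = raf;
        // The stream shares the descriptor, and therefore the file pointer, with raf.
        this.fis = new FileInputStream(raf.getFD());
        this.bis = new BufferedInputStream(fis);
    }

    /** Moves the file pointer and drops the old buffer, whose contents no longer match. */
    public void seek(long position) throws IOException {
        raf.seek(position);
        bis = new BufferedInputStream(fis);
    }

    public int read() throws IOException {
        return bis.read();
    }

    public int read(byte[] dst, int off, int len) throws IOException {
        return bis.read(dst, off, len);
    }

    @Override
    public void close() throws IOException {
        // Closing the RandomAccessFile also closes the shared descriptor used by the stream.
        raf.close();
    }
}
```

Every seek throws away whatever was buffered, so this only pays off when each seek is followed by a run of sequential reads.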
    

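Answer 2 (score: 2)

"RandomAccessFile is quite slow for random access to a file. You often read about implementing a buffered layer over it, but code that does this is impossible to find online."

Well, it is possible to find online.
For one, the JAI source code in jpeg2000 has an implementation, and there is an even less encumbered implementation at: http://www.unidata.ucar.edu/software/netcdf-java/

Javadocs:

http://www.unidata.ucar.edu/software/thredds/v4.3/netcdf-java/v4.0/javadoc/ucar/unidata/io/RandomAccessFile.html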

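Answer 3 (score: 1)

If you're running on a 64-bit machine, then memory-mapped files are your best approach. Simply map the entire file into an array of equal-sized buffers, then pick a buffer for each record as needed (i.e., edalorzo's answer; however, you want overlapping buffers so that you don't have records that span boundaries).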
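
One way the overlapping buffers could look is sketched below: each segment starts at a multiple of a stride but maps a few extra bytes past it, so a record that begins near the end of one stride is still fully contained in that segment. The class name OverlappingMappedFile and the stride and maxRecordSize parameters are assumptions made for this example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical helper: mapped segments that overlap by (maxRecordSize - 1) bytes. */
public class OverlappingMappedFile {
    private final MappedByteBuffer[] segments;
    private final int stride;

    public OverlappingMappedFile(FileChannel channel, int stride, int maxRecordSize) throws IOException {
        if (stride <= 0 || maxRecordSize <= 0
                || (long) stride + maxRecordSize - 1 > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("stride + maxRecordSize - 1 must fit in an int");
        }
        this.stride = stride;
        long length = channel.size();
        int count = (int) ((length + stride - 1) / stride);
        this.segments = new MappedByteBuffer[count];
        for (int i = 0; i < count; i++) {
            long start = (long) i * stride;
            // Map the stride plus the overlap, clamped to the end of the file.
            long size = Math.min((long) stride + maxRecordSize - 1, length - start);
            segments[i] = channel.map(FileChannel.MapMode.READ_ONLY, start, size);
        }
    }

    /** Returns a view of recordSize bytes (at most maxRecordSize) starting at an absolute position. */
    public ByteBuffer record(long position, int recordSize) {
        ByteBuffer view = segments[(int) (position / stride)].duplicate();
        int offset = (int) (position % stride);
        view.position(offset);
        view.limit(offset + recordSize);
        return view.slice();
    }
}
```

The returned view shares memory with the mapping, so no bytes are copied; it uses big-endian order by default, like any freshly created ByteBuffer.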

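If you're running on a 32-bit JVM, then you're stuck with RandomAccessFile. However, you can use it to read a byte[] that contains your entire record, then use a ByteBuffer to retrieve individual values from that array. At worst you should need to make two file accesses: one to retrieve the position/size of the record, and one to retrieve the record itself.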
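
As a rough illustration of that approach, the sketch below reads a whole record with one seek and one bulk read, then hands back a ByteBuffer for pulling out the fields. The RecordReader class and the record layout in the usage note are invented for the example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

/** Hypothetical helper: one seek and one bulk read per record. */
public class RecordReader {

    /** Reads recordSize bytes starting at position and wraps them in a ByteBuffer. */
    public static ByteBuffer readRecord(RandomAccessFile raf, long position, int recordSize)
            throws IOException {
        byte[] record = new byte[recordSize];
        raf.seek(position);
        // readFully blocks until the array is filled, or throws EOFException if the file ends early.
        raf.readFully(record);
        return ByteBuffer.wrap(record);
    }
}
```

For a record assumed to hold an int followed by a long, `ByteBuffer rec = RecordReader.readRecord(raf, pos, 12);` followed by `rec.getInt()` and `rec.getLong()` decodes both fields without touching the file again. ByteBuffer defaults to big-endian, which matches what RandomAccessFile.writeInt and writeLong produce.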

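However, be aware that you can start stressing the garbage collector if you create lots of byte[]s, and you'll remain IO-bound if you bounce all over the file.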

Answer 4 (score: 1)

The Apache PDFBox project has a nice, well-tested BufferedRandomAccessFile class.
It is licensed under the Apache License, Version 2.0.

It is an optimized version of the java.io.RandomAccessFile class, as described by Nick Zhang on JavaWorld.com. It is based on the jmzreader implementation and augmented to handle unsigned bytes.

See the source code here:

  1. https://github.com/apache/pdfbox/.../fontbox/ttf/BufferedRandomAccessFile.java
  2. https://svn.apache.org/repos/asf/pdfbox/.../fontbox/ttf/BufferedRandomAccessFile.java

Answer 5 (score: 0)

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;

/**
 * Adds caching to a random access file.
 * 
 * Rather than writing directly to disk or to the system, which seems to be
 * what RandomAccessFile and FileChannel do, this adds a small buffer and reads from
 * or writes to it when possible. A single buffer is used, so reads or writes near
 * each other are sped up. Reads and writes that fall outside the cached block
 * are not sped up.
 */
public class BufferedRandomAccessFile implements AutoCloseable {

    private static final int DEFAULT_BUFSIZE = 4096;

    /**
     * The wrapped random access file, we will hold a cache around it.
     */
    private final RandomAccessFile raf;

    /**
     * The size of the buffer
     */
    private final int bufsize;

    /**
     * The buffer.
     */
    private final byte buf[];


    /**
     * Current position in the file.
     */
    private long pos = 0;

    /**
     * When the buffer has been read, this tells us where in the file the buffer
     * starts at.
     */
    private long bufBlockStart = Long.MAX_VALUE;


    // Must be updated on every write to the file.
    private long actualFileLength = -1;

    // True when the buffer holds changes that have not been written to the file yet.
    private boolean changeMadeToBuffer = false;

    // Must be updated as we write to the buffer.
    private long virtualFileLength = -1;

    public BufferedRandomAccessFile(File name, String mode) throws FileNotFoundException {
        this(name, mode, DEFAULT_BUFSIZE);
    }

    /**
     * 
     * @param file
     * @param mode how to open the random access file.
     * @param b size of the buffer
     * @throws FileNotFoundException
     */
    public BufferedRandomAccessFile(File file, String mode, int b) throws FileNotFoundException {
        this(new RandomAccessFile(file, mode), b);
    }

    public BufferedRandomAccessFile(RandomAccessFile raf) throws FileNotFoundException {
        this(raf, DEFAULT_BUFSIZE);
    }

    public BufferedRandomAccessFile(RandomAccessFile raf, int b) {
        this.raf = raf;
        try {
            this.actualFileLength = raf.length();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        this.virtualFileLength = actualFileLength;
        this.bufsize = b;
        this.buf = new byte[bufsize];
    }

    /**
     * Sets the position of the byte at which the next read/write should occur.
     * 
     * @param pos
     * @throws IOException
     */
    public void seek(long pos) throws IOException{
        this.pos = pos;
    }

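    /**
     * Sets the length of the file.
     */
    public void setLength(long fileLength) throws IOException {
        // Flush pending writes first so they are not lost by the resize.
        writeBufferToDisk();
        this.raf.setLength(fileLength);
        // Keep both tracked lengths in sync, whether the file shrank or grew.
        this.actualFileLength = fileLength;
        this.virtualFileLength = fileLength;
        // Invalidate the buffer so that it is reloaded from the resized file on next use.
        this.bufBlockStart = Long.MAX_VALUE;
    }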

    /**
     * Writes the entire buffer to disk, if needed.
     */
    private void writeBufferToDisk() throws IOException {
        if(!changeMadeToBuffer) return;
        int amountOfBufferToWrite = (int) Math.min((long) bufsize, virtualFileLength - bufBlockStart);
        if(amountOfBufferToWrite > 0) {
            raf.seek(bufBlockStart);
            raf.write(buf, 0, amountOfBufferToWrite);
            this.actualFileLength = virtualFileLength;
        }
        changeMadeToBuffer = false;
    }

    /**
     * Flush the buffer to disk and force a sync.
     */
    public void flush() throws IOException {
        writeBufferToDisk();
        this.raf.getChannel().force(false);
    }

    /**
     * Based on pos, ensures that the buffer is the one that contains pos.
     * 
     * After this call it is safe to write to the buffer to update the byte at pos.
     * If this returns true, reading the byte at pos is valid, because a previous write
     * or setLength call has made the file large enough to have a byte at pos.
     * 
     * @return true if the buffer contains data that may be read at pos, i.e. a write or
     * setLength call has made the file length greater than the current position.
     */
    private boolean readyBuffer() throws IOException {
        boolean isPosOutSideOfBuffer = pos < bufBlockStart || bufBlockStart + bufsize <= pos;

        if (isPosOutSideOfBuffer) {

            writeBufferToDisk();

            // The buffer is always positioned to start at a multiple of a bufsize offset.
            // e.g. for a buf size of 4 the starting positions of buffers can be at 0, 4, 8, 12..
            // Work out where the buffer block should start for the given position. 
            long bufferBlockStart = (pos / bufsize) * bufsize;

            assert bufferBlockStart >= 0;

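            // If the file is large enough, read it into the buffer.
            // If the file is not large enough we have nothing to read into the buffer.
            // In both cases the buffer will be ready to have writes made to it.
            int filled = 0;
            if(bufferBlockStart < actualFileLength) {
                raf.seek(bufferBlockStart);
                // read() may return fewer bytes than requested, so loop until the block is full or EOF.
                int n;
                while(filled < bufsize && (n = raf.read(buf, filled, bufsize - filled)) > 0) {
                    filled += n;
                }
            }
            // Zero the rest so stale bytes from the previous block are never read or written back.
            java.util.Arrays.fill(buf, filled, bufsize, (byte) 0);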

            bufBlockStart = bufferBlockStart;
        }

        return pos < virtualFileLength;
    }

    /**
     * Reads a byte from the file, returning an integer of 0-255, or -1 if it has reached the end of the file.
     * 
     * @return
     * @throws IOException 
     */
    public int read() throws IOException {
        if(readyBuffer() == false) {
            return -1;
        }
        try {
            return (buf[(int)(pos - bufBlockStart)]) & 0x000000ff ; 
        } finally {
            pos++;
        }
    }

    /**
     * Write a single byte to the file.
     * 
     * @param b
     * @throws IOException
     */
    public void write(byte b) throws IOException {
        readyBuffer(); // ignore result we don't care.
        buf[(int)(pos - bufBlockStart)] = b;
        changeMadeToBuffer = true;
        pos++;
        if(pos > virtualFileLength) {
            virtualFileLength = pos;
        }
    }

    /**
     * Writes all given bytes to the random access file at the current position.
     */
    public void write(byte[] bytes) throws IOException {
        int written = 0;
        int bytesToWrite = bytes.length;

        // Fill the remainder of the current buffer block first.
        readyBuffer();
        int startPositionInBuffer = (int)(pos - bufBlockStart);
        int lengthToWriteToBuffer = Math.min(bytesToWrite - written, bufsize - startPositionInBuffer);
        assert startPositionInBuffer + lengthToWriteToBuffer <= bufsize;

        System.arraycopy(bytes, written,
                        buf, startPositionInBuffer,
                        lengthToWriteToBuffer);
        pos += lengthToWriteToBuffer;
        if(pos > virtualFileLength) {
            virtualFileLength = pos;
        }
        written += lengthToWriteToBuffer;
        this.changeMadeToBuffer = true;

        // Write whatever did not fit straight to the random access file.
        if(written < bytesToWrite) {
            writeBufferToDisk();
            int toWrite = bytesToWrite - written;
            // Position the file explicitly instead of relying on where the flush left it.
            raf.seek(pos);
            raf.write(bytes, written, toWrite);
            pos += toWrite;
            if(pos > virtualFileLength) {
                virtualFileLength = pos;
                actualFileLength = virtualFileLength;
            }
        }
    }

    /**
     * Reads up to bytes.length bytes into the given array.
     * 
     * @return the number of bytes read, which is 0 at the end of the file.
     */
    public int read(byte[] bytes) throws IOException {
        int read = 0;
        int bytesToRead = bytes.length;
        while(read < bytesToRead) {

            //First see if we need to fill the cache
            if(readyBuffer() == false) {
                //No more to read;
                return read;
            }

            //Now read as much as we can (or need) from the cache and place it
            //in the given byte[], never reading past the end of the file.
            int startPositionInBuffer = (int)(pos - bufBlockStart);
            int lengthToReadFromBuffer = (int) Math.min(
                    Math.min(bytesToRead - read, bufsize - startPositionInBuffer),
                    virtualFileLength - pos);

            System.arraycopy(buf, startPositionInBuffer, bytes, read, lengthToReadFromBuffer);

            pos += lengthToReadFromBuffer;
            read += lengthToReadFromBuffer;
        }

        return read;
    }

    public void close() throws IOException {
        try {
            this.writeBufferToDisk();
        } finally {
            raf.close();
        }
    }

    /**
     * Gets the length of the file.
     * 
     * @return
     * @throws IOException
     */
    public long length() throws IOException{
        return virtualFileLength;
    }

}