I am having trouble reading very large seismic files with NIO MappedByteBuffer. The format my program reads is called SEGY and consists of seismic data samples as well as metadata about, among other items, the numeric ID and XY coordinates of the seismic data.
The structure of the format is fairly fixed: a 240-byte header followed by a fixed number of data samples that make up each seismic trace. The number of samples per trace can vary from file to file, but is typically between 1000 and 2000.
Samples can be written as single bytes, as 16- or 32-bit integers, or as IBM or IEEE floating-point numbers. The data in each trace header can likewise be in any of these formats. To confuse matters further, SEGY files can be in big- or little-endian byte order.
Files can range in size from 3600 bytes up to several gigabytes.
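For concreteness, this fixed layout means any trace's byte offset can be computed directly. A minimal sketch (the 3600-byte file header and 240-byte trace header come from the format; the class and method names are my own):

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch of the fixed SEGY layout: a 3600-byte file header (3200-byte EBCDIC
// header plus 400-byte binary header), then traces of 240-byte header + samples.
public class SegyLayout
{
    static final int FILE_HEADER_BYTES = 3600;
    static final int TRACE_HEADER_BYTES = 240;

    // Byte offset of 0-based trace n; bytesPerTrace = 240 + nsamples * sampleSize.
    static long traceOffset(int n, int bytesPerTrace)
    {
        return FILE_HEADER_BYTES + (long) n * bytesPerTrace;
    }

    // Pull one 32-bit value out of a trace header, honouring the file's byte order.
    static int headerInt(ByteBuffer traceHeader, int fieldOffset, boolean bigEndian)
    {
        traceHeader.order(bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        return traceHeader.getInt(fieldOffset);
    }
}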
My application is a SEGY editor and viewer. For many of its functions it has to read only one or two variables, say a long integer, from each trace header.
Currently I am reading from a RandomAccessFile into a byte buffer and then extracting the required variables from a view buffer. This works, but is painfully slow for very large files.
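Simplified, my current read path looks something like this (a sketch, not my exact code):

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class SlowPath
{
    // One seek plus one small read per trace header; paying a syscall per
    // trace is what makes this slow on large files.
    static int readHeaderInt(RandomAccessFile raf, long traceOffset,
                             int fieldOffset, ByteOrder order) throws IOException
    {
        byte[] hdr = new byte[240];
        raf.seek(traceOffset);
        raf.readFully(hdr);
        return ByteBuffer.wrap(hdr).order(order).getInt(fieldOffset);
    }
}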
I wrote a new file handler using mapped byte buffers that splits the file into 5000-trace MappedByteBuffers. This works well and is very fast, until my system runs out of memory; then everything slows to a crawl and I am forced to reboot just to make my Mac usable again.
For some reason the memory held by the buffers is never released, even after my program has finished. I have to do a purge or a reboot.
Here is my code. Any suggestions would be most welcome.
package MyFileHandler;
import java.io.*;
import java.nio.*;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
public class MyFileHandler
{
/*
A buffered file I/O class that keeps NTRACES traces in memory for reading and writing.
The buffers start and end at trace boundaries and are sequential,
i.e. traces 1-5000, 5001-10000, etc. for NTRACES = 5000.
The last (or perhaps only) buffer will contain fewer than NTRACES traces, up to the last trace.
The arrays BufferOffsets and BufferLengths hold the start offset and length of every
buffer required to read and write the file.
*/
private static int NTRACES = 5000;
private boolean HighByte;
private long FileSize;
private int BytesPerTrace;
private FileChannel FileChnl;
private MappedByteBuffer Buffer;
private long BufferOffset;
private int BufferLength;
private long[] BufferOffsets;
private int[] BufferLengths;
private RandomAccessFile Raf;
private int BufferIndex;
private ArrayList<MappedByteBuffer> Maps;
public MyFileHandler(RandomAccessFile raf, int bpt)
{
try
{
HighByte = true;
// allocate a filechannel to the file
FileChnl = raf.getChannel();
FileSize = FileChnl.size();
BytesPerTrace = bpt;
SetUpBuffers();
BufferIndex = 0;
GetNewBuffer(0);
} catch (IOException ioe)
{
ioe.printStackTrace();
}
}
private void SetUpBuffers()
{
// get number of traces in entire file
int ntr = (int) ((FileSize - 3600) / BytesPerTrace);
int nbuffs = ntr / NTRACES;
// add one to nbuffs unless the trace count is an exact multiple of NTRACES
if (ntr % NTRACES != 0)
{
nbuffs++;
}
BufferOffsets = new long[nbuffs];
BufferLengths = new int[nbuffs];
// BuffOffset are in bytes, not trace numbers
//get the offsets and lengths of each buffer
for (int i = 0; i < nbuffs; i++)
{
if (i == 0)
{
// first buffer contains EBCDIC header 3200 bytes and binary header 400 bytes
BufferOffsets[i] = 0;
BufferLengths[i] = 3600 + (Math.min(ntr, NTRACES) * BytesPerTrace);
} else
{
BufferOffsets[i] = BufferOffsets[i - 1] + BufferLengths[i - 1];
BufferLengths[i] = (int) (Math.min(FileSize - BufferOffsets[i], NTRACES * BytesPerTrace));
}
}
GetMaps();
}
private void GetMaps()
{
// map the file to list of MappedByteBuffer
Maps = new ArrayList<MappedByteBuffer>(BufferOffsets.length);
try
{
for(int i=0;i<BufferOffsets.length;i++)
{
MappedByteBuffer map = FileChnl.map(FileChannel.MapMode.READ_WRITE, BufferOffsets[i], BufferLengths[i]);
SetByteOrder(map);
Maps.add(map);
}
} catch (IOException ioe)
{
ioe.printStackTrace();
}
}
private void GetNewBuffer(long offset)
{
if (Buffer == null || offset < BufferOffset || offset >= BufferOffset + BufferLength)
{
BufferIndex = GetBufferIndex(offset);
BufferOffset = BufferOffsets[BufferIndex];
BufferLength = BufferLengths[BufferIndex];
Buffer = Maps.get(BufferIndex);
}
}
private int GetBufferIndex(long offset)
{
int indx = 0;
for (int i = 0; i < BufferOffsets.length; i++)
{
if (offset >= BufferOffsets[i] && offset < BufferOffsets[i]+BufferLengths[i])
{
indx = i;
break;
}
}
return indx;
}
private void SetByteOrder(MappedByteBuffer ByteBuff)
{
if (HighByte)
{
ByteBuff.order(ByteOrder.BIG_ENDIAN);
} else
{
ByteBuff.order(ByteOrder.LITTLE_ENDIAN);
}
}
// public methods to read, (get) or write (put) an array of types, byte, short, int, or float.
// for sake of brevity only showing get and put for ints
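// NOTE: Get/Put assume the requested array lies entirely within the buffer
// containing 'offset'; because buffers are aligned to trace boundaries this
// holds as long as a single call stays within one run of NTRACES traces.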
public void Get(int[] buff, long offset)
{
GetNewBuffer(offset);
Buffer.position((int) (offset - BufferOffset));
Buffer.asIntBuffer().get(buff);
}
public void Put(int[] buff, long offset)
{
GetNewBuffer(offset);
Buffer.position((int) (offset - BufferOffset));
Buffer.asIntBuffer().put(buff);
}
public void HighByteOrder(boolean hb)
{
// all byte swapping is done by the buffer class;
// apply the same byte order to every buffer already allocated
HighByte = hb;
for (MappedByteBuffer map : Maps)
{
SetByteOrder(map);
}
}
public int GetBuffSize()
{
return BufferLength;
}
public void Close()
{
try
{
FileChnl.close();
} catch (Exception e)
{
e.printStackTrace();
}
}
}
Answer 0 (score 0):
You are mapping the entire file into memory via a potentially large number of MappedByteBuffers, and because you keep references to all of them, they are never released. That is pointless. You may as well map the entire file with a single MappedByteBuffer, or with the minimum number needed to get around the address limitation (a single mapping cannot exceed Integer.MAX_VALUE bytes). There is no benefit in using more of them than you need.
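A minimal sketch of that approach, assuming a 64-bit JVM (FileChannel.map() caps each region at Integer.MAX_VALUE bytes, so even a multi-gigabyte file only needs a handful of mappings):

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class WholeFileMap
{
    // One mapping per chunk of at most Integer.MAX_VALUE bytes. In practice
    // you would round each chunk down to a trace boundary.
    static MappedByteBuffer[] mapWholeFile(RandomAccessFile raf) throws IOException
    {
        FileChannel ch = raf.getChannel();
        long size = ch.size();
        long chunk = Integer.MAX_VALUE;
        int n = (int) ((size + chunk - 1) / chunk);
        MappedByteBuffer[] maps = new MappedByteBuffer[n];
        long pos = 0;
        for (int i = 0; i < n; i++)
        {
            long len = Math.min(size - pos, chunk);
            maps[i] = ch.map(FileChannel.MapMode.READ_WRITE, pos, len);
            pos += len;
        }
        return maps;
    }
}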
But I would only map the segment of the file that is currently being viewed or edited, and release it when the user moves to another segment.
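Something along these lines, as an untested sketch (the names are illustrative; note the JDK gives you no supported way to unmap a MappedByteBuffer explicitly, so the old window is reclaimed only after the garbage collector gets to it):

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class Window
{
    private final FileChannel ch;
    private MappedByteBuffer map;  // the single live mapping
    private long mapStart = -1;
    private int mapLen;

    Window(FileChannel ch) { this.ch = ch; }

    // Remap only when the request falls outside the current window; dropping
    // the old reference lets the GC eventually unmap it.
    MappedByteBuffer buffer(long offset, int windowBytes) throws IOException
    {
        if (map == null || offset < mapStart || offset >= mapStart + mapLen)
        {
            map = null;  // release our only reference to the old mapping
            mapLen = (int) Math.min(windowBytes, ch.size() - offset);
            map = ch.map(FileChannel.MapMode.READ_WRITE, offset, mapLen);
            mapStart = offset;
        }
        return map;
    }
}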
I am surprised to hear that the mapped version is so much faster. The last time I tested, reads via mapped byte buffers were only about 20% faster than reads via RandomAccessFile, and writes were no faster at all. I would like to see the RandomAccessFile code, because it seems probable that there is something wrong with it that could easily be fixed.
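For reference, this is roughly the RandomAccessFile pattern I would expect to be competitive: read a whole run of traces with one bulk read, then pick fields out of the in-memory block (a sketch; the parameters are illustrative):

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BulkRead
{
    // Amortise the syscall cost: one readFully() for nTraces whole traces,
    // then extract the wanted 32-bit header field from each trace in memory.
    static int[] readHeaderField(RandomAccessFile raf, long firstTraceOffset,
                                 int bytesPerTrace, int nTraces,
                                 int fieldOffset, ByteOrder order) throws IOException
    {
        byte[] block = new byte[bytesPerTrace * nTraces];
        raf.seek(firstTraceOffset);
        raf.readFully(block);
        ByteBuffer bb = ByteBuffer.wrap(block).order(order);
        int[] values = new int[nTraces];
        for (int i = 0; i < nTraces; i++)
        {
            values[i] = bb.getInt(i * bytesPerTrace + fieldOffset);
        }
        return values;
    }
}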