BufferedReader for a large ByteBuffer?

Time: 2009-06-25 19:00:27

Tags: java nio bufferedreader bytebuffer

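Is there a way to read a ByteBuffer with a BufferedReader without first turning it into a String? I want to read lines of text from a fairly large ByteBuffer, and for performance reasons I want to avoid writing it to disk. Calling toString on the ByteBuffer doesn't work because the resulting String is too large (it throws java.lang.OutOfMemoryError: Java heap space). I would have thought the API had something to wrap a ByteBuffer in a suitable reader, but I can't seem to find anything that fits.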

Here is an abbreviated code sample illustrating what I'm doing:

// input stream is from Process getInputStream()
public String read(InputStream istream) throws IOException
{
  ReadableByteChannel source = Channels.newChannel(istream);
  ByteArrayOutputStream ostream = new ByteArrayOutputStream(bufferSize);
  WritableByteChannel destination = Channels.newChannel(ostream);
  ByteBuffer buffer = ByteBuffer.allocateDirect(writeBufferSize);

  while (source.read(buffer) != -1)
  {
    buffer.flip();
    while (buffer.hasRemaining())
    {
      destination.write(buffer);
    }
    buffer.clear();
  }

  // this data can be up to 150 MB.. won't fit in a String.
  String result = ostream.toString();
  source.close();
  destination.close();
  return result;
}

// after the process is run, we call this method with the String
public void readLines(String text) throws IOException
{
  BufferedReader reader = new BufferedReader(new StringReader(text));
  String line;

  while ((line = reader.readLine()) != null)
  {
    // do stuff with line
  }
}

3 answers:

Answer 0 (score: 5)

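It's not clear why you're using a byte buffer to start with. If you have an InputStream and you want to read lines from it, why not just use an InputStreamReader wrapped in a BufferedReader? What's the benefit of getting NIO involved?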
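
The suggestion above can be sketched roughly as follows. This is only an illustration under assumptions: the class name `StreamLineSketch` and the method `forEachLine` are made up for the example, and UTF-8 is assumed as the encoding of the process output.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical helper, not part of the original answer.
class StreamLineSketch {
    // Reads the stream line by line, so only the current line
    // (plus the reader's internal buffer) is held in memory.
    static void forEachLine(InputStream in, Consumer<String> handler) throws IOException {
        // UTF-8 is an assumption; use whatever encoding the process actually emits.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                handler.accept(line); // do stuff with line
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream returned by Process.getInputStream().
        InputStream in = new ByteArrayInputStream(
                "first\nsecond\n".getBytes(StandardCharsets.UTF_8));
        forEachLine(in, System.out::println);
    }
}
```

With this shape the 150 MB of output never has to exist as a single String or byte array, which is the point of the question.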

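Calling toString() on a ByteArrayOutputStream sounds like a bad idea to me even if you had the space for it: if you really must have a ByteArrayOutputStream, it's better to get the data as a byte array and wrap it in a ByteArrayInputStream and then an InputStreamReader. If you really want to call toString(), at least use the overload that takes the name of the character encoding; otherwise it will use the system default, which is probably not what you want.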
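
For illustration, the two options mentioned above might look like this. The class `BaosDecodingSketch` and its method names are hypothetical, and UTF-8 is an assumed encoding.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original answer.
class BaosDecodingSketch {
    // Option 1: decode everything at once with an explicit charset name.
    // This still builds one large String, so it does not solve the memory problem.
    static String decodeAll(ByteArrayOutputStream baos) throws UnsupportedEncodingException {
        return baos.toString("UTF-8");
    }

    // Option 2: expose the bytes through a reader and consume them line by line.
    // Note that toByteArray() makes a copy of the internal buffer.
    static BufferedReader lineReader(ByteArrayOutputStream baos) {
        byte[] data = baos.toByteArray();
        return new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8));
    }
}
```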

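EDIT: Okay, so you really want to use NIO. You're still writing to a ByteArrayOutputStream eventually, so you'll end up with a BAOS containing the data. If you want to avoid making a copy of that data, you'll need to derive from ByteArrayOutputStream, for example: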

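import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

public class ReadableByteArrayOutputStream extends ByteArrayOutputStream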
{
    /**
     * Converts the data in the current stream into a ByteArrayInputStream.
     * The resulting stream wraps the existing byte array directly;
     * further writes to this output stream will result in unpredictable
     * behavior.
     */
    public InputStream toInputStream()
    {
        // buf and count are the protected fields inherited from ByteArrayOutputStream
        return new ByteArrayInputStream(buf, 0, count);
    }
}

Then you can create the input stream, wrap it in an InputStreamReader, wrap that in a BufferedReader, and you're away.
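
A rough sketch of that wrapping chain follows. The subclass is repeated so the snippet compiles on its own; it relies on the protected `buf` and `count` fields of ByteArrayOutputStream. `BaosLineSketch` and `forEachLine` are hypothetical names, and UTF-8 is an assumed encoding.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

class ReadableByteArrayOutputStream extends ByteArrayOutputStream {
    // Wraps the internal array without copying; do not write to this
    // stream after calling toInputStream().
    public synchronized InputStream toInputStream() {
        return new ByteArrayInputStream(buf, 0, count);
    }
}

// Hypothetical helper, not part of the original answer.
class BaosLineSketch {
    static void forEachLine(ReadableByteArrayOutputStream out, Consumer<String> handler)
            throws IOException {
        // InputStream -> InputStreamReader -> BufferedReader, as described above.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(out.toInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                handler.accept(line); // do stuff with line
            }
        }
    }

    public static void main(String[] args) throws IOException {
        ReadableByteArrayOutputStream out = new ReadableByteArrayOutputStream();
        byte[] bytes = "one\ntwo\n".getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
        forEachLine(out, System.out::println);
    }
}
```

The bytes are still held in memory once, but no second copy and no giant String are created.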

Answer 1 (score: 4)

You could use NIO, but there's no real need for it here. As Jon Skeet suggested:

public byte[] read(InputStream istream) throws IOException
{
  ByteArrayOutputStream baos = new ByteArrayOutputStream();
  byte[] buffer = new byte[1024]; // Experiment with this value
  int bytesRead;

  while ((bytesRead = istream.read(buffer)) != -1)
  {
    baos.write(buffer, 0, bytesRead);
  }

  return baos.toByteArray();
}


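// after the process is run, we call this method with the byte array
public void readLines(byte[] data) throws IOException
{
  // No charset is given here, so the platform default is used; pass one explicitly if the encoding is known
  BufferedReader reader = new BufferedReader(new InputStreamReader(new ByteArrayInputStream(data)));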
  String line;

  while ((line = reader.readLine()) != null)
  {
    // do stuff with line
  }
}

Answer 2 (score: 0)

Here is an example:

import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

public class ByteBufferBackedInputStream extends InputStream {

    ByteBuffer buf;

    public ByteBufferBackedInputStream(ByteBuffer buf) {
        this.buf = buf;
    }

    public synchronized int read() throws IOException {
        if (!buf.hasRemaining()) {
            return -1;
        }
        return buf.get() & 0xFF;
    }

    @Override
    public int available() throws IOException {
        return buf.remaining();
    }

    public synchronized int read(byte[] bytes, int off, int len) throws IOException {
        if (!buf.hasRemaining()) {
            return -1;
        }

        len = Math.min(len, buf.remaining());
        buf.get(bytes, off, len);
        return len;
    }
}

You can use it like this:

    String text = "this is text";   // It can be Unicode text
    ByteBuffer buffer = ByteBuffer.wrap(text.getBytes("UTF-8"));

    InputStream is = new ByteBufferBackedInputStream(buffer);
    InputStreamReader r = new InputStreamReader(is, "UTF-8");
    BufferedReader br = new BufferedReader(r);

    String line;
    while ((line = br.readLine()) != null) {
        // do stuff with line
    }