Question

I'm trying to understand how input streams work. The following block of code is one of the many ways to read data from a text file:

File file = new File("./src/test.txt");
InputStream input = new BufferedInputStream(new FileInputStream(file));
int data;

// read() returns the next byte as an int in the range 0-255, or -1 at the end of the file
while ((data = input.read()) != -1)
 {
     // a byte that encodes 'a' comes back as 97, 'b' as 98, and so on
     System.out.println(data + " " + (char) data); // prints the number, a space, then the character
 }

input.close();

Now, using input.read(byte[], offset, length), I have this code. I got it from here:

File file = new File("./src/test.txt");
InputStream input = new BufferedInputStream(new FileInputStream(file));
int totalBytesRead = 0, bytesRemaining, bytesRead;
byte[] result = new byte[(int) file.length()];

while (totalBytesRead < result.length)
 {
     bytesRemaining = result.length - totalBytesRead;
     bytesRead = input.read(result, totalBytesRead, bytesRemaining);
     if (bytesRead == -1)
      break; // end of stream reached before the array was filled

     //printing integer version of the bytes read in this pass
     for (int i = totalBytesRead; i < totalBytesRead + bytesRead; i++)
      System.out.print(result[i] + " ");

     System.out.println();

     //printing character version of the bytes read in this pass
     for (int i = totalBytesRead; i < totalBytesRead + bytesRead; i++)
      System.out.print((char) result[i]);

     totalBytesRead = totalBytesRead + bytesRead;
 }

input.close();

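Based on the name bytesRead, I assume this read method returns the number of bytes read. The documentation says the function will try to read as many bytes as possible, so there may be reasons why it reads fewer.

My first question is: what are these reasons?

I could replace the entire while loop with one line of code: input.read(result, 0, result.length)

I'm sure the author of the article thought of this. It's not about the output, because I get the same output in both cases. So there must be a reason, at least one. What is it?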

Answer 1

The documentation of read(byte[], int, int) says:

Reads up to len bytes of data.
An attempt is made to read as many as len bytes,
but a smaller number may be read.

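Since we are working with a file on the hard disk, it seems reasonable to expect that the attempt will read the whole file, but input.read(result, 0, result.length) is not guaranteed to read the whole file (nothing in the documentation says it will). Relying on undocumented behavior is a source of bugs when that behavior changes.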

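For example, the file stream may be implemented differently in other JVMs, some operating systems may impose a limit on the number of bytes you can read at once, the file may be located on the network, or you may later use that piece of code with another stream implementation that does not behave this way.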
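To make the point concrete, here is a minimal sketch. TrickleStream is a hypothetical class invented for this illustration (it is not part of the JDK); it returns at most 3 bytes per call, which the InputStream contract allows. A single read call then fills only part of the array, while a loop like the one in the question fills all of it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ShortReadDemo {

    // Hypothetical stream for illustration only: it never hands out more
    // than 3 bytes per call, which the InputStream contract permits.
    static class TrickleStream extends InputStream {
        private final InputStream source;

        TrickleStream(InputStream source) {
            this.source = source;
        }

        @Override
        public int read() throws IOException {
            return source.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return source.read(b, off, Math.min(len, 3));
        }
    }

    // Keeps calling read until the array is full or the stream ends.
    // Returns the number of bytes actually stored in result.
    static int readAll(InputStream in, byte[] result) throws IOException {
        int total = 0;
        while (total < result.length) {
            int n = in.read(result, total, result.length - total);
            if (n == -1) {
                break; // end of stream before the array was filled
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);

        byte[] once = new byte[data.length];
        int single = new TrickleStream(new ByteArrayInputStream(data))
                .read(once, 0, once.length);
        System.out.println(single); // prints 3: one call did not fill the array

        byte[] all = new byte[data.length];
        int looped = readAll(new TrickleStream(new ByteArrayInputStream(data)), all);
        System.out.println(looped); // prints 10: the loop kept going until full
    }
}
```

A FileInputStream on a local disk will usually return everything in one call, which is why both versions print the same output in the question. The loop only matters once the stream behaves like TrickleStream.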

Alternatively, if you are reading the whole file into an array, maybe you could use DataInputStream.readFully
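A minimal sketch of the readFully approach follows. The helper name readExactly is made up for this example, and a ByteArrayInputStream stands in for the file so the snippet runs on its own; with a real file you would pass a FileInputStream and (int) file.length() instead.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class ReadFullyDemo {

    // Hypothetical helper: reads exactly 'length' bytes from 'in'.
    // readFully loops internally and throws EOFException if the stream
    // ends before the array is full. The caller still owns and closes 'in'.
    static byte[] readExactly(InputStream in, int length) throws IOException {
        byte[] result = new byte[length];
        new DataInputStream(in).readFully(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = {10, 20, 30, 40, 50};

        byte[] copy = readExactly(new ByteArrayInputStream(source), 5);
        System.out.println(copy.length); // prints 5

        try {
            // Asking for more bytes than the stream holds
            readExactly(new ByteArrayInputStream(source), 6);
        } catch (EOFException e) {
            System.out.println("stream ended early");
        }
    }
}
```

Unlike a bare read call, readFully either fills the array or fails loudly, so a short read cannot go unnoticed. On Java 7 and later, java.nio.file.Files.readAllBytes is another option for reading a whole file.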

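As for the loop with read(), it reads one byte at a time. If you are reading a lot of data, this hurts performance, because every call to read() performs several checks (has the stream ended? etc.) and may have to ask the operating system for a single byte. Since you already know you need file.length() bytes, there is no reason not to use the other, more efficient form.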

Answer 2

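Imagine you are reading from a network socket rather than from a file. In that case you have no information about the total number of bytes in the stream. You would allocate a fixed-size buffer and read from the stream in a loop. During a single iteration of the loop you cannot expect BUFFERSIZE bytes to be available in the stream. So you fill the buffer as far as you can and iterate again until the buffer is full. This can be useful if you have fixed-size blocks of data, for example serialized objects.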

ArrayList<MyObject> list = new ArrayList<MyObject>();

try {

    InputStream input = socket.getInputStream();
    byte[] buffer = new byte[1024];

    int bytesRead;
    int off = 0;
    int len = 1024;

    while (true) {
        bytesRead = input.read(buffer, off, len);

        if (bytesRead == len) {
            // buffer is full: one complete block has arrived
            list.add(createMyObject(buffer));
            // reset variables
            off = 0;
            len = 1024;
            continue;
        }

        if (bytesRead == -1) break;

        // buffer is not full, adjust offset and remaining length
        off += bytesRead;
        len -= bytesRead;
    }

} catch(IOException io) {
    // stream was closed
}

P.S. The code is not tested; it is only meant to show how this function can be useful.

Answer 3

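You specify the number of bytes to read because you may not want to read the whole file at once, or because you cannot or do not want to create a buffer as large as the file.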
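As a rough sketch of that idea, the snippet below processes a stream through a small reusable buffer instead of one array the size of the whole input. The method name sumBytes and the buffer size of 4 are arbitrary choices for the example, and a ByteArrayInputStream stands in for the file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ChunkedReadDemo {

    // Adds up every byte of the stream (treated as unsigned) while holding
    // at most bufferSize bytes in memory at any time.
    static long sumBytes(InputStream in, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long sum = 0;
        int n;
        // read returns how many bytes it stored this time, or -1 at the end
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            for (int i = 0; i < n; i++) {
                sum += buffer[i] & 0xFF;
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        // 10 bytes of input, but only a 4-byte buffer is ever allocated
        System.out.println(sumBytes(new ByteArrayInputStream(data), 4)); // prints 55
    }
}
```

Only the first n bytes of the buffer are valid after each call, which is why the inner loop stops at n rather than at buffer.length.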

Difference between input.read and input.read(array, offset, length)
