Copy all but the last 16 bytes of a stream? Early detection of end of stream?

Date: 2013-06-18 16:24:47

Tags: c# .net stream networkstream

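This is about C#. We need to copy an entire source stream to a destination stream, except for the last 16 bytes.

EDIT: The streams can be up to 40 GB in size, so a static byte[] allocation (e.g. .ToArray()) is not an option.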

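Looking at the MSDN documentation, it seems that we can reliably determine the end of the stream only when Read returns 0. A return value between 0 and the requested size can imply that bytes are "not currently available" (what does that really mean?).

Currently it copies every single byte as follows. inStream and outStream are generic: they can be memory, disk, or network streams (and actually some others too).

public static void StreamCopy(Stream inStream, Stream outStream)
{
    var buffer = new byte[8*1024];
    var last16Bytes = new byte[16];
    int bytesRead;
    while ((bytesRead = inStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        outStream.Write(buffer, 0, bytesRead);
    }
    // Issues:
    // 1. We already wrote the last 16 bytes into
    //    outStream (possibly over the n/w)
    // 2. last16Bytes = ? (inStream may not necessarily support rewinding)
}

What is a reliable way to ensure that everything except the last 16 bytes is copied? I can think of using Position and Length on inStream, but there is a gotcha on MSDN that says: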

  

If a class derived from Stream does not support seeking, calls to Length, SetLength, Position, and Seek throw a NotSupportedException.
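When the input does happen to be seekable, its length is known up front and the problem reduces to copying a fixed number of bytes. The following is a minimal sketch under that assumption; the class and method names (`SeekableCopy`, `TryCopyAllButTail`) are hypothetical and not from the original post. It checks `CanSeek` before touching `Length` or `Position`, so it does not run into the NotSupportedException quoted above, and it returns false so the caller can fall back to a buffering approach for non-seekable streams.

```csharp
using System;
using System.IO;

public static class SeekableCopy
{
    // Hypothetical helper (not from the original post). Returns false when
    // inStream is not seekable; Length and Position are never read in that case.
    public static bool TryCopyAllButTail(Stream inStream, Stream outStream, int tailSize = 16)
    {
        if (!inStream.CanSeek)
            return false;

        // Number of bytes to pass through. A long, because streams may be ~40 GB.
        long remaining = Math.Max(0L, inStream.Length - inStream.Position - tailSize);
        var buffer = new byte[8 * 1024];

        while (remaining > 0)
        {
            // Never request more than what is still allowed through.
            int want = (int)Math.Min(buffer.Length, remaining);
            int got = inStream.Read(buffer, 0, want);
            if (got == 0)
                break; // the stream ended earlier than Length suggested
            outStream.Write(buffer, 0, got);
            remaining -= got;
        }
        return true;
    }
}
```

After a successful call on a stream of at least tailSize bytes, inStream is positioned at the start of the withheld tail, so the caller can read the last 16 bytes from it directly.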

3 answers:

Answer 0 (score: 6)

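  1. Read between 1 and n bytes from the input stream. [1]

  2. Append the bytes to a circular buffer. [2]

  3. Write the first max(0, b - 16) bytes from the circular buffer to the output stream, where b is the number of bytes in the circular buffer.

  4. Remove the bytes you have just written from the circular buffer.

  5. Go to step 1.

[1] This is what the Read method does: if you call int n = Read(buffer, 0, 500);, it reads between 1 and 500 bytes into buffer and returns the number of bytes read. If Read returns 0, you have reached the end of the stream.

[2] For best performance, you can read the bytes directly from the input stream into the circular buffer. This is a bit tricky, because you have to deal with the wraparound in the array underlying the buffer.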
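The steps above can be sketched roughly as follows. This is one possible reading of the answer, not its author's code: the names `RingCopy` and `CopyAllButTail`, the 8 KiB chunk size, and the choice to return the withheld tail are all assumptions made for illustration. For simplicity it reads into a scratch array and then copies into the ring, rather than reading directly into the ring as the second footnote suggests.

```csharp
using System;
using System.IO;

public static class RingCopy
{
    // Copies inStream to outStream but withholds the final tailSize bytes,
    // which are returned instead (fewer if the stream is shorter than tailSize).
    public static byte[] CopyAllButTail(Stream inStream, Stream outStream, int tailSize = 16)
    {
        var chunk = new byte[8 * 1024];
        // The ring holds at most tailSize bytes between iterations, so
        // tailSize + chunk.Length always leaves room for one more read.
        var ring = new byte[tailSize + chunk.Length];
        int head = 0;   // index of the oldest byte in the ring
        int count = 0;  // bytes currently stored (b in the steps above)

        int n;
        while ((n = inStream.Read(chunk, 0, chunk.Length)) > 0)   // step 1
        {
            // Step 2: append the n new bytes, wrapping around if needed.
            int tail = (head + count) % ring.Length;
            int first = Math.Min(n, ring.Length - tail);
            Buffer.BlockCopy(chunk, 0, ring, tail, first);
            Buffer.BlockCopy(chunk, first, ring, 0, n - first);
            count += n;

            // Step 3: write the oldest max(0, b - tailSize) bytes.
            int toWrite = Math.Max(0, count - tailSize);
            int run = Math.Min(toWrite, ring.Length - head);
            outStream.Write(ring, head, run);
            outStream.Write(ring, 0, toWrite - run);

            // Step 4: drop what was just written.
            head = (head + toWrite) % ring.Length;
            count -= toWrite;
        }

        // Read returned 0: whatever is left in the ring is the withheld tail.
        var last = new byte[count];
        int firstPart = Math.Min(count, ring.Length - head);
        Buffer.BlockCopy(ring, head, last, 0, firstPart);
        Buffer.BlockCopy(ring, 0, last, firstPart, count - firstPart);
        return last;
    }
}
```

Because the loop reacts only to the byte count each Read actually returns, short reads from a network stream do not need special handling here.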

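Answer 1 (score: 1)

The following solution is fast and tested. Hope it's useful. It uses the double-buffering idea you had already considered. EDIT: simplified the loop by removing the condition that separated the first iteration from the rest.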

public static void StreamCopy(Stream inStream, Stream outStream) {
     // Define the size of the chunk to copy during each iteration (1 KiB)
     const int blockSize = 1024;
     const int bytesToOmit = 16;

     const int buffSize = blockSize + bytesToOmit;

     // Generate working buffers
     byte[] buffer1 = new byte[buffSize];
     byte[] buffer2 = new byte[buffSize];

     // Initialize first iteration
     byte[] curBuffer = buffer1;
     byte[] prevBuffer = null;

     int bytesRead;

     // Attempt to fully fill the buffer
     bytesRead = ReadFully(inStream, curBuffer, buffSize);
     if( bytesRead == buffSize ) {
        // We successfully retrieved a whole buffer, so we output only
        // [blockSize] bytes and hold back the last 16 bytes of the
        // buffer in case they happen to be the last ones in the stream
        outStream.Write(curBuffer, 0, blockSize);
     } else {
        // We couldn't retrieve the whole buffer
        int bytesToWrite = bytesRead - bytesToOmit;
        if( bytesToWrite > 0 ) {
           outStream.Write(curBuffer, 0, bytesToWrite);
        }
        // There's no more data to process
        return;
     }

     curBuffer = buffer2;
     prevBuffer = buffer1;

     while( true ) {
        // Attempt again to fully fill the buffer
        bytesRead = ReadFully(inStream, curBuffer, buffSize);
        if( bytesRead == buffSize ) {
           // We retrieved the whole buffer, output first the last 16
           // bytes of the previous buffer, and output just [blockSize]
           // bytes from the current buffer
           outStream.Write(prevBuffer, blockSize, bytesToOmit);
           outStream.Write(curBuffer, 0, blockSize);
        } else {
           // We could not retrieve a complete buffer
           if( bytesRead <= bytesToOmit ) {
              // The bytes to output come solely from the previous buffer
              outStream.Write(prevBuffer, blockSize, bytesRead);
           } else {
              // The bytes to output come from the previous buffer and
              // the current buffer
              outStream.Write(prevBuffer, blockSize, bytesToOmit);
              outStream.Write(curBuffer, 0, bytesRead - bytesToOmit);
           }
           break;
        }
        // swap buffers for next iteration
        byte[] swap = prevBuffer;
        prevBuffer = curBuffer;
        curBuffer = swap;
     }
  }

  // Read may return fewer bytes than requested before the end of the stream
  // (typical for network streams), so keep reading until the buffer is full
  // or Read returns 0. A short count from this helper means end of stream.
  static int ReadFully(Stream stream, byte[] buffer, int count) {
     int total = 0;
     while( total < count ) {
        int n = stream.Read(buffer, total, count - total);
        if( n == 0 ) break;
        total += n;
     }
     return total;
  }

static void Assert(Stream inStream, Stream outStream, int bytesToOmit = 16) {
   // Routine that tests the copy worked as expected.
   // Requires seekable streams (e.g. MemoryStream) and System.Diagnostics.
   inStream.Seek(0, SeekOrigin.Begin);
   outStream.Seek(0, SeekOrigin.Begin);
   Debug.Assert(outStream.Length == Math.Max(inStream.Length - bytesToOmit, 0));
   for( long i = 0; i < outStream.Length; i++ ) {
      int byte1 = inStream.ReadByte();
      int byte2 = outStream.ReadByte();
      Debug.Assert(byte1 == byte2);
   }
}

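A solution with simpler code, but slower because it works at the byte level, is to use an intermediate queue between the input and output streams. The process first reads 16 bytes from the input stream and enqueues them. It then iterates over the remaining input bytes: it reads a single byte from the input stream, enqueues it, and then dequeues one byte. The dequeued bytes are written to the output stream until all bytes from the input stream have been processed. The 16 unwanted bytes are left in the intermediate queue.

Hope this helps!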
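A minimal sketch of that queue idea, with hypothetical names (`QueueCopy`, `CopyAllButTail`) that are not from the answer. It folds the initial "read 16 bytes" phase into the main loop: a byte is dequeued and written only once the queue holds more than 16 bytes, which has the same effect.

```csharp
using System.Collections.Generic;
using System.IO;

public static class QueueCopy
{
    // Copies inStream to outStream one byte at a time, keeping the most
    // recent tailSize bytes queued. Returns the withheld tail in stream order.
    public static byte[] CopyAllButTail(Stream inStream, Stream outStream, int tailSize = 16)
    {
        var pending = new Queue<byte>(tailSize + 1);
        int value;
        // ReadByte returns -1 at the end of the stream.
        while ((value = inStream.ReadByte()) != -1)
        {
            pending.Enqueue((byte)value);
            // Once more than tailSize bytes are queued, the oldest one
            // cannot be part of the tail, so it is safe to emit.
            if (pending.Count > tailSize)
                outStream.WriteByte(pending.Dequeue());
        }
        return pending.ToArray();
    }
}
```

On unbuffered streams, the per-byte ReadByte and WriteByte calls are likely to dominate the cost, which matches the answer's own warning that this variant is slower.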

=)

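Answer 2 (score: 0)

Using a circular buffer sounds great, but there is no circular buffer class in .NET, which means extra code is needed anyway. I ended up with the following algorithm, a sort of map and copy, which I think is simple. The variable names are longer than usual so that the code is self-describing here.

The data flows through the buffers as follows: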

[outStream] <== [tailBuf] <== [mainBuf] <== [inStream]

public byte[] CopyStreamExtractLastBytes(Stream inStream, Stream outStream,
                                         int extractByteCount)
{
    //var mainBuf = new byte[1024*4]; // 4K buffer ok for network too
    var mainBuf = new byte[4651]; // nearby prime for testing

    int mainBufValidCount;
    var tailBuf = new byte[extractByteCount];
    int tailBufValidCount = 0;

    while ((mainBufValidCount = inStream.Read(mainBuf, 0, mainBuf.Length)) > 0)
    {
        // Map: how much of what (passthru/tail) lives where (MainBuf/tailBuf)
        // more than tail is passthru
        int totalPassthruCount = Math.Max(0, tailBufValidCount + 
                                    mainBufValidCount - extractByteCount);
        int tailBufPassthruCount = Math.Min(tailBufValidCount, totalPassthruCount);
        int tailBufTailCount = tailBufValidCount - tailBufPassthruCount;
        int mainBufPassthruCount = totalPassthruCount - tailBufPassthruCount;
        int mainBufResidualCount = mainBufValidCount - mainBufPassthruCount;

        // Copy: Passthru must be flushed per FIFO order (tailBuf then mainBuf)
        outStream.Write(tailBuf, 0, tailBufPassthruCount);
        outStream.Write(mainBuf, 0, mainBufPassthruCount);

        // Copy: Now reassemble/compact tail into tailBuf
        var tempResidualBuf = new byte[extractByteCount];
        Array.Copy(tailBuf, tailBufPassthruCount, tempResidualBuf, 0, 
                      tailBufTailCount);
        Array.Copy(mainBuf, mainBufPassthruCount, tempResidualBuf, 
                      tailBufTailCount, mainBufResidualCount);
        tailBufValidCount = tailBufTailCount + mainBufResidualCount;
        tailBuf = tempResidualBuf;
    }
    return tailBuf;
}
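Since the question is about streams whose Read may return fewer bytes than requested, it can help to exercise a copy routine like the one above against short reads. The wrapper below is a hypothetical test helper, not part of the original answer: `TrickleStream` and its `maxChunk` parameter are names invented here. It is read-only and non-seekable, and it caps every Read at maxChunk bytes.

```csharp
using System;
using System.IO;

// Hypothetical test helper: a read-only, non-seekable wrapper that hands
// out at most maxChunk bytes per Read call, imitating a slow network stream.
public sealed class TrickleStream : Stream
{
    private readonly Stream inner;
    private readonly int maxChunk;

    public TrickleStream(Stream inner, int maxChunk)
    {
        if (maxChunk < 1) throw new ArgumentOutOfRangeException(nameof(maxChunk));
        this.inner = inner;
        this.maxChunk = maxChunk;
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;

    // Seeking is unsupported, so these members throw as the Stream contract describes.
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    // Forward the read, but never ask the inner stream for more than maxChunk bytes.
    public override int Read(byte[] buffer, int offset, int count)
        => inner.Read(buffer, offset, Math.Min(count, maxChunk));

    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

For example, calling CopyStreamExtractLastBytes(new TrickleStream(new MemoryStream(data), 7), output, 16) should produce the same output and the same returned tail as passing the MemoryStream directly.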