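This relates to C#. We need to copy an entire source stream into a target stream, except for the last 16 bytes.
Edit: the streams can be up to 40 GB, so a static byte[] allocation (e.g. .ToArray()) is not an option.
Looking at the MSDN documentation, it seems we can reliably detect the end of the stream only when Read returns 0. A return value between 0 and the requested size can mean that bytes are "not currently available" (what does that really mean?).
Currently it copies every single byte, as follows. inStream and outStream are generic: they can be memory, disk, or network streams (and actually a few other kinds too).
public static void StreamCopy(Stream inStream, Stream outStream)
{
    var buffer = new byte[8*1024];
    var last16Bytes = new byte[16];
    int bytesRead;
    while ((bytesRead = inStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        outStream.Write(buffer, 0, bytesRead);
    }
    // Issues:
    // 1. We already wrote the last 16 bytes into
    //    outStream (possibly over the n/w)
    // 2. last16Bytes = ? (inStream may not necessarily support rewinding)
}
What is a reliable way to ensure that everything except the last 16 bytes is copied? I can think of using Position and Length on inStream, but there is a gotcha on MSDN:
If a class derived from Stream does not support seeking, calls to Length, SetLength, Position, and Seek throw a NotSupportedException.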
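For illustration only, here is a minimal sketch of the Length-based idea, guarded by Stream.CanSeek so that it never touches Length or Position on a stream that does not support seeking. The class and method names (SeekableCopy, TryCopyAllButLast) are hypothetical and not part of the original question; the sketch only covers seekable streams and returns false for everything else.

```csharp
using System;
using System.IO;

public static class SeekableCopy
{
    // Hypothetical helper: copies everything except the last `bytesToOmit`
    // bytes, but only when the stream can report its length.
    // Returns false (without copying anything) for non-seekable streams.
    public static bool TryCopyAllButLast(Stream inStream, Stream outStream, int bytesToOmit = 16)
    {
        if (!inStream.CanSeek)
            return false; // Length and Position would throw NotSupportedException

        long remaining = Math.Max(0, inStream.Length - inStream.Position - bytesToOmit);
        var buffer = new byte[8 * 1024];
        while (remaining > 0)
        {
            // Never ask for more than what is left before the omitted tail.
            int wanted = (int)Math.Min(buffer.Length, remaining);
            int n = inStream.Read(buffer, 0, wanted);
            if (n == 0)
                break; // the stream ended earlier than Length indicated
            outStream.Write(buffer, 0, n);
            remaining -= n;
        }
        return true;
    }
}
```

A caller would still need a buffering fallback for network streams and other non-seekable sources whenever this returns false.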
Answer 0 (score: 6)
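1. Read between 1 and n bytes from the input stream. [1]
2. Append the bytes to a circular buffer. [2]
3. Write the first max(0, b - 16) bytes from the circular buffer to the output stream, where b is the number of bytes in the circular buffer.
4. Remove the bytes just written from the circular buffer.
5. Go to step 1.
[1] This is what the Read method does: if you call int n = Read(buffer, 0, 500);, it reads between 1 and 500 bytes into buffer and returns the number of bytes read. If Read returns 0, you have reached the end of the stream.
[2] For best performance, you can read the bytes from the input stream directly into the circular buffer. This is a bit tricky, because you have to handle wraparound in the buffer's underlying array.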
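The steps above can be sketched roughly as follows. This is one possible reading of the answer, not the answerer's own code: the names (TailSkippingCopy, CopyAllButTail, ring, head, count) are made up for illustration, and as a small variation, bytes that can already be passed through are written straight from the read buffer instead of being copied into the ring first. It reads into a separate scratch buffer, so it does not implement the direct-read optimization from footnote [2].

```csharp
using System;
using System.IO;

public static class TailSkippingCopy
{
    // Copies inStream to outStream while holding back the final `tailSize`
    // bytes in a circular buffer. Returns the held-back bytes in stream order
    // (fewer than tailSize if the whole stream is shorter than that).
    public static byte[] CopyAllButTail(Stream inStream, Stream outStream, int tailSize = 16)
    {
        if (tailSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(tailSize));

        var chunk = new byte[8 * 1024]; // scratch buffer for Read
        var ring = new byte[tailSize];  // circular buffer with the newest bytes
        int head = 0;                   // index of the oldest byte in ring
        int count = 0;                  // number of valid bytes in ring

        int n;
        while ((n = inStream.Read(chunk, 0, chunk.Length)) > 0) // 0 means end of stream
        {
            // Everything beyond the newest tailSize bytes is safe to write.
            int toWrite = Math.Max(0, count + n - tailSize);

            // Oldest bytes live in the ring; write them first (may wrap around).
            int fromRing = Math.Min(count, toWrite);
            int first = Math.Min(fromRing, ring.Length - head);
            outStream.Write(ring, head, first);
            outStream.Write(ring, 0, fromRing - first);
            head = (head + fromRing) % ring.Length;
            count -= fromRing;

            // Then the leading part of the freshly read chunk.
            int fromChunk = toWrite - fromRing;
            outStream.Write(chunk, 0, fromChunk);

            // Keep the rest of the chunk in the ring as the new tail candidate.
            for (int i = fromChunk; i < n; i++)
            {
                ring[(head + count) % ring.Length] = chunk[i];
                count++;
            }
        }

        var tail = new byte[count];
        for (int i = 0; i < count; i++)
            tail[i] = ring[(head + i) % ring.Length];
        return tail;
    }
}
```

Because the loop only relies on Read returning 0 at the end of the stream, short reads from network streams do not affect correctness; they only change how the bytes are split between the ring and the scratch buffer.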
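Answer 1 (score: 1)
The following solution is fast and tested. I hope it is useful. It uses the double-buffering idea you had already considered. Edit: simplified the loop by removing the condition that separated the first iteration from the rest.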
public static void StreamCopy(Stream inStream, Stream outStream) {
    // Define the size of the chunk to copy during each iteration (1 KiB)
    const int blockSize = 1024;
    const int bytesToOmit = 16;
    const int buffSize = blockSize + bytesToOmit;
    // Generate working buffers
    byte[] buffer1 = new byte[buffSize];
    byte[] buffer2 = new byte[buffSize];
    // Initialize first iteration
    byte[] curBuffer = buffer1;
    byte[] prevBuffer = null;
    int bytesRead;
    // Attempt to fully fill the buffer
    bytesRead = FillBuffer(inStream, curBuffer, buffSize);
    if( bytesRead == buffSize ) {
        // We successfully retrieved a whole buffer. Output only
        // [blockSize] bytes and hold back the last 16 bytes of the
        // buffer, in case they happen to be the last ones in the stream
        outStream.Write(curBuffer, 0, blockSize);
    } else {
        // We couldn't retrieve the whole buffer, so the stream has ended
        int bytesToWrite = bytesRead - bytesToOmit;
        if( bytesToWrite > 0 ) {
            outStream.Write(curBuffer, 0, bytesToWrite);
        }
        // There's no more data to process
        return;
    }
    curBuffer = buffer2;
    prevBuffer = buffer1;
    while( true ) {
        // Attempt again to fully fill the buffer
        bytesRead = FillBuffer(inStream, curBuffer, buffSize);
        if( bytesRead == buffSize ) {
            // We retrieved the whole buffer: output first the last 16
            // bytes of the previous buffer, then just [blockSize]
            // bytes from the current buffer
            outStream.Write(prevBuffer, blockSize, bytesToOmit);
            outStream.Write(curBuffer, 0, blockSize);
        } else {
            // We could not retrieve a complete buffer (end of stream)
            if( bytesRead <= bytesToOmit ) {
                // The bytes to output come solely from the previous buffer
                outStream.Write(prevBuffer, blockSize, bytesRead);
            } else {
                // The bytes to output come from the previous buffer and
                // the current buffer
                outStream.Write(prevBuffer, blockSize, bytesToOmit);
                outStream.Write(curBuffer, 0, bytesRead - bytesToOmit);
            }
            break;
        }
        // Swap buffers for next iteration
        byte[] swap = prevBuffer;
        prevBuffer = curBuffer;
        curBuffer = swap;
    }
}
// Stream.Read may return fewer bytes than requested before the end of the
// stream (e.g. on network streams), so keep reading until the buffer is
// full or Read returns 0. Returns the number of bytes actually read.
static int FillBuffer(Stream stream, byte[] buffer, int count) {
    int total = 0;
    int n;
    while( total < count && (n = stream.Read(buffer, total, count - total)) > 0 ) {
        total += n;
    }
    return total;
}
static void Assert(Stream inStream, Stream outStream) {
    // Routine that tests the copy worked as expected.
    // Both streams must be seekable (e.g. MemoryStream or FileStream).
    const int bytesToOmit = 16; // must match the value used in StreamCopy
    inStream.Seek(0, SeekOrigin.Begin);
    outStream.Seek(0, SeekOrigin.Begin);
    Debug.Assert(outStream.Length == Math.Max(inStream.Length - bytesToOmit, 0));
    for( int i = 0; i < outStream.Length; i++ ) {
        int byte1 = inStream.ReadByte();
        int byte2 = outStream.ReadByte();
        Debug.Assert(byte1 == byte2);
    }
}
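A solution that is simpler in terms of code, but slower because it works at the byte level, is to use an intermediate queue between the input stream and the output stream. The process first reads 16 bytes from the input stream and enqueues them. It then iterates over the remaining input: read a single byte from the input stream, enqueue it, then dequeue one byte. Each dequeued byte is written to the output stream, until all bytes from the input stream have been processed. The 16 unwanted bytes are what remains in the intermediate queue.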
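A minimal sketch of this queue idea might look like the following. The names (QueueCopy, CopyAllButLast, pending) are hypothetical, and the separate priming step that first enqueues 16 bytes is folded into a single loop by dequeuing only once the queue holds more than 16 bytes, which should be equivalent.

```csharp
using System.Collections.Generic;
using System.IO;

public static class QueueCopy
{
    // Byte-at-a-time copy that keeps the newest `bytesToOmit` bytes in a
    // queue, so they are never written to outStream.
    public static void CopyAllButLast(Stream inStream, Stream outStream, int bytesToOmit = 16)
    {
        var pending = new Queue<byte>(bytesToOmit + 1);
        int value;
        while ((value = inStream.ReadByte()) != -1) // ReadByte returns -1 at end of stream
        {
            pending.Enqueue((byte)value);
            // Once more than bytesToOmit bytes are queued, the oldest one
            // cannot be part of the tail, so it is safe to write.
            if (pending.Count > bytesToOmit)
                outStream.WriteByte(pending.Dequeue());
        }
        // pending now holds the last bytesToOmit bytes (or the whole input
        // if the stream was shorter than that).
    }
}
```

Wrapping the input and output in a BufferedStream would likely reduce the per-byte call overhead, although the block-based approach above should still be faster for very large streams.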
Hope this helps!
=)
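Answer 2 (score: 0)
Using a circular buffer sounds great, but .NET has no circular buffer class, which means extra code is needed. I ended up with the following algorithm, a kind of map and copy, which I think is simple. The variable names are longer than usual so that the code is self-describing here.
The data flows through the buffers like this: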
[outStream] <== [tailBuf] <== [mainBuf] <== [inStream]
public byte[] CopyStreamExtractLastBytes(Stream inStream, Stream outStream,
                                         int extractByteCount)
{
    //var mainBuf = new byte[1024*4]; // 4K buffer ok for network too
    var mainBuf = new byte[4651]; // nearby prime for testing
    int mainBufValidCount;
    var tailBuf = new byte[extractByteCount];
    int tailBufValidCount = 0;
    while ((mainBufValidCount = inStream.Read(mainBuf, 0, mainBuf.Length)) > 0)
    {
        // Map: how much of what (passthru/tail) lives where (mainBuf/tailBuf)
        // more than tail is passthru
        int totalPassthruCount = Math.Max(0, tailBufValidCount +
                                             mainBufValidCount - extractByteCount);
        int tailBufPassthruCount = Math.Min(tailBufValidCount, totalPassthruCount);
        int tailBufTailCount = tailBufValidCount - tailBufPassthruCount;
        int mainBufPassthruCount = totalPassthruCount - tailBufPassthruCount;
        int mainBufResidualCount = mainBufValidCount - mainBufPassthruCount;
        // Copy: Passthru must be flushed per FIFO order (tailBuf then mainBuf)
        outStream.Write(tailBuf, 0, tailBufPassthruCount);
        outStream.Write(mainBuf, 0, mainBufPassthruCount);
        // Copy: Now reassemble/compact tail into tailBuf
        var tempResidualBuf = new byte[extractByteCount];
        Array.Copy(tailBuf, tailBufPassthruCount, tempResidualBuf, 0,
                   tailBufTailCount);
        Array.Copy(mainBuf, mainBufPassthruCount, tempResidualBuf,
                   tailBufTailCount, mainBufResidualCount);
        tailBufValidCount = tailBufTailCount + mainBufResidualCount;
        tailBuf = tempResidualBuf;
    }
    return tailBuf;
}