I'm processing text files that are 1-2 GB in size. I can't use a conventional StreamReader line-by-line approach, so I decided to read the file in chunks and do my work that way. The problem is that I'm not sure when I've reached the end of the file, since the program has been running on one file for a very long time, and I'm also not sure how large a buffer I can read into. Here is the code:
Dim Buffer_Size As Integer = 30000
Dim bufferread(Buffer_Size - 1) As Char
Dim bytesread As Integer = 0
Dim totalbytesread As Long = 0
Dim sb As New StringBuilder()
Do
    bytesread = inputfile.Read(bufferread, 0, Buffer_Size)
    sb.Append(bufferread, 0, bytesread) ' append only the chars actually read
    totalbytesread += bytesread
    If sb.Length > 9999999 Then
        data = sb.ToString()
        If data IsNot Nothing Then
            parsingtools.load(data)
        End If
        sb.Clear() ' otherwise the same text is parsed again on the next pass
    End If
    If totalbytesread > 1000000000 Then
        logs.constructlog("File almost done")
    End If
Loop Until inputfile.EndOfStream
Is there any control or code I can use to check how much of the file remains?
Answer 0 (score: 1)
Have you looked at BufferedStream?
http://msdn.microsoft.com/en-us/library/system.io.bufferedstream%28v=VS.100%29.aspx
You can wrap your stream with it. Also, I would make the buffer size several megabytes, not something as small as 30,000.
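As a minimal sketch of that suggestion (the 4 MB buffer size is just a placeholder, and a MemoryStream stands in for the large file so the example is self-contained):

```csharp
using System;
using System.IO;
using System.Text;

class BufferedWrapDemo
{
    static void Main()
    {
        // Hypothetical 4 MB internal buffer; pick what fits your memory budget.
        const int BufferSizeBytes = 4 * 1024 * 1024;

        // A MemoryStream stands in for the large file here.
        var source = new MemoryStream(Encoding.UTF8.GetBytes("hello world"));

        // BufferedStream batches the underlying reads; StreamReader then
        // decodes characters from it as usual.
        using (var buffered = new BufferedStream(source, BufferSizeBytes))
        using (var reader = new StreamReader(buffered))
        {
            Console.WriteLine(reader.ReadToEnd()); // prints "hello world"
        }
    }
}
```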
How much is left? Can you just ask the stream for its Length?
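A sketch of that idea, assuming a seekable stream: `Length - Position` gives the bytes remaining after each read, and `Read` returning 0 already signals end of stream, so no separate end-of-stream check is needed (the helper name is made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.IO;

class RemainingDemo
{
    // Reads the stream in chunks and records how many bytes remain
    // after each Read; Read returning 0 signals end of stream.
    public static List<long> ReadAndTrackRemaining(Stream stream, int chunkSize)
    {
        var remainingAfterEachRead = new List<long>();
        var buffer = new byte[chunkSize];
        while (stream.Read(buffer, 0, buffer.Length) != 0)
        {
            // A seekable stream can report exactly how much is left.
            remainingAfterEachRead.Add(stream.Length - stream.Position);
        }
        return remainingAfterEachRead;
    }

    static void Main()
    {
        // 1000 bytes read in 300-byte chunks: 700, 400, 100, 0 remain.
        using (var ms = new MemoryStream(new byte[1000]))
        {
            foreach (long remaining in ReadAndTrackRemaining(ms, 300))
                Console.WriteLine($"{remaining} bytes left");
        }
    }
}
```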
Below is a code snippet I use to wrap a stream in a BufferedStream. (Sorry, it's C#.)
private static void CopyTo(AzureBlobStore azureBlobStore, Stream src, Stream dest, string description)
{
    if (src == null)
        throw new ArgumentNullException("src");
    if (dest == null)
        throw new ArgumentNullException("dest");

    const int bufferSize = AzureBlobStore.BufferSizeForStreamTransfers;
    // Buffering happens internally; this is just to avoid the 4 GB boundary
    // and to have something to show for progress.
    int readCount;
    //long bytesTransfered = 0;
    var buffer = new byte[bufferSize];
    //string totalBytes = FormatBytes(src.Length);
    while ((readCount = src.Read(buffer, 0, buffer.Length)) != 0)
    {
        if (azureBlobStore.CancelProcessing)
        {
            break;
        }
        dest.Write(buffer, 0, readCount);
        //bytesTransfered += readCount;
        //Console.WriteLine("AzureBlobStore:CopyTo:{0}:{1} {2}", FormatBytes(bytesTransfered), totalBytes, description);
    }
}
Hope this helps.