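My task is to process a 3.2 GB fixed-width delimited text file. Each line is 1563 characters long, and the file contains about 2.1 million lines. After reading roughly 1 million lines, my program crashes with an out-of-memory exception.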
    Imports System.IO
    Imports Microsoft.VisualBasic.FileIO

    Module TestFileCount
        ''' <summary>
        ''' Gets the total number of lines in a text file by reading a line at a time
        ''' </summary>
        ''' <remarks>Crashes when count reaches 1018890</remarks>
        Sub Main()
            Dim inputfile As String = "C:\Split\BIGFILE.txt"
            Dim count As Int32 = 0
            Dim lineoftext As String = ""
            If File.Exists(inputfile) Then
                Dim _read As New StreamReader(inputfile)
                Try
                    While (_read.Peek <> -1)
                        lineoftext = _read.ReadLine()
                        count += 1
                    End While
                    Console.WriteLine("Total Lines in " & inputfile & ": " & count)
                Catch ex As Exception
                    Console.WriteLine(ex.Message)
                Finally
                    _read.Close()
                End Try
            End If
        End Sub
    End Module
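This is a very simple program that reads the text file one line at a time, so I would not expect it to hold much in memory.
For the life of me, I cannot figure out why it crashes. Does anyone here have any ideas?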
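Answer 0 (score: 1)
I don't know whether this will solve your problem, but don't use Peek. Change your loop to the following (this is C#, but you should be able to convert it to VB):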
    while (_read.ReadLine() != null)
    {
        count += 1;
    }
If you need to use the line of text inside the loop rather than just count lines, change the code to:
    while ((lineoftext = _read.ReadLine()) != null)
    {
        count += 1;
        //Do something with lineoftext
    }
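Since the question is in VB, here is a rough VB sketch of the same loop. VB has no assignment-inside-condition expression, so the line is read once before the loop and again at the end of each iteration. The module name `LineCounter` and the function `CountLines` are made up for illustration, and `Long` is used for the counter as a precaution only.

```vbnet
Imports System
Imports System.IO

Module LineCounter
    ' Hypothetical helper: counts lines by calling ReadLine until it
    ' returns Nothing, which signals the end of the stream.
    Function CountLines(path As String) As Long
        Dim count As Long = 0
        ' Using closes the reader even if an exception is thrown.
        Using reader As New StreamReader(path)
            Dim lineoftext As String = reader.ReadLine()
            While lineoftext IsNot Nothing
                count += 1
                ' Do something with lineoftext here if needed
                lineoftext = reader.ReadLine()
            End While
        End Using
        Return count
    End Function

    Sub Main()
        Dim inputfile As String = "C:\Split\BIGFILE.txt"
        Console.WriteLine("Total Lines in " & inputfile & ": " & CountLines(inputfile))
    End Sub
End Module
```

Like the C# version, this only removes the Peek call; whether it also makes the out-of-memory error go away is not established here.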
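Slightly off-topic, and a bit of a cheat: if every line really is 1563 characters long (including the line ending) and the file is pure ASCII (so every character takes one byte), you can do this (again C#, but you should be able to translate it):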
    long bytesPerLine = 1563;
    string inputfile = @"C:\Split\BIGFILE.txt"; //The @ symbol is so we don't have to escape the `\`
    long length;
    using (FileStream stream = File.Open(inputfile, FileMode.Open)) //This is the C# equivalent of the try/finally to close the stream when done.
    {
        length = stream.Length;
    }
    Console.WriteLine("Total Lines in {0}: {1}", inputfile, (length / bytesPerLine));
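A possible VB translation of the size-based shortcut, as a sketch only. It uses `FileInfo.Length` instead of opening a stream, which is a substitution on my part rather than what the C# above does, and the names `FixedWidthLineCount` and `CountFixedWidthLines` are hypothetical. It relies on the same assumptions as the C#: pure ASCII, and exactly 1563 bytes per line with the line ending included.

```vbnet
Imports System
Imports System.IO

Module FixedWidthLineCount
    ' Hypothetical helper: derives the line count from the file size alone.
    ' Assumes every line occupies exactly bytesPerLine bytes, line ending included.
    Function CountFixedWidthLines(fileLength As Long, bytesPerLine As Long) As Long
        Return fileLength \ bytesPerLine ' the \ operator is integer division in VB
    End Function

    Sub Main()
        Dim inputfile As String = "C:\Split\BIGFILE.txt"
        ' FileInfo.Length reports the size in bytes without reading the file.
        Dim length As Long = New FileInfo(inputfile).Length
        Console.WriteLine("Total Lines in {0}: {1}", inputfile, CountFixedWidthLines(length, 1563))
    End Sub
End Module
```

If `length Mod 1563` is not zero, the fixed-width assumption does not hold exactly (for example, the last line has no line ending, or some line has a different length), and the result should be treated as an estimate.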
Answer 1 (score: 0)
Try using ReadAsync, or you can use DiscardBufferedData (but that is slow).
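    ' Requires Imports System.IO, and must run inside an Async method because of Await.
    Dim inputfile As String = "C:\Example\existingfile.txt"
    ' Read in fixed-size chunks: ReadAsync fills a Char array, and a multi-gigabyte
    ' file cannot be held in a single array.
    Dim buffer(65535) As Char
    Dim count As Long = 0
    Try
        Using reader As StreamReader = File.OpenText(inputfile)
            Dim charsRead As Integer = Await reader.ReadAsync(buffer, 0, buffer.Length)
            While charsRead > 0
                ' Count the line feeds in the chunk that was just read.
                For i As Integer = 0 To charsRead - 1
                    If buffer(i) = ControlChars.Lf Then count += 1
                Next
                charsRead = Await reader.ReadAsync(buffer, 0, buffer.Length)
            End While
        End Using
        ' A final line with no trailing line feed is not counted.
        Console.WriteLine("Total Lines in " & inputfile & ": " & count)
    Catch ex As Exception
        Console.WriteLine(ex.Message)
    End Try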