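I'm trying to speed up the processing of very large text files (around 100 MB). I've been careful about how often I call ReDim Preserve, yet the function still takes about 5 minutes to run. The text file is essentially a series of sub-reports that I'm trying to parse out, and I only have access to the large file. What gives? Is VBA just that slow? Here is the code; the "Report" object is a class I created. Most reports are only a few hundred lines long, which is why I chose 1000 for the UBound: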
Public Function GetPages(originalFilePath As String) As Collection
    Dim myReport As report
    Dim reportPageCollection As Collection
    Dim startLine As Long
    Dim endLine As Long
    Dim fso As FileSystemObject
    Dim file As textStream
    Dim lineStr As String
    Dim index As Long
    Dim lines() As String
    Set fso = New FileSystemObject
    Set reportPageCollection = New Collection 'initialize the collection
    Set file = fso.OpenTextFile(originalFilePath, ForReading)
    ReDim lines(0 To 1000)
    lineStr = file.ReadLine 'skip the first line so the loop doesnt add a blank report
    lines(0) = lineStr
    index = 1
    Do Until file.AtEndOfLine 'loop through from the startline to find the end line
        lineStr = file.ReadLine
        If lineStr Like "1JOBNAME:*" Then 'next report, so we want to return an array of the single line
            'load this page into our report page collection for further processing
            Set myReport = New report
            myReport.setDataLines = lines() 'Fill in 'ReportPage' Array
            reportPageCollection.Add myReport 'add our report to the collection
            'set up array for new report
            ReDim lines(0 To 1000)
            index = 0
            lines(index) = lineStr
            index = index + 1
        Else
            '============================ store into array
            If index = UBound(lines) Then
                ReDim Preserve lines(0 To UBound(lines) + 1000)
                lines(index) = lineStr
                index = index + 1
            Else
                lines(index) = lineStr
                index = index + 1
            End If
            '============================
        End If
    Loop
    file.Close
    Set fso = Nothing
    Set GetPages = reportPageCollection
End Function
Any help is appreciated. Thanks!
Answer 0 (score: 4)
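I just grabbed a 73 MB, 1.2 million line text file from my C:\ drive. Reading the whole thing line by line in Excel VBA took 6 seconds (doing nothing but reading). So the speed problem is evidently not related to file IO.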
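A minimal sketch of that kind of read-only timing test, in case you want to repeat it on your own file. The procedure name TimeRawRead is hypothetical, and the code assumes a reference to Microsoft Scripting Runtime is set (needed for New FileSystemObject and ForReading):

```vba
Option Explicit

' Hypothetical helper: reads every line, does nothing with it, and reports the time.
' Assumes a reference to "Microsoft Scripting Runtime" is set in the VBA project.
Public Sub TimeRawRead(ByVal filePath As String)
    Dim startTime As Single
    Dim lineCount As Long
    Dim lineStr As String

    startTime = Timer ' seconds since midnight
    With New FileSystemObject
        With .OpenTextFile(filePath, ForReading)
            Do Until .AtEndOfStream
                lineStr = .ReadLine
                lineCount = lineCount + 1
            Loop
            .Close
        End With
    End With

    Debug.Print lineCount & " lines in " & Format$(Timer - startTime, "0.00") & " s"
End Sub
```

If this runs in seconds on your file, the time is being spent in the processing rather than in the reading.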
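A few observations:
Do Until file.AtEndOfLine stops almost immediately: as soon as you have read a line, you are at the end of a line. I think you want Do Until file.AtEndOfStream.
If the array handling moves into the report class, your code might shrink to something like this: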
Public Function GetPages(originalFilePath As String) As Collection
    Dim myReport As report
    Dim lineStr As String
    Set GetPages = New Collection ' initialize the collection
    With New FileSystemObject ' no need to store an object
        With .OpenTextFile(originalFilePath, ForReading) ' ditto
            Set myReport = New report
            myReport.AddLine .ReadLine
            Do Until .AtEndOfStream
                lineStr = .ReadLine ' read through the With block's TextStream
                If lineStr Like "1JOBNAME:*" Then
                    GetPages.Add myReport
                    Set myReport = New report
                End If
                myReport.AddLine lineStr ' all the array business happens here - much tidier
            Loop
            GetPages.Add myReport ' the last report has no following header to flush it
        End With ' TextStream goes out of scope & closes
    End With ' FileSystemObject goes out of scope, disappears
End Function
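The AddLine method is not shown here, so the following is only a sketch of what it could look like inside the report class. The private members mLines and mCount and the accessors LineCount and LineAt are hypothetical names. Doubling the buffer instead of growing it by a fixed 1000 is an assumption, chosen so that long reports trigger fewer ReDim Preserve copies:

```vba
' Class module: report (hypothetical sketch, not the original class)
Option Explicit

Private mLines() As String ' backing buffer, usually larger than the data
Private mCount As Long     ' number of lines actually stored

Private Sub Class_Initialize()
    ReDim mLines(0 To 999) ' most reports are a few hundred lines
    mCount = 0
End Sub

Public Sub AddLine(ByVal lineStr As String)
    ' Grow only when the buffer is full; doubling keeps the copies rare.
    If mCount > UBound(mLines) Then
        ReDim Preserve mLines(0 To (UBound(mLines) + 1) * 2 - 1)
    End If
    mLines(mCount) = lineStr
    mCount = mCount + 1
End Sub

Public Property Get LineCount() As Long
    LineCount = mCount
End Property

Public Property Get LineAt(ByVal index As Long) As String
    LineAt = mLines(index)
End Property
```

Callers should loop from 0 to LineCount - 1 rather than to the UBound of the buffer, since the tail of the buffer holds unused empty strings.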
Does that help at all?
Answer 1 (score: 0)
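There are a few tweaks you could make; FSO objects are known to be slower than VB's native IO. But I don't see anything egregious here. Before we get into micro-optimizations, let me ask a more basic question: are these files by any chance located on a shared drive or an FTP site? If so, consider copying them to a temp folder before processing them.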
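Both suggestions can be sketched roughly as follows. The helper names CopyToTemp and CountLinesNative are hypothetical. The sketch assumes the source is reachable as an ordinary file path (a mapped or UNC share; FileCopy does not handle FTP URLs) and that the TEMP environment variable is set:

```vba
Option Explicit

' Hypothetical helper: copy the file to the local temp folder and return the local path.
Public Function CopyToTemp(ByVal sourcePath As String) As String
    Dim localPath As String
    ' Keep the original file name, taken from everything after the last backslash.
    localPath = Environ$("TEMP") & "\" & Mid$(sourcePath, InStrRev(sourcePath, "\") + 1)
    FileCopy sourcePath, localPath
    CopyToTemp = localPath
End Function

' Hypothetical helper: read the file with native Line Input instead of FSO.
Public Function CountLinesNative(ByVal filePath As String) As Long
    Dim fileNum As Integer
    Dim lineStr As String

    fileNum = FreeFile
    Open filePath For Input As #fileNum
    Do Until EOF(fileNum)
        Line Input #fileNum, lineStr
        CountLinesNative = CountLinesNative + 1
    Loop
    Close #fileNum
End Function
```

Calling CountLinesNative(CopyToTemp(originalFilePath)) and timing it against the FSO loop would show whether either change matters for your files.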
Answer 2 (score: -4)
"Is VBA just that slow?"
Yes. Try XLW, a C++ wrapper for Excel.