VBA: Read a large text file and remove unwanted lines

Date: 2017-07-27 19:38:42

Tags: vba file text binary access

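I am trying to clean up a large text file (repeated headers/trailers) so it can be imported into an Access table. This follows on from my earlier post: Read Number of lines in Large Text File VB6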

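What I want to do is switch from:

Open s_inp_file For Input As inp_file_num

to:

Open s_inp_file For Binary Access Read As #inp_file_num

to make the process faster. The Input method takes too long. The problem is that once I open the file as Binary, I can't use Line Input in my code: I get run-time error 62, "Input past end of file", inside the loop. Thanks in advance.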

Code:

Sub Scrub_TextLines(s_inp_file As String, s_out_file As String)
    Dim inp_file_num As Integer
    Dim out_file_num As Integer
    Dim text_line As String
    Dim file_content As String
    Dim buffer() As Byte
    Dim i As Long

    Const remove_text1 As String = "REPORTING SERVICE"
'    Const remove_text2 As String = "PRODUCT TYPE -"
'    Const remove_text3 As String = "ALL REPORT"
    Const remove_text4 As String = "CLIENT NAME"
    Const remove_text5 As String = "CLIENT ID"
    Const remove_text6 As String = "TY YEAR"
    Const remove_text7 As String = "CLIENT-ID2"
    Const remove_text8 As String = "----"

    inp_file_num = FreeFile
    Open s_inp_file For Binary Access Read As #inp_file_num
    ReDim buffer(LOF(inp_file_num) - 1)
    Get #inp_file_num, , buffer
        Do Until EOF(inp_file_num)
            Line Input #inp_file_num, text_line
            If text_line <> "" And _
                Left(text_line, 4) <> "    " And _
                    InStr(1, text_line, remove_text1, vbTextCompare) = 0 And _
                    InStr(1, text_line, remove_text4, vbTextCompare) = 0 And _
                    InStr(1, text_line, remove_text5, vbTextCompare) = 0 And _
                    InStr(1, text_line, remove_text6, vbTextCompare) = 0 And _
                    InStr(1, text_line, remove_text7, vbTextCompare) = 0 And _
                    InStr(1, text_line, remove_text8, vbTextCompare) = 0 Then

                file_content = file_content & text_line & vbCrLf
'                Debug.Print file_content
            End If
        Loop
    Close #inp_file_num

    out_file_num = FreeFile
    Open s_out_file For Output As out_file_num
        Print #out_file_num, file_content
    Close #out_file_num
End Sub

2 Answers:

Answer 0 (score: 0)

Try changing:

Do Until EOF(inp_file_num)

To:

Do While Not EOF(inp_file_num)

Answer 1 (score: -1)

Remove the following line:

Get #inp_file_num, , buffer

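Once Get has read the entire file contents into the buffer, the file pointer is at the end of the file, so Line Input has no further lines to read.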
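
If the goal is to keep the single fast Get read rather than drop it, one possible approach is to filter the in-memory buffer instead of calling Line Input afterwards. The following is only a sketch under stated assumptions, not a tested drop-in replacement: it assumes the file is ANSI-encoded, uses CRLF line endings, and fits in memory. The names Keep_Line, Filter_Text and Scrub_TextLines_Binary are hypothetical; the marker strings are copied from the question.

```vba
Option Explicit

' Hypothetical helper: returns True when a line should be kept.
' Marker strings are taken from the constants in the question.
Private Function Keep_Line(ByVal text_line As String) As Boolean
    Dim markers As Variant
    Dim marker As Variant

    ' Drop empty lines and lines that start with four spaces.
    If text_line = "" Or Left$(text_line, 4) = "    " Then Exit Function

    markers = Array("REPORTING SERVICE", "CLIENT NAME", "CLIENT ID", _
                    "TY YEAR", "CLIENT-ID2", "----")
    For Each marker In markers
        ' Case-insensitive search, as in the original condition.
        If InStr(1, text_line, marker, vbTextCompare) > 0 Then Exit Function
    Next marker

    Keep_Line = True
End Function

' Hypothetical helper: filters CRLF-separated text that is already in memory.
Public Function Filter_Text(ByVal file_text As String) As String
    Dim all_lines() As String
    Dim kept() As String
    Dim i As Long
    Dim n As Long

    If Len(file_text) = 0 Then Exit Function

    all_lines = Split(file_text, vbCrLf)
    ReDim kept(0 To UBound(all_lines))
    For i = 0 To UBound(all_lines)
        If Keep_Line(all_lines(i)) Then
            kept(n) = all_lines(i)
            n = n + 1
        End If
    Next i

    If n = 0 Then Exit Function
    ReDim Preserve kept(0 To n - 1)
    ' Join once instead of concatenating inside the loop.
    Filter_Text = Join(kept, vbCrLf) & vbCrLf
End Function

Public Sub Scrub_TextLines_Binary(s_inp_file As String, s_out_file As String)
    Dim file_num As Integer
    Dim buffer() As Byte
    Dim file_len As Long

    file_num = FreeFile
    Open s_inp_file For Binary Access Read As #file_num
    file_len = LOF(file_num)
    If file_len > 0 Then
        ReDim buffer(0 To file_len - 1)
        Get #file_num, , buffer   ' one read; the pointer is now at end of file
    End If
    Close #file_num

    file_num = FreeFile
    Open s_out_file For Output As #file_num
    If file_len > 0 Then
        ' Assumption: the bytes are ANSI text, so StrConv can widen them.
        ' The trailing semicolon stops Print from adding an extra line break.
        Print #file_num, Filter_Text(StrConv(buffer, vbUnicode));
    End If
    Close #file_num
End Sub
```

Collecting the kept lines in an array and calling Join once also avoids the repeated file_content = file_content & ... concatenation, which tends to get slow on large files. If the file uses LF-only line endings or a non-ANSI encoding, the Split delimiter and the StrConv call would need to change.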