将非结构化文本文件解析并导入Microsoft Access(文件具有潜在的分隔符)

时间:2013-10-08 16:12:16

标签: vba ms-access vbscript access-vba text-parsing

我有一堆文本文件需要导入MS Access(数千个) - 可以使用2007或2010.文本文件的类别在方括号中标识,并在类别之间包含相关数据 - 例如:

  

[位置] Tenessee [位置] [模型] 042200 [模型] [PARTNO] 113342A69447B6 [PARTNO]。

我需要捕获它们之间的类别和数据,并将它们导入Access - 一个表的类别,另一个表的数据。单个文件中有数百个这样的类别,文本文件没有结构 - 它们都像上面的例子一样运行。括号中的类别是唯一明确的分隔符。

通过对网络的研究,我提出了一个VBS脚本(我没有锁定VBS,愿意使用VBA或其他方法),但是当我运行它时,我得到一个没有任何显示的VBS信息窗口在里面。任何建议或指导都将非常感激(我不倾向于使用VBS和VBA),我感谢你。

剧本:

Const ForReading = 1

    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFile = objFSO.OpenTextFile("C:\Users\testGuy\Documents\dmc_db_test\DMC-TEST-A-00-00-00-00A-022A-D_000 - Copy01.txt", ForReading)

    strContents = objFile.ReadAll
    objFile.Close

    Set objRegEx = CreateObject("VBScript.RegExp")

    objRegEx.Global = True   
    objRegEx.Pattern = "\[.{0,}\]"

    Set colMatches = objRegEx.Execute(strContents)  

    If colMatches.Count > 0 Then
       For Each strMatch in colMatches   
           strMatches = strMatches & strMatch.Value 
       Next
    End If

    strMatches = Replace(strMatches, "]", vbCrlf)
    strMatches = Replace(strMatches, "[", "")

    Wscript.Echo strMatches

1 个答案:

答案 0 :(得分:4)

正则表达式是很棒的东西,但在你的情况下,看起来它们可能有点矫枉过正。以下代码使用普通旧InStr()查找[Tags]并将文件解析为单个CSV文件。也就是说,对于输入文件

testfile1.txt:

[Location]Tennessee[Location][Model]042200[Model][PartNo]113342A69447B6[PartNo]
[Location]Mississippi[Location][Model]042200[Model][SerialNo]3212333222355[SerialNo]

和testfile2.txt:

[Location]Missouri[Location][Model]042200[Model][PartNo]AAABBBCCC111222333[PartNo]

...代码将写入以下输出文件......

"FileName","LineNumber","ItemNumber","FieldName","FieldValue"
"testfile1.txt",1,1,"Location","Tennessee"
"testfile1.txt",1,2,"Model","042200"
"testfile1.txt",1,3,"PartNo","113342A69447B6"
"testfile1.txt",2,1,"Location","Mississippi"
"testfile1.txt",2,2,"Model","042200"
"testfile1.txt",2,3,"SerialNo","3212333222355"
"testfile2.txt",1,1,"Location","Missouri"
"testfile2.txt",1,2,"Model","042200"
"testfile2.txt",1,3,"PartNo","AAABBBCCC111222333"

...然后您可以将其导入Access(或其他)并从那里继续。这是VBA代码,但可以很容易地进行调整以作为VBScript运行。

Sub ParseSomeFiles()
Const InFolder = "C:\__tmp\parse\in\"
Const OutFile = "C:\__tmp\parse\out.csv"
Dim fso As FileSystemObject, f As File, tsIn As TextStream, tsOut As TextStream
Dim s As String, Lines As Long, Items As Long, i As Long
Set fso = New FileSystemObject
Set tsOut = fso.CreateTextFile(OutFile, True)
tsOut.WriteLine """FileName"",""LineNumber"",""ItemNumber"",""FieldName"",""FieldValue"""
For Each f In fso.GetFolder(InFolder).Files
    Debug.Print "Parsing """ & f.Name & """..."
    Set tsIn = f.OpenAsTextStream(ForReading)
    Lines = 0
    Do While Not tsIn.AtEndOfStream
        s = Trim(tsIn.ReadLine)
        Lines = Lines + 1
        Items = 0
        Do While Len(s) > 0
            Items = Items + 1
            tsOut.Write """" & f.Name & """," & Lines & "," & Items
            i = InStr(1, s, "]", vbBinaryCompare)
            ' write out FieldName
            tsOut.Write ",""" & Replace(Mid(s, 2, i - 2), """", """""", 1, -1, vbBinaryCompare) & """"
            s = Mid(s, i + 1)
            i = InStr(1, s, "[", vbBinaryCompare)
            ' write out FieldValue
            tsOut.Write ",""" & Replace(Mid(s, 1, i - 1), """", """""", 1, -1, vbBinaryCompare) & """"
            s = Mid(s, i)
            i = InStr(1, s, "]", vbBinaryCompare)
            ' (no need to write out ending FieldName tag)
            s = Mid(s, i + 1)
            tsOut.WriteLine
        Loop
    Loop
    tsIn.Close
    Set tsIn = Nothing
Next
Set f = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Debug.Print "Done."
End Sub