Parsing a text file where records span multiple lines

Time: 2019-06-03 20:41:55

Tags: vb.net

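I am trying to parse a text file in which a single record can span up to 4 lines. In the text below, I have found that a line starting with 16,145 marks the beginning of a record, and the lines starting with 88 are continuation lines. The goal is to get the data into a table where "Comp Name", "Cust Name", "Desc" and so on are the field names.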

  

16,145,531299,S,531299,000,000,,36358906393192/
88,ACH CREDIT RECEIVED - Cust ID: AP0042168896 Desc: COMM OF ND Comp Name:
88,COMM OF ND Comp ID: ND TR DPT SEC: CCD Cust Name: Dakota Central Date:
88,11-28-18 Time: 05:52 AM Addenda: 705PA529-Kerry Adam
16,145,520000,S,520000,000,000,,36358906393216/
88,ACH CREDIT RECEIVED - Cust ID: AP0042168908 Desc: COMM OF ND Comp Name:
88,COMM OF ND Comp ID: ND TR DPT SEC: CCD Cust Name: Dakota Central Date:
88,11-28-18 Time: 05:52 AM Addenda: 705PA529-Ladson Maria
16,145,517500,S,517500,000,000,,36361011907140/
88,ACH CREDIT RECEIVED - Cust ID: 368908356002797 Desc: MERCH DEP Comp Name:
88,BANKCARD Comp ID: 1246825337 SEC: CCD Cust Name: WRTI Date: 11-28-18
88,Time: 05:36 AM Addenda: No Addenda

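I can easily read the text file into a text box. My thought is: can the file be loaded so that every line starts with "16,1", with the "88" lines concatenated onto their "16" line? Here is my start.
Dim fileName As String = File.ReadAllText("c:\Fargo.pdr")
TextBox1.Text = fileName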

2 Answers:

Answer 0 (score: 0)

Just split on the line delimiter? Edited to keep it simple. Assign the sample text to a string variable to test:

Dim s As String =
$"16,145,531299,S,531299,000,000,,36358906393192/ 
88,ACH CREDIT RECEIVED - Cust ID: AP0042168896 Desc: COMM OF ND Comp Name: 
88,COMM OF ND Comp ID: ND TR DPT SEC: CCD Cust Name: Dakota Central Date: 
88,11-28-18 Time: 05:52 AM Addenda: 705PA529-Kerry Adam 
16,145,520000,S,520000,000,000,,36358906393216/ 
88,ACH CREDIT RECEIVED - Cust ID: AP0042168908 Desc: COMM OF ND Comp Name: 
88,COMM OF ND Comp ID: ND TR DPT SEC: CCD Cust Name: Dakota Central Date: 
88,11-28-18 Time: 05:52 AM Addenda: 705PA529-Ladson Maria 
16,145,517500,S,517500,000,000,,36361011907140/ 
88,ACH CREDIT RECEIVED - Cust ID: 368908356002797 Desc: MERCH DEP Comp Name: 
88,BANKCARD Comp ID: 1246825337 SEC: CCD Cust Name: WRTI Date: 11-28-18 
88,Time: 05:36 AM Addenda: No Addenda"

You may want to clean up the text first. That is up to you.

s = s.Replace("88,", "") ' remove "88,"
s = s.Replace(Environment.NewLine, "") ' remove newlines (whatever newline char/string you use)
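The two Replace calls above remove every "88," in the text, including any that happens to sit inside a field value, and they only match the newline sequence of the current platform. A minimal sketch of a stricter cleanup, assuming continuation lines always start with "88," at the very beginning of a line; the module and function names (CleanupSketch, StripContinuationMarkers) are made up for illustration:

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module CleanupSketch
    ' Hypothetical helper: replaces every line break (CRLF or LF), any spaces
    ' or tabs just before it, and an "88," that directly follows it with a
    ' single space. An "88," in the middle of a line is left alone.
    Function StripContinuationMarkers(text As String) As String
        Return Regex.Replace(text, "[ \t]*\r?\n(88,)?", " ")
    End Function

    Sub Main()
        Dim sample = "16,145,531299,S,531299,000,000,,36358906393192/ " & vbCrLf &
                     "88,ACH CREDIT RECEIVED - Cust ID: AP0042168896 Desc: COMM OF ND Comp Name: " & vbCrLf &
                     "88,COMM OF ND Comp ID: ND TR DPT SEC: CCD Cust Name: Dakota Central Date: " & vbCrLf &
                     "88,11-28-18 Time: 05:52 AM Addenda: 705PA529-Kerry Adam"
        ' Prints the whole record on one line.
        Console.WriteLine(StripContinuationMarkers(sample))
    End Sub
End Module
```

The result is still one long string, so it can be split on "16,145," in the same way as the output of the two Replace calls.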

Splitting on 16,145 is about as simple as it gets. You get your collection with a single line of code:

Dim lines = s.Split({"16,145,"}, StringSplitOptions.RemoveEmptyEntries)

The resulting lines variable contains three entries:

  

531299,S,531299,000,000,,36358906393192/
88,ACH CREDIT RECEIVED - Cust ID: AP0042168896 Desc: COMM OF ND Comp Name:
88,COMM OF ND Comp ID: ND TR DPT SEC: CCD Cust Name: Dakota Central Date:
88,11-28-18 Time: 05:52 AM Addenda: 705PA529-Kerry Adam

520000,S,520000,000,000,,36358906393216/
88,ACH CREDIT RECEIVED - Cust ID: AP0042168908 Desc: COMM OF ND Comp Name:
88,COMM OF ND Comp ID: ND TR DPT SEC: CCD Cust Name: Dakota Central Date:
88,11-28-18 Time: 05:52 AM Addenda: 705PA529-Ladson Maria

517500,S,517500,000,000,,36361011907140/
88,ACH CREDIT RECEIVED - Cust ID: 368908356002797 Desc: MERCH DEP Comp Name:
88,BANKCARD Comp ID: 1246825337 SEC: CCD Cust Name: WRTI Date: 11-28-18
88,Time: 05:36 AM Addenda: No Addenda


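Now, to parse the records themselves, there are several ways to do it. The simplest (and the only one I will do for you) is to assume the fields always appear in the same order. Let's build a type to hold a record: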

Public Class Record
    Public ReadOnly Property CustID As String
    Public ReadOnly Property Desc As String
    Public ReadOnly Property CompName As String
    Public ReadOnly Property CompID As String
    Public ReadOnly Property SEC As String
    Public ReadOnly Property CustName As String
    Public ReadOnly Property Timestamp As Date
    Public ReadOnly Property Addenda As String
    ' Each argument is the raw text that followed the matching field label.
    Public Sub New(custID As String, desc As String, compName As String,
                   compID As String, sec As String, custName As String,
                   [date] As String, time As String, addenda As String)
        Me.CustID = custID.Trim()
        Me.Desc = desc.Trim()
        Me.CompName = compName.Trim()
        Me.CompID = compID.Trim()
        Me.SEC = sec.Trim()
        Me.CustName = custName.Trim()
        Me.Addenda = addenda.Trim()
        ' Combine the separate date and time fields into one value.
        ' DateTime.Parse uses the current culture; the sample dates are MM-dd-yy.
        Me.Timestamp = DateTime.Parse($"{[date].Trim()} {time.Trim()}")
    End Sub
End Class

It has a constructor that sets its properties and combines the date and time.
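Combining the date and time with DateTime.Parse depends on the culture of the machine running the code, so a value such as 11-28-18 may be rejected or read differently outside a US-style locale. A minimal sketch of a culture-independent variant, assuming the dates are always month-day-year with a two-digit year and the times are always 12-hour with AM/PM, as in the sample; the names TimestampSketch and ParseTimestamp are hypothetical:

```vbnet
Imports System
Imports System.Globalization

Module TimestampSketch
    ' Hypothetical helper: joins the "Date:" and "Time:" values and parses
    ' them with a fixed format instead of the current culture's rules.
    ' Assumed format: MM-dd-yy for the date, hh:mm tt for the time.
    Function ParseTimestamp(datePart As String, timePart As String) As Date
        Dim combined = $"{datePart.Trim()} {timePart.Trim()}"
        Return DateTime.ParseExact(combined, "MM-dd-yy hh:mm tt",
                                   CultureInfo.InvariantCulture)
    End Function

    Sub Main()
        Dim stamp = ParseTimestamp(" 11-28-18 ", " 05:52 AM ")
        ' Prints 2018-11-28 05:52
        Console.WriteLine(stamp.ToString("yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture))
    End Sub
End Module
```

ParseExact throws a FormatException when a value does not match the format, which makes malformed records visible instead of silently misread.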

Now you can create the list of records with the following code:

' Field labels in the order they are assumed to appear in every record.
Dim fieldNames = {"Cust ID:", "Desc:", "Comp Name:", "Comp ID:", "SEC:", "Cust Name:", "Date:", "Time:", "Addenda:"}
Dim records As New List(Of Record)()
For Each line In lines
    ' values(0) is the text before "Cust ID:"; the field values start at index 1.
    Dim values = line.Split(fieldNames, StringSplitOptions.RemoveEmptyEntries)
    records.Add(New Record(values(1), values(2), values(3), values(4), values(5), values(6), values(7), values(8), values(9)))
Next

Inspecting the first record in the debugger:

[image: debugger view of the first record]

Note: it turns out you do need to remove the 88, and NewLine after all, otherwise your values will be interleaved with them.
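The parsing loop in this answer relies on the fields always appearing in the same order. A minimal sketch of one of the other ways, which looks each value up by its label instead of by position. It assumes the record has already been cleaned of the 88, prefixes and line breaks, and that the nine labels from the sample are the only ones present; FieldParserSketch and ParseFields are hypothetical names:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module FieldParserSketch
    ' Matches any of the field labels seen in the sample data.
    Private ReadOnly LabelPattern As New Regex(
        "(Cust ID|Desc|Comp Name|Comp ID|SEC|Cust Name|Date|Time|Addenda):")

    ' Hypothetical helper: maps each label found in a cleaned record to the
    ' text between that label and the next one (or the end of the record).
    Function ParseFields(record As String) As Dictionary(Of String, String)
        Dim fields As New Dictionary(Of String, String)()
        Dim labels = LabelPattern.Matches(record)
        For i = 0 To labels.Count - 1
            Dim valueStart = labels(i).Index + labels(i).Length
            Dim valueEnd = If(i + 1 < labels.Count, labels(i + 1).Index, record.Length)
            fields(labels(i).Groups(1).Value) =
                record.Substring(valueStart, valueEnd - valueStart).Trim()
        Next
        Return fields
    End Function

    Sub Main()
        Dim record = "531299,S,531299,000,000,,36358906393192/ ACH CREDIT RECEIVED - " &
                     "Cust ID: AP0042168896 Desc: COMM OF ND Comp Name: COMM OF ND " &
                     "Comp ID: ND TR DPT SEC: CCD Cust Name: Dakota Central " &
                     "Date: 11-28-18 Time: 05:52 AM Addenda: 705PA529-Kerry Adam"
        For Each pair In ParseFields(record)
            Console.WriteLine($"{pair.Key} = {pair.Value}")
        Next
    End Sub
End Module
```

A missing or reordered field then shows up as a missing dictionary key rather than shifting every later value by one position.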

Answer 1 (score: 0)

I would use a regular expression pattern to iterate over each group of content between the "16,145" markers.

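' Requires Imports System.Text.RegularExpressions at the top of the file.
' text holds the file contents, e.g. from File.ReadAllText.
Dim regexPattern As String = "(?<=^16,145,)(.|\n)*?(?=(16,145,|\z))"
Dim regex As New Regex(regexPattern, RegexOptions.Multiline)
Dim matches As MatchCollection = regex.Matches(text)

For Each m As Match In matches
    ' Everything after "16,145," up to the next "16,145," or the end of the text.
    Dim content As String = m.Value
Next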

The output is 3 matches, the first being:

531299,S,531299,000,000,,36358906393192/
88,ACH CREDIT RECEIVED - Cust ID: AP0042168896 Desc: COMM OF ND Comp Name:
88,COMM OF ND Comp ID: ND TR DPT SEC: CCD Cust Name: Dakota Central Date:
88,11-28-18 Time: 05:52 AM Addenda: 705PA529-Kerry Adam
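Each match still contains the 88, prefixes and the line breaks. A minimal sketch that wraps the pattern in a function and flattens every match to a single line. Two things here are assumptions rather than part of the answer: the lookahead is anchored with ^ so that a "16,145," in the middle of a line does not end a record early, and the names RegexRecordsSketch and ExtractRecords are invented for illustration:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module RegexRecordsSketch
    ' The answer's pattern, with the lookahead also anchored to a line start.
    Private Const RecordPattern As String = "(?<=^16,145,)(.|\n)*?(?=(^16,145,|\z))"

    ' Hypothetical helper: returns one flattened string per record.
    Function ExtractRecords(text As String) As List(Of String)
        Dim records As New List(Of String)()
        For Each m As Match In Regex.Matches(text, RecordPattern, RegexOptions.Multiline)
            ' Fold each line break and a following "88," into a single space.
            records.Add(Regex.Replace(m.Value, "[ \t]*\r?\n(88,)?", " ").Trim())
        Next
        Return records
    End Function

    Sub Main()
        Dim text = "16,145,531299,S,531299,000,000,,36358906393192/" & vbLf &
                   "88,ACH CREDIT RECEIVED - Cust ID: AP0042168896 Desc: COMM OF ND Comp Name:" & vbLf &
                   "88,COMM OF ND Comp ID: ND TR DPT SEC: CCD Cust Name: Dakota Central Date:" & vbLf &
                   "88,11-28-18 Time: 05:52 AM Addenda: 705PA529-Kerry Adam" & vbLf &
                   "16,145,517500,S,517500,000,000,,36361011907140/" & vbLf &
                   "88,ACH CREDIT RECEIVED - Cust ID: 368908356002797 Desc: MERCH DEP Comp Name:"
        For Each record In ExtractRecords(text)
            Console.WriteLine(record)
        Next
    End Sub
End Module
```

Each returned string starts right after the "16,145," marker, so it has the same shape as an entry produced by splitting on "16,145," in the other answer.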