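I'm trying to write a program that checks file B (possibly bad) against file A (known good), removes every known-good line from the potentially bad file, and leaves only the potentially bad lines. The problem is that every line contains a timestamp. How can I compare the content of each line after the timestamp?
For example, file A: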
MSI (c) (74:80) [08:09:43:718]: Resetting cached policy values
MSI (c) (74:80) [08:09:43:718]: Machine policy value 'Debug' is 0
MSI (c) (74:80) [08:09:43:718]: ******* RunEngine:
compared with file B:
MSI (c) (E8:DC) [18:35:18:573]: Resetting cached policy values
MSI (c) (E8:DC) [18:35:18:573]: Machine policy value 'Debug' is 0
MSI (c) (E8:DC) [18:35:18:573]: ******* RunEngine:
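These lines should all be considered equal. I don't have an example of a line that differs, but it is essentially whatever is left once the matching lines have been removed.
My code so far: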
Imports System.IO

Public Class Form1
    Dim compto As New List(Of String)
    Dim compfrom As New List(Of String)

    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        Standard("filea.LOG")
        Readfile("fileb.LOG")
        Writefile("difference.txt")
    End Sub

    ' Collects the unique lines of the known-good file.
    Public Sub Standard(ByVal Path As String)
        Using r As StreamReader = New StreamReader(Path)
            Dim line As String = Nothing
            line = r.ReadLine
            Do While (Not line Is Nothing)
                If Not compto.Contains(line) Then compto.Add(line)
                ' Read the next line last, so the first line is not skipped
                ' and the end-of-file Nothing is never added to the list.
                line = r.ReadLine
            Loop
        End Using
    End Sub

    ' Collects the lines of the suspect file that have no exact match in compto.
    Public Sub Readfile(ByVal Path As String)
        ' Attempted pattern; currently unused (its parenthesis is also unbalanced).
        Dim pattern As String = "{30}([A-Za-z0-9\-]+"
        Using r As StreamReader = New StreamReader(Path)
            Dim line As String = Nothing
            line = r.ReadLine
            Do While (Not line Is Nothing)
                If Not compto.Contains(line) Then compfrom.Add(line)
                line = r.ReadLine
            Loop
        End Using
    End Sub

    ' Writes the remaining lines to a file and shows them in the list box.
    Public Sub Writefile(ByVal Path As String)
        Using sw As StreamWriter = New StreamWriter(Path)
            For Each line As String In compfrom
                sw.WriteLine(line)
                ListBox1.Items.Add(line)
            Next
        End Using
    End Sub
End Class
So far this code removes exact matches, but this is where I'm stuck. Any help would be great.
Thanks.
Solution (edit):
' Requires Imports System.IO, System.Linq and System.Text.RegularExpressions.
Public Sub Writefile(ByVal Path As String)
    ' Take the text after the "]: " that ends the timestamp.
    Dim GetLine As Func(Of String, String) = Function(line) Regex.Match(line, "\]: (.*)").Groups(1).Value
    Dim Diff As New HashSet(Of String)(File.ReadLines("filea.log").Select(GetLine))
    ' Keep the message texts that occur in exactly one of the two files.
    Diff.SymmetricExceptWith(File.ReadLines("fileb.log").Select(GetLine))
    Using sw As StreamWriter = New StreamWriter(Path)
        For Each line As String In Diff
            sw.WriteLine(line)
            ListBox1.Items.Add(line)
        Next
    End Using
End Sub
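The set in the solution above holds only the message text, so the timestamps are gone from the output, and lines that occur only in file A are reported as well. If the original lines of file B are wanted instead, something along these lines might work. This is only a sketch: the module name and the helpers MessageOf and UnmatchedLines are made up for illustration, and lines without a "]: " marker are assumed to be compared as a whole.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq
Imports System.Text.RegularExpressions

Module KeepOriginalLines
    ' Returns the text after the "]: " that ends the timestamp,
    ' or the whole line when no such marker is present (an assumption).
    Function MessageOf(line As String) As String
        Dim m As Match = Regex.Match(line, "\]: (.*)")
        Return If(m.Success, m.Groups(1).Value, line)
    End Function

    ' Yields the lines of fileB, timestamp included, whose message text
    ' never occurs in fileA.
    Function UnmatchedLines(fileA As String, fileB As String) As IEnumerable(Of String)
        Dim known As New HashSet(Of String)(File.ReadLines(fileA).Select(Function(l) MessageOf(l)))
        Return File.ReadLines(fileB).Where(Function(l) Not known.Contains(MessageOf(l)))
    End Function

    Sub Main()
        ' File names taken from the question; adjust as needed.
        File.WriteAllLines("difference.txt", UnmatchedLines("filea.log", "fileb.log"))
    End Sub
End Module
```

Because the lookup goes through a HashSet, each line of file B costs roughly one hash lookup rather than a scan of a list, which should matter for large logs.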
Answer 0 (score: 2)
Based on this link, try this:
' Take the text after the "]: " that ends the timestamp.
Dim GetLine As Func(Of String, String) = Function(line) Regex.Match(line, "\]: (.*)").Groups(1).Value
' If the timestamp always ends at the same position, it may be more efficient to
' avoid regular expressions and use this instead. YMMV.
GetLine = Function(line) line.Substring(32)
Dim Diff = New HashSet(Of String)(File.ReadLines("filea.LOG").Select(GetLine))
Diff.SymmetricExceptWith(File.ReadLines("fileb.LOG").Select(GetLine))
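One thing to be aware of: SymmetricExceptWith keeps items that occur in exactly one of the two inputs, so lines found only in file A end up in the result too. If only the lines unique to file B matter, ExceptWith may be the better fit. A small sketch with made-up sample data (the module name and the "Only in ..." strings are purely illustrative):

```vbnet
Imports System
Imports System.Collections.Generic

Module SetDifferenceDemo
    Sub Main()
        ' Hypothetical message texts, already stripped of their timestamps.
        Dim a As String() = {"Resetting cached policy values", "Only in A"}
        Dim b As String() = {"Resetting cached policy values", "Only in B"}

        ' Symmetric difference: items found in exactly one of the two inputs.
        Dim either As New HashSet(Of String)(a)
        either.SymmetricExceptWith(b)

        ' One-sided difference: items of b that do not occur in a.
        Dim onlyB As New HashSet(Of String)(b)
        onlyB.ExceptWith(a)

        Console.WriteLine(String.Join(", ", either))
        Console.WriteLine(String.Join(", ", onlyB))
    End Sub
End Module
```

Here the first set ends up with both "Only in A" and "Only in B", while the second contains just "Only in B". A HashSet does not promise any particular enumeration order, so the output order should not be relied on.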
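Answer 1 (score: 1)
It looks like you are comparing each unique line in File A against each line in File B, and the line header MSI (c) (74:80) [08:09:43:718]: is irrelevant to that comparison and has a constant length.
You can change the code (4 instances):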
line = r.ReadLine
to:
line = r.ReadLine.Substring(32)
Substring() with a single integer argument returns the remainder of the string, starting at the specified character position.
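A caveat with calling Substring directly on the result: ReadLine returns Nothing at the end of the file, and Substring(32) throws on a line shorter than 32 characters. A small guard could cover both cases. This is just a sketch; StripHeader and HeaderLength are hypothetical names, and the width of 32 is counted from the sample header in the question, assuming every header has that same width.

```vbnet
Module HeaderStripping
    ' Assumed header width, counted from "MSI (c) (74:80) [08:09:43:718]: ".
    Const HeaderLength As Integer = 32

    ' Returns Nothing for Nothing (end of file), returns the line unchanged
    ' when it is too short to hold a full header, and otherwise returns
    ' the text that follows the header.
    Function StripHeader(line As String) As String
        If line Is Nothing Then Return Nothing
        If line.Length < HeaderLength Then Return line
        Return line.Substring(HeaderLength)
    End Function
End Module
```

The four reads would then become line = StripHeader(r.ReadLine), and the existing Do While (Not line Is Nothing) test keeps working unchanged.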