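I'm looking for a way to check whether an array contains all the elements of another array.
Here is the situation: I have two byte arrays (Byte()): one contains the bytes of a file, the other contains the bytes to compare.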
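For example, if the file contains the bytes 4D 5A 90 00 03 and the string to compare is 00 03, I want the function to return True. Otherwise it should obviously return False. So all the bytes in the string to compare must also be present in the file.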
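I have already searched the web for this. I tried the good old Contains() function, but for arrays it only works for comparing a single byte. You know, a single byte is far too little to identify a file!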
If possible, I'd like to do this as fast as possible.
I'm using VB.NET WinForms, VS 2013, .NET 4.5.1.
Thanks in advance,
FWhite
EDIT
Now I have a List(Of Byte()) like this:
00 25 85 69
00 41 52
00 78 96 32
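These are three Byte() arrays. How can I check whether my file byte array contains all of these values (the file must contain 00 25 85 69, 00 41 52 and 00 78 96 32)? I tried this code, but it doesn't work: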
Dim BytesToCompare As List(Of Byte()) = StringToByteArray(S.Split(":")(3))
For Each B As Byte() In BytesToCompare
    If FileBytes.All(Function(c) B.Contains(c)) Then
        'Contains
        TempResults.Add("T")
    Else
        TempResults.Add("F")
    End If
Next
If CountResults(TempResults) Then
    Return S
    Exit For
End If
The code in CountResults is:
Public Function CountResults(Input As List(Of String)) As Boolean
    Dim TrueCount As Integer = 0
    Dim FalseCount As Integer = 0
    Dim TotalCount As Integer = Input.Count
    For Each S In Input
        If S = "T" Then
            TrueCount = TrueCount + 1
        ElseIf S = "F" Then
            FalseCount = FalseCount + 1
        End If
    Next
    If TrueCount = TotalCount Then
        Return True
    ElseIf FalseCount > TrueCount Then
        Return False
    End If
End Function
Let me know if anything is unclear and I'll try to explain it better.
Thanks,
FWhite
Answer 0 (score: 1)
You can use the All function to check this. It returns a Boolean.
Dim orgByteArray() As Byte = {CByte(1), CByte(2), CByte(3)}
Dim testByteArray() As Byte = {CByte(1), CByte(2)}
Dim result = orgByteArray.All(Function(b) testByteArray.Contains(b))
'output for this case returns False
To compare a List(Of Byte()) against a Byte(), where the Byte() is the complete list of all the sub-arrays in the List(Of Byte()):
Dim filebytes() As Byte = {CByte(1), CByte(2), CByte(3), CByte(3), CByte(4), CByte(5), CByte(6), CByte(7), CByte(8)}
Dim bytesToCheck As New List(Of Byte())
bytesToCheck.Add(New Byte() {CByte(1), CByte(2), CByte(3)})
bytesToCheck.Add(New Byte() {CByte(3), CByte(4), CByte(5)})
bytesToCheck.Add(New Byte() {CByte(6), CByte(7), CByte(8)})
Dim temp As New List(Of Byte)
Array.ForEach(bytesToCheck.ToArray, Sub(byteArray) Array.ForEach(byteArray, Sub(_byte) temp.Add(_byte)))
Dim result = filebytes.All(Function(_byte) temp.Contains(_byte))
'output = True
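Answer 1 (score: 1)
I wondered whether something other than the brute-force approach might work, and found the Boyer-Moore search algorithm. Shamelessly translating the C and Java code from the Wikipedia article "Boyer–Moore string search algorithm" into VB.NET, I arrived at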
Public Class BoyerMooreSearch
    ' from C and Java code at http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm

    ' Returns the maximum length of the substring that ends at p and is a suffix of needle.
    Private Shared Function SuffixLength(needle As Byte(), p As Integer) As Integer
        Dim len As Integer = 0
        Dim j = needle.Length - 1
        Dim i = p ' start at p, as in the Wikipedia code (starting at 0 would ignore p)
        While i >= 0 AndAlso needle(i) = needle(j)
            i -= 1
            j -= 1
            len += 1
        End While
        Return len
    End Function

    ' Builds the scan offset table based on the positions at which a mismatch can occur.
    Private Shared Function GetOffsetTable(needle As Byte()) As Integer()
        Dim table(needle.Length - 1) As Integer
        Dim lastPrefixPosition = needle.Length
        For i = needle.Length - 1 To 0 Step -1
            If Isprefix(needle, i + 1) Then
                lastPrefixPosition = i + 1
            End If
            table(needle.Length - 1 - i) = lastPrefixPosition - i + needle.Length - 1
        Next
        For i = 0 To needle.Length - 2
            Dim slen = SuffixLength(needle, i)
            table(slen) = needle.Length - 1 - i + slen
        Next
        Return table
    End Function

    ' Is the part of needle that starts at p also a prefix of needle?
    Private Shared Function Isprefix(needle As Byte(), p As Integer) As Boolean
        Dim j = 0
        For i = p To needle.Length - 1
            If needle(i) <> needle(j) Then
                Return False
            End If
            j += 1
        Next
        Return True
    End Function

    ' Builds the jump table indexed by the mismatched byte value.
    Private Shared Function GetCharTable(needle As Byte()) As Integer()
        Const ALPHABET_SIZE As Integer = 256
        Dim table(ALPHABET_SIZE - 1) As Integer
        For i = 0 To table.Length - 1
            table(i) = needle.Length
        Next
        For i = 0 To needle.Length - 2
            table(needle(i)) = needle.Length - 1 - i
        Next
        Return table
    End Function

    ' Returns the index of the first occurrence of needle in haystack, or -1 if it is not found.
    Shared Function IndexOf(haystack As Byte(), needle As Byte()) As Integer
        If needle.Length = 0 Then
            Return 0
        End If
        Dim charTable = GetCharTable(needle)
        Dim offsetTable = GetOffsetTable(needle)
        Dim i = needle.Length - 1
        While i < haystack.Length
            Dim j = needle.Length - 1
            ' compare backwards from the end of the needle
            While j >= 0 AndAlso haystack(i) = needle(j)
                i -= 1
                j -= 1
            End While
            If j < 0 Then
                Return i + 1
            End If
            i += Math.Max(offsetTable(needle.Length - 1 - j), charTable(haystack(i)))
        End While
        Return -1
    End Function
End Class
and to test it (suspecting that the LINQ code provided by @OneFineDay would demolish it on performance):
Imports System.IO
Imports System.Text
Module Module1

    ' the byte sequences to be searched for (shared by CreateTestFile, Main and Test)
    Dim bytesToCheckFor As List(Of Byte())
    Dim rand As New Random

    Function GetTestByteArrays() As List(Of Byte())
        Dim testBytes As New List(Of Byte())
        ' N.B. adjust the numbers used in CreateTestFile according to the quantity (e.g. 10) of testData used
        For i = 1 To 10
            testBytes.Add(Encoding.ASCII.GetBytes("ABCDEFgfdhgf" & i.ToString() & "sdfgjdfjFGH"))
        Next
        Return testBytes
    End Function

    Sub CreateTestFile(f As String)
        ' Make a 4MB file of test data
        ' write a load of bytes which are not going to be in the
        ' judiciously chosen data to search for...
        Using fs As New FileStream(f, FileMode.Create, FileAccess.Write)
            For i = 0 To 2 ^ 22 - 1
                fs.WriteByte(CByte(rand.Next(128, 256)))
            Next
        End Using
        ' ... and put the known data into the test data
        Using fs As New FileStream(f, FileMode.Open)
            For i = 0 To bytesToCheckFor.Count - 1
                fs.Position = CLng(i * 2 ^ 18)
                fs.Write(bytesToCheckFor(i), 0, bytesToCheckFor(i).Length)
            Next
        End Using
    End Sub

    Sub Main()
        ' the byte sequences to be searched for
        bytesToCheckFor = GetTestByteArrays()
        ' Make a test file so that the data can be inspected
        Dim testFileName As String = "C:\temp\testbytes.dat"
        CreateTestFile(testFileName)
        Dim fileData = File.ReadAllBytes(testFileName)
        Dim sw As New Stopwatch
        Dim containsP As Boolean = True
        sw.Reset()
        sw.Start()
        For i = 0 To bytesToCheckFor.Count - 1
            If BoyerMooreSearch.IndexOf(fileData, bytesToCheckFor(i)) = -1 Then
                containsP = False
                Exit For
            End If
        Next
        sw.Stop()
        Console.WriteLine("Boyer-Moore: {0} in {1}", containsP, sw.ElapsedTicks)
        sw.Reset()
        sw.Start()
        Dim temp As New List(Of Byte)
        Array.ForEach(bytesToCheckFor.ToArray, Sub(byteArray) Array.ForEach(byteArray, Sub(_byte) temp.Add(_byte)))
        Dim result = fileData.All(Function(_byte) temp.Contains(_byte))
        sw.Stop()
        Console.WriteLine("LINQ: {0} in {1}", result, sw.ElapsedTicks)
        Console.ReadLine()
    End Sub

End Module
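Now, I know that the byte sequences to be matched are in the test file (I confirmed that by searching for them with a hex editor) and, assuming (oh dear!) that I am using the other method correctly, the latter doesn't work whereas mine does: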
Boyer-Moore: True in 23913
LINQ: False in 3224
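I also tested OneFineDay's first code example with small and large patterns to match, and for fewer than seven or eight bytes that code was faster than Boyer-Moore. So, if you care to check the size of the data you're searching in and the size of the pattern you're looking for, Boyer-Moore could be a better match for your "If possible, I'd like to do this as fast as possible."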
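For reference, the brute-force approach mentioned at the start of this answer could look something like the sketch below. NaiveIndexOf and ContainsAllSequences are hypothetical names made up for illustration; they are not part of any library and were not part of the timings above. The sketch simply tries every start position in turn, which is probably adequate for short patterns, and it looks for each pattern as a contiguous run of bytes.

```vbnet
Imports System
Imports System.Collections.Generic

Module NaiveSearchSketch

    ' Hypothetical helper: index of the first contiguous occurrence of
    ' needle in haystack, or -1 if there is none.
    Function NaiveIndexOf(haystack As Byte(), needle As Byte()) As Integer
        If needle.Length = 0 Then Return 0
        ' If needle is longer than haystack the upper bound is negative
        ' and the loop body never runs.
        For start = 0 To haystack.Length - needle.Length
            Dim j = 0
            While j < needle.Length AndAlso haystack(start + j) = needle(j)
                j += 1
            End While
            If j = needle.Length Then Return start
        Next
        Return -1
    End Function

    ' Hypothetical helper: True only if every pattern occurs somewhere in haystack.
    Function ContainsAllSequences(haystack As Byte(), patterns As IEnumerable(Of Byte())) As Boolean
        For Each pattern In patterns
            If NaiveIndexOf(haystack, pattern) = -1 Then Return False
        Next
        Return True
    End Function

    Sub Main()
        ' The example bytes from the question: 4D 5A 90 00 03
        Dim fileData As Byte() = {&H4D, &H5A, &H90, &H0, &H3}
        Dim patterns As New List(Of Byte()) From {
            New Byte() {&H0, &H3},
            New Byte() {&H5A, &H90}
        }
        Console.WriteLine(ContainsAllSequences(fileData, patterns))      ' prints True
        Console.WriteLine(NaiveIndexOf(fileData, New Byte() {&H0, &H2})) ' prints -1
    End Sub

End Module
```

Swapping NaiveIndexOf for BoyerMooreSearch.IndexOf inside ContainsAllSequences should give the same answers, so the two can be timed against each other on your own data.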
EDIT
Further to the OP's uncertainty as to whether my suggested method works, here is a test with very small sample data:
Sub Test()
    bytesToCheckFor = New List(Of Byte())
    bytesToCheckFor.Add({0, 1}) ' the data to search for
    bytesToCheckFor.Add({1, 2})
    bytesToCheckFor.Add({0, 2})
    Dim fileData As Byte() = {0, 1, 2} ' the file data
    ' METHOD 1: Boyer-Moore
    Dim containsP As Boolean = True
    For i = 0 To bytesToCheckFor.Count - 1
        If BoyerMooreSearch.IndexOf(fileData, bytesToCheckFor(i)) = -1 Then
            containsP = False
            Exit For
        End If
    Next
    Console.WriteLine("Boyer-Moore: {0}", containsP)
    ' METHOD 2: LINQ
    Dim temp As New List(Of Byte)
    Array.ForEach(bytesToCheckFor.ToArray, Sub(byteArray) Array.ForEach(byteArray, Sub(_byte) temp.Add(_byte)))
    Dim result = fileData.All(Function(_byte) temp.Contains(_byte))
    Console.WriteLine("LINQ: {0}", result)
    Console.ReadLine()
End Sub
Output:
Boyer-Moore: False
LINQ: True
Also, I renamed the variables in the original Main() method, in the hope that they are now more meaningful.