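I am looking for a way to search a byte array for every match of the patterns held in a string array, and to save those matches to a text file.
So far I have loaded a file and converted its data to a byte array. I wrote a for loop that runs as many searches as the length of my byte array allows.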
byte[] test = System.IO.File.ReadAllBytes(openFileDialog1.FileName);
string hex = BitConverter.ToString(test).Replace("-", string.Empty);
for (int i = 0; i < hex.Length; i++) {
    //String array with some of the patterns I'm looking for in the byte array
    string[] patterns = { "05805A6C", "0580306C", "05801B6C" };
    //I get the index if the pattern is found at i position
    int indice = hex.IndexOf("05805A6C", i);
    //Do some calculations to get the offset I desire to register
    indice = indice + 8;
    int index = (indice / 2);
    //Transform the index into hexadecimal
    string outputHex = int.Parse(index.ToString()).ToString("X");
    //Output the index as an hexadecimal offset address
    MessageBox.Show("0x" + outputHex);
    // i gets the value of the indice and the loop starts again at this position
    i = indice;
}
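My approach only works for a single pattern. Right now I get every offset address of "05805A6C" in the file, but my goal is to run a complete search over the whole pattern array.
How can I perform the same search while taking every pattern in the string array into account?
Answer 0: (score: 1)
I haven't run this against a full set of test cases, but...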
using System.Collections.Generic;
using System.Linq;

public static class ByteArrayExtensions
{
    public static int IndexOfAny(this byte[] source, byte[][] anyOf)
    {
        return IndexOfAny(source, anyOf, 0);
    }

    public static int IndexOfAny(this byte[] source, byte[][] anyOf, int startIndex)
    {
        // Drop patterns that are null, empty, or longer than the source itself
        var sanitisedAnyOf = new List<byte[]>(anyOf.Where(b => b != null && b.Length > 0 && b.Length <= source.Length));
        if (startIndex < 0) startIndex = 0;
        for (int i = startIndex; i < source.Length; ++i)
        {
            var testByte = source[i];
            // Check all the anyOf arrays to see if they start a new possible match, and could fit in the remaining data
            for (int anyOfIndex = 0; anyOfIndex < sanitisedAnyOf.Count; ++anyOfIndex)
            {
                if (sanitisedAnyOf[anyOfIndex][0] == testByte && sanitisedAnyOf[anyOfIndex].Length + i <= source.Length)
                {
                    // This is a possible match here, scan forwards to see if it is a complete match
                    int checkScanIndex;
                    for (checkScanIndex = 0; checkScanIndex < sanitisedAnyOf[anyOfIndex].Length; ++checkScanIndex)
                    {
                        if (source[i + checkScanIndex] != sanitisedAnyOf[anyOfIndex][checkScanIndex])
                        {
                            // It didn't match
                            break;
                        }
                    }
                    if (checkScanIndex == sanitisedAnyOf[anyOfIndex].Length)
                    {
                        // This completely matched
                        return i;
                    }
                }
            }
        }
        return -1;
    }
}
Test code:
void Test()
{
    var anyOf = new byte[][]
    {
        new byte[] { 0xF4, 0xF0 },
        new byte[] { 0x05, 0x80, 0x5A, 0x6C },
        new byte[] { 0x05, 0x80, 0x30, 0x6C },
        new byte[] { 0x05, 0x80, 0x1B, 0x6C },
        new byte[] { 0x05, 0x05, 0x05, 0x6C },
        new byte[] { },
        new byte[1024]
    };
    var source = new byte[]
    {
        0xF4, 0xF0, 0x58, 0x05, 0xA6, 0xCD, 0x34, 0x05, 0x80, 0xF3, 0x67, 0x5C, 0x05, 0x80, 0x5A, 0x6C,
        0x58, 0xBF, 0x05, 0x80, 0x5C, 0xFE, 0xB4, 0x8C, 0x05, 0x80, 0x30, 0x05, 0x80, 0x30, 0x6C, 0x77,
        0x11, 0x70, 0x99, 0xD9, 0xAA, 0xCE, 0x95, 0xDF, 0x17, 0x11, 0x83, 0xCB, 0xF2, 0x0B, 0x73, 0xB8,
        0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x6C, 0x5A, 0x78, 0x05, 0x80, 0x1B, 0x6C
    };
    var matchIndices = new List<int>();
    int matchIndex = -1;
    while ((matchIndex = source.IndexOfAny(anyOf, matchIndex + 1)) >= 0)
    {
        matchIndices.Add(matchIndex);
    }
    var output = string.Join(", ", matchIndices.Select(i => i.ToString()));
}
Returns:
output = 0, 12, 27, 54, 60
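This extension method on byte arrays adds an IndexOfAny() method that takes an array of byte arrays and looks for the first match of any of them in the source array. I believe this solves the original problem while also fixing a couple of potential issues introduced by comparing as hex.
The problems I have with the string hex comparison are:
For an example of the second case, a match that starts halfway through a byte, look at source[1] through source[5], which contain: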
{ 0xF0, 0x58, 0x05, 0xA6, 0xCD }.AsHex() => "F05805A6CD"
where the hex comparison would incorrectly match the bytes:
{ 0x05, 0x80, 0x5A, 0x6C }.AsHex() => "05805A6C"
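If you do stay with the hex string, one way to avoid this kind of false match is to accept only hits that start on an even character index, since each byte occupies exactly two hex characters. The following is a minimal sketch of that idea; HexAlignmentDemo and AlignedByteOffsets are hypothetical names introduced here, not part of the code above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class HexAlignmentDemo
{
    // Returns the byte offsets at which hexPattern occurs in hex, keeping only
    // matches that start on an even character index (a whole-byte boundary).
    public static List<int> AlignedByteOffsets(string hex, string hexPattern)
    {
        var offsets = new List<int>();
        if (string.IsNullOrEmpty(hexPattern)) return offsets;

        int pos = hex.IndexOf(hexPattern, StringComparison.Ordinal);
        while (pos >= 0)
        {
            // An odd position means the match begins in the middle of a byte
            if (pos % 2 == 0) offsets.Add(pos / 2);
            // Advance by one character so overlapping occurrences are not skipped
            pos = hex.IndexOf(hexPattern, pos + 1, StringComparison.Ordinal);
        }
        return offsets;
    }
}
```

For the bytes { 0xF0, 0x58, 0x05, 0xA6, 0xCD, 0x05, 0x80, 0x5A, 0x6C } a plain IndexOf on the hex string finds "05805A6C" at character 1, which is the false match, while the sketch should report only byte offset 5.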
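I am looking for a more efficient approach that can read the source data from a stream rather than a byte array. That would allow much larger files to be scanned, because they would not have to be loaded into memory for the comparison. I ran into some problems when attempting this: a short match that starts later in the array took priority over a longer match that started earlier but had not yet finished comparing. For example: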
var source = new byte[] { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F };
var anyOf = new byte[][]
{
    new byte[] { 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09 },
    new byte[] { 0x05 }
};
would return the match for 0x05 at index 5 instead of the correct match at index 3, whose comparison had not yet completed.
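One possible way around that priority problem is to test candidate start positions strictly in stream order, and to hold back a position until enough bytes have been buffered for the longest pattern (or the stream has ended). The following is a minimal sketch under that assumption, not a tuned implementation; StreamPatternScanner, FindAll, and chunkSize are hypothetical names introduced here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration only.
public static class StreamPatternScanner
{
    // Yields, in ascending order, the stream offset of every position
    // at which at least one of the patterns starts.
    public static IEnumerable<long> FindAll(Stream stream, byte[][] patterns, int chunkSize = 4096)
    {
        byte[][] valid = patterns.Where(p => p != null && p.Length > 0).ToArray();
        if (valid.Length == 0) yield break;
        int maxLen = valid.Max(p => p.Length);

        // Room for one chunk plus the untested tail carried over from the previous pass
        var buffer = new byte[Math.Max(chunkSize, 1) + maxLen - 1];
        int filled = 0;         // number of valid bytes in buffer
        long bufferOffset = 0;  // stream offset of buffer[0]
        bool endOfStream = false;

        while (!endOfStream)
        {
            int read = stream.Read(buffer, filled, buffer.Length - filled);
            endOfStream = read == 0;
            filled += read;

            // A start position is tested only once the longest pattern could fit
            // after it, or once no more data is coming.
            int lastStart = endOfStream ? filled - 1 : filled - maxLen;
            int start = 0;
            for (; start <= lastStart; start++)
            {
                if (MatchesAny(buffer, filled, start, valid))
                    yield return bufferOffset + start;
            }

            // Move the untested tail to the front of the buffer for the next pass
            int keep = filled - start;
            Array.Copy(buffer, start, buffer, 0, keep);
            bufferOffset += start;
            filled = keep;
        }
    }

    private static bool MatchesAny(byte[] buffer, int filled, int start, byte[][] patterns)
    {
        foreach (byte[] p in patterns)
        {
            if (start + p.Length > filled) continue; // pattern would run past the data
            int k = 0;
            while (k < p.Length && buffer[start + k] == p[k]) k++;
            if (k == p.Length) return true;
        }
        return false;
    }
}
```

With the source and anyOf arrays from the example, StreamPatternScanner.FindAll(new MemoryStream(source), anyOf) should yield 3 first and then 5, because index 3 is tested before index 5 regardless of pattern length. Only one chunk plus at most the longest pattern's length minus one bytes is held in memory at a time.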
Hope this helps.
Answer 1: (score: -1)
I'm not sure I understand your intent, but this is what I have in mind:
// String array with some of the patterns I'm looking for in the byte array
string[] patterns = { "05805A6C", "0580306C", "05801B6C" };
foreach (string p in patterns)
{
    int i = 0;
    // Terminate the loop when no more occurrences are found;
    // using a for loop with i++ is probably wrong since
    // it skips one additional character after a found pattern
    while (true)
    {
        // Index of the pattern at or after position i, -1 if not found
        int indice = hex.IndexOf(p, i);
        // Stop here so that no bogus offset is reported for the failed search
        if (indice == -1) break;
        // Do some calculations to get the offset I desire to register
        i = indice + p.Length; // skip the pattern occurrence itself
        int index = i / 2;
        // Transform the index into hexadecimal
        string outputHex = index.ToString("X");
        // Output the index as a hexadecimal offset address
        MessageBox.Show("0x" + outputHex);
    }
}
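Handling the patterns one at a time also gives you a more orderly output. In addition, you could define a dedicated method for the single-pattern search.
Edit: regarding the ordering question (I assume you mean reordering from largest to smallest, right?), change the code like this: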
// String array with some of the patterns I'm looking for in the byte array
string[] patterns = { "05805A6C", "0580306C", "05801B6C" };
foreach (string p in patterns)
{
    List<int> allIndices = new List<int>();
    int i = 0;
    int indice = 0;
    // Terminate the loop when no more occurrences are found;
    // using a for loop with i++ is probably wrong since
    // it skips one additional character after a found pattern
    while (indice != -1)
    {
        // Index of the pattern at or after position i, -1 if not found
        indice = hex.IndexOf(p, i);
        i = indice + p.Length; // skip the pattern occurrence itself
        // Temporarily store the positions that were found
        if (indice != -1) allIndices.Add(i);
    }
    // Matches were found in ascending order, so reversing gives largest first
    allIndices.Reverse();
    // Separate loop for the output
    foreach (int j in allIndices)
    {
        // Do some calculations to get the offset I desire to register
        int index = j / 2;
        // Transform the index into hexadecimal
        string outputHex = index.ToString("X");
        // Output the index as a hexadecimal offset address
        MessageBox.Show("0x" + outputHex);
    }
}
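Since the question asks for the offsets to end up in a text file rather than in message boxes, the loops above could be folded into something like the following. It is only a sketch under a few assumptions: matches are accepted only on whole-byte boundaries (even character positions), offsets from all patterns are merged and sorted ascending, and the names OffsetReport, OffsetsAfterMatches, and WriteReport are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration only.
public static class OffsetReport
{
    // Byte offsets just past each byte-aligned occurrence of any pattern, ascending.
    public static List<int> OffsetsAfterMatches(byte[] data, string[] hexPatterns)
    {
        string hex = BitConverter.ToString(data).Replace("-", string.Empty);
        var offsets = new List<int>();
        foreach (string p in hexPatterns)
        {
            if (string.IsNullOrEmpty(p)) continue;
            int pos = hex.IndexOf(p, StringComparison.Ordinal);
            while (pos >= 0)
            {
                // Two hex characters per byte, so halve the position past the pattern
                if (pos % 2 == 0) offsets.Add((pos + p.Length) / 2);
                pos = hex.IndexOf(p, pos + 1, StringComparison.Ordinal);
            }
        }
        offsets.Sort();
        return offsets;
    }

    // Writes one "0x..." line per offset to the given file.
    public static void WriteReport(string path, IEnumerable<int> offsets)
    {
        File.WriteAllLines(path, offsets.Select(o => "0x" + o.ToString("X")));
    }
}
```

Calling offsets.Reverse() before WriteReport would give the largest-to-smallest order used in the edit above.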