C#: iterate over a binary file and build a text file from the bytes found

Time: 2015-10-10 16:30:14

Tags: c# binary filestream binaryreader

I'll try to be more specific.

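I have a binary file that contains some text sections. I want to search the binary file for certain byte sequences and, whenever a sequence is found, take the bytes as an array and build a text file from them.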
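As a starting point for the search described above, here is a minimal sketch of finding a byte pattern inside a byte array. The class name `ByteSearch` and the method `IndexOf` are hypothetical names chosen for illustration, and the sketch assumes the file is small enough to load into memory with `File.ReadAllBytes`.

```csharp
using System;

static class ByteSearch
{
    // Returns the index of the first occurrence of pattern in data
    // at or after start, or -1 if there is none.
    // An empty pattern is treated as "not found".
    public static int IndexOf(byte[] data, byte[] pattern, int start = 0)
    {
        if (pattern.Length == 0) return -1;

        // Stop early enough that the pattern always fits inside data.
        for (int i = Math.Max(start, 0); i <= data.Length - pattern.Length; i++)
        {
            int n = 0;
            while (n < pattern.Length && data[i + n] == pattern[n]) n++;
            if (n == pattern.Length) return i;
        }
        return -1;
    }
}
```

Calling `ByteSearch.IndexOf(bytes, pattern, lastIndex + 1)` in a loop until it returns -1 would visit every occurrence in the file.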

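So this step has to be repeated until the end of the binary file. I have already used BinaryReader to search for a byte sequence in order to validate the binary file, but now I am stuck on how to continue and combine the two.

My other problem is that I have to skip certain parts of the binary file until the next sequence is found.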

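For example, I found that the first sequence is at 0x10 and is 10 bytes long. Then I have to skip 32 bytes, after which another byte sequence starts at byte x and runs until a trailing byte that marks the end of the sequence.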
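The layout in this example could be read roughly as follows. This is only a sketch under assumptions: `RecordReader` and `ReadRecord` are hypothetical names, the variable-length sequence is assumed to begin immediately after the skipped bytes (the question leaves "byte x" unspecified), and the trailer is assumed to be a single byte value that is passed in as a parameter.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

static class RecordReader
{
    // Reads fixedLength bytes at fixedOffset, skips skipCount bytes,
    // then reads bytes until the trailer byte (not included) or end of stream.
    public static (byte[] Fixed, byte[] Variable) ReadRecord(
        Stream stream, long fixedOffset, int fixedLength, int skipCount, byte trailer)
    {
        // leaveOpen: true so the caller keeps ownership of the stream.
        using (var reader = new BinaryReader(stream, Encoding.ASCII, leaveOpen: true))
        {
            stream.Seek(fixedOffset, SeekOrigin.Begin);
            byte[] fixedPart = reader.ReadBytes(fixedLength);

            // Skip the bytes that are of no interest.
            stream.Seek(skipCount, SeekOrigin.Current);

            var variable = new List<byte>();
            while (stream.Position < stream.Length)
            {
                byte b = reader.ReadByte();
                if (b == trailer) break;
                variable.Add(b);
            }
            return (fixedPart, variable.ToArray());
        }
    }
}
```

With the numbers from the example, the call would look like `RecordReader.ReadRecord(stream, 0x10, 10, 32, trailer)`, where `trailer` is whatever byte ends the sequence in the actual file. The sketch needs a seekable stream such as `FileStream` or `MemoryStream`.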

Each time a byte sequence is found, I have to save it into the text file, and at the end write the file to disk.
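One possible way to collect the sequences and write them out in a single step is sketched below. `TextOutput`, `BuildText` and `Save` are hypothetical names, and the sketch assumes each sequence should become one line of text in an encoding chosen by the caller.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

static class TextOutput
{
    // Decodes each byte sequence and joins the results, one per line.
    public static string BuildText(IEnumerable<byte[]> sequences, Encoding encoding)
    {
        var sb = new StringBuilder();
        foreach (byte[] sequence in sequences)
        {
            sb.AppendLine(encoding.GetString(sequence));
        }
        return sb.ToString();
    }

    // Writes all collected sequences to disk in one call.
    public static void Save(string path, IEnumerable<byte[]> sequences, Encoding encoding)
    {
        File.WriteAllText(path, BuildText(sequences, encoding), encoding);
    }
}
```

Building the text in memory first keeps the file handling in one place; for very large outputs, writing line by line with a `StreamWriter` may be the better choice.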

Any help?

2 answers:

Answer 0 (score: 0)

Something like this, then:

using System;

class Program
{
    const string filename = "some file";

    static void Main(string[] args)
    {
        // Load the whole file into memory; fine for small files.
        byte[] bytes = System.IO.File.ReadAllBytes(filename);

        // Markers to look for. Each char is compared with one byte, so they must be ASCII.
        string[] find = new string[] { "me", "you" };

        // Number of bytes to skip after each match.
        int offsetAfterFind = 32;

        int pos = 0;

        while (pos < bytes.Length)
        {
            bool isFound = false;
            int index = 0;
            // Try every marker at the current position.
            while (!isFound && index < find.Length)
            {
                bool isMatch = true;
                for (int n = 0; n < find[index].Length; n++)
                {
                    // Stop at the first mismatch, or when the marker would run past the end of the file.
                    if (pos + n >= bytes.Length || bytes[pos + n] != find[index][n])
                    {
                        isMatch = false;
                        break;
                    }
                }
                if (isMatch)
                {
                    isFound = true;
                    break;
                }
                index++;
            }
            if (isFound)
            {
                Console.WriteLine(String.Format("Found {0} at {1}", find[index], pos));
                // Jump past the marker and the bytes to skip.
                pos += find[index].Length + offsetAfterFind;
            }
            else
            {
                pos++;
            }
        }

    }
}

Answer 1 (score: 0)

OK, I managed to do it. Maybe this will be useful to someone else:

public static void ConvertToSRTSubs()
    {
        byte [] openingTimeWindow = Encoding.ASCII.GetBytes("["); // The timespan in the binary is wrapped in square brackets
        byte [] nextOpening = Encoding.ASCII.GetBytes("[00"); // Marks the next timespan; needed to find the end of the sentence, because there is a fixed size between a sentence and the next timespan
        byte [] closingTimeWindow = Encoding.ASCII.GetBytes("]"); // End of the timespan
        int found = 0;  // Counts the timespan matches; written out as the subtitle number
        int backPos = 0; // Position of the current opening bracket
        int nextPos = 0;
        int sentenceStartPos = 0;
        int sentenceSize = 0;
        int newStartFound = 0;
        string srtTime = String.Empty;
        string srtTimeRaw = String.Empty;
        string srtSentence = String.Empty;
        // Note: coursePath, hashedSubFileName, Video.outPath, ext and span are not declared in this snippet; presumably they are defined elsewhere in the poster's code.

        byte[] array = File.ReadAllBytes(Path.Combine(coursePath, hashedSubFileName));
        try
        {
            using (StreamWriter s = new StreamWriter(Video.outPath + ext, false))
            {
                for (int i = 0; i < array.Length; i++)
                {
                    if (i + 12 < array.Length && openingTimeWindow[0] == array[i] && closingTimeWindow[0] == array[i + 12]) // bounds check avoids reading past the end of the array
                    {
                        found++;
                        s.WriteLine(found);                            
                        try
                        {
                            backPos = i;
                            for (i = backPos + 12; i < array.Length; i++ )
                                {
                                    if (newStartFound == 1) 
                                        break;
                                    if (i + 2 < array.Length && nextOpening[0] == array[i] && nextOpening[1] == array[i + 1] && nextOpening[2] == array[i + 2])
                                    {
                                        nextPos = i - 19;
                                        newStartFound++;
                                    }
                                }
                            i = backPos;
                            newStartFound = 0;
                            sentenceStartPos = backPos + 27;
                            sentenceSize = nextPos - sentenceStartPos;
                            if (sentenceSize < 0) sentenceSize = 1;
                            byte[] startTime = new byte[11];
                            byte[] sentence = new byte[sentenceSize];
                            Array.Copy(array, backPos + 1, startTime, 0, 11);
                            Array.Copy(array, sentenceStartPos, sentence, 0, sentenceSize);
                            srtTimeRaw = srtTime = Encoding.UTF8.GetString(startTime);
                            srtTime = srtTimeRaw.Replace('.', ',') + "0" + " --> " + span;
                            s.WriteLine(srtTime);
                            srtSentence = Encoding.UTF8.GetString(sentence);
                            s.WriteLine(srtSentence);
                            s.WriteLine();
                        }
                        catch (ArgumentException argex)
                        {
                            MessageBox.Show(argex.ToString());
                        }

                    }
                }
            }
        }
        catch (DirectoryNotFoundException dex)
        {
            MessageBox.Show(dex.ToString());
        }
    }

Maybe not the cleanest code, but it works :)
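For readers adapting this, the cue-writing part of the answer could be separated out roughly as below. This is a sketch under assumptions: `SrtFormat`, `ToSrtTime` and `FormatCue` are hypothetical names, the raw timestamp is assumed to have the 11-character form `hh:mm:ss.ff` (inferred from the 11 bytes copied between the brackets), and the end time is passed in as a raw timestamp because the answer does not show how `span` is computed.

```csharp
using System.Text;

static class SrtFormat
{
    // Converts an assumed raw "hh:mm:ss.ff" timestamp into the SRT form
    // "hh:mm:ss,ff0", mirroring the Replace('.', ',') + "0" step in the answer.
    public static string ToSrtTime(string raw)
    {
        return raw.Replace('.', ',') + "0";
    }

    // Builds one SRT cue: number, time line, text, then a blank line.
    public static string FormatCue(int number, string startRaw, string endRaw, string text)
    {
        var sb = new StringBuilder();
        sb.AppendLine(number.ToString());
        sb.AppendLine(ToSrtTime(startRaw) + " --> " + ToSrtTime(endRaw));
        sb.AppendLine(text);
        sb.AppendLine();
        return sb.ToString();
    }
}
```

Keeping the formatting apart from the byte scanning would make it easier to check each part separately.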