替换二进制文件中的字节序列

时间:2011-06-29 18:48:11

标签: c# replace byte binaryfiles

将二进制文件中的字节序列替换为相同长度的其他字节的最佳方法是什么?二进制文件非常大,大约50 MB,不应该一次加载到内存中。

更新:我不知道需要替换的字节位置,我需要先找到它们。

4 个答案:

答案 0 :(得分:17)

假设您正在尝试替换文件的已知部分

  • 打开具有读/写访问权限的FileStream
  • 寻找合适的地方
  • 覆盖现有数据

示例代码......

public static void ReplaceData(string filename, int position, byte[] data)
{
    using (Stream stream = File.Open(filename, FileMode.Open))
    {
        stream.Position = position;
        stream.Write(data, 0, data.Length);
    }
}

如果您正在有效地尝试执行string.Replace的二进制版本(例如“总是用{20,35,15}替换字节{51,20,34},那么它会更加困难。你要做什么的描述:

  • 分配至少您感兴趣的数据大小的缓冲区
  • 反复读入缓冲区,扫描数据
  • 如果找到匹配项,请回到正确的位置(例如stream.Position -= buffer.Length - indexWithinBuffer;并覆盖数据

到目前为止听起来很简单......但是如果数据在缓冲区末尾附近启动,那么棘手的一点是。你需要记住所有潜在的匹配以及你到目前为止的匹配程度,这样如果你在阅读 next 缓冲区时获得匹配,你就可以检测到它

可能有办法避免这种诡计,但我不想尝试随便提出这些问题:)

编辑:好的,我有一个可能有帮助的想法......

  • 保留至少两倍于您需要的缓冲区
  • 反复:
    • 第二个缓冲区的一半复制到上半部分
    • 从文件
    • 填充缓冲区的后半部分
    • 在整个整个缓冲区中搜索您要查找的数据

在某种程度上,如果数据存在,它将完全在缓冲区内。

你需要注意流的位置才能回到正确的位置,但我认为这应该有效。如果你试图找到所有匹配会比较麻烦,但至少第一场比赛应该相当简单......

答案 1 :(得分:6)

我的解决方案:

    /// <summary>
    /// Copy data from a file to an other, replacing search term, ignoring case.
    /// </summary>
    /// <param name="originalFile"></param>
    /// <param name="outputFile"></param>
    /// <param name="searchTerm"></param>
    /// <param name="replaceTerm"></param>
    private static void ReplaceTextInBinaryFile(string originalFile, string outputFile, string searchTerm, string replaceTerm)
    {
        byte b;
        //UpperCase bytes to search
        byte[] searchBytes = Encoding.UTF8.GetBytes(searchTerm.ToUpper());
        //LowerCase bytes to search
        byte[] searchBytesLower = Encoding.UTF8.GetBytes(searchTerm.ToLower());
        //Temporary bytes during found loop
        byte[] bytesToAdd = new byte[searchBytes.Length];
        //Search length
        int searchBytesLength = searchBytes.Length;
        //First Upper char
        byte searchByte0 = searchBytes[0];
        //First Lower char
        byte searchByte0Lower = searchBytesLower[0];
        //Replace with bytes
        byte[] replaceBytes = Encoding.UTF8.GetBytes(replaceTerm);
        int counter = 0;
        using (FileStream inputStream = File.OpenRead(originalFile)) {
            //input length
            long srcLength = inputStream.Length;
            using (BinaryReader inputReader = new BinaryReader(inputStream)) {
                using (FileStream outputStream = File.OpenWrite(outputFile)) {
                    using (BinaryWriter outputWriter = new BinaryWriter(outputStream)) {
                        for (int nSrc = 0; nSrc < srcLength; ++nSrc)
                            //first byte
                            if ((b = inputReader.ReadByte()) == searchByte0
                                || b == searchByte0Lower) {
                                bytesToAdd[0] = b;
                                int nSearch = 1;
                                //next bytes
                                for (; nSearch < searchBytesLength; ++nSearch)
                                    //get byte, save it and test
                                    if ((b = bytesToAdd[nSearch] = inputReader.ReadByte()) != searchBytes[nSearch]
                                        && b != searchBytesLower[nSearch]) {
                                        break;//fail
                                    }
                                    //Avoid overflow. No need, in my case, because no chance to see searchTerm at the end.
                                    //else if (nSrc + nSearch >= srcLength)
                                    //    break;

                                if (nSearch == searchBytesLength) {
                                    //success
                                    ++counter;
                                    outputWriter.Write(replaceBytes);
                                    nSrc += nSearch - 1;
                                }
                                else {
                                    //failed, add saved bytes
                                    outputWriter.Write(bytesToAdd, 0, nSearch + 1);
                                    nSrc += nSearch;
                                }
                            }
                            else
                                outputWriter.Write(b);
                    }
                }
            }
        }
        Console.WriteLine("ReplaceTextInBinaryFile.counter = " + counter);
    }

答案 2 :(得分:4)

您可以使用我的 BinaryUtility 搜索并替换一个或多个字节,而无需将整个文件加载到内存中,如下所示:

var searchAndReplace = new List<Tuple<byte[], byte[]>>() 
{
    Tuple.Create(
        BitConverter.GetBytes((UInt32)0xDEADBEEF),
        BitConverter.GetBytes((UInt32)0x01234567)),
    Tuple.Create(
        BitConverter.GetBytes((UInt32)0xAABBCCDD),
        BitConverter.GetBytes((UInt16)0xAFFE)),
};
using(var reader =
    new BinaryReader(new FileStream(@"C:\temp\data.bin", FileMode.Open)))
{
    using(var writer =
        new BinaryWriter(new FileStream(@"C:\temp\result.bin", FileMode.Create)))
    {
        BinaryUtility.Replace(reader, writer, searchAndReplace);
    }
}

BinaryUtility 代码:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class BinaryUtility
{
    public static IEnumerable<byte> GetByteStream(BinaryReader reader)
    {
        const int bufferSize = 1024;
        byte[] buffer;
        do
        {
            buffer = reader.ReadBytes(bufferSize);
            foreach (var d in buffer) { yield return d; }
        } while (bufferSize == buffer.Length);
    }

    public static void Replace(BinaryReader reader, BinaryWriter writer, IEnumerable<Tuple<byte[], byte[]>> searchAndReplace)
    {
        foreach (byte d in Replace(GetByteStream(reader), searchAndReplace)) { writer.Write(d); }
    }

    public static IEnumerable<byte> Replace(IEnumerable<byte> source, IEnumerable<Tuple<byte[], byte[]>> searchAndReplace)
    {
        foreach (var s in searchAndReplace)
        {
            source = Replace(source, s.Item1, s.Item2);
        }
        return source;
    }

    public static IEnumerable<byte> Replace(IEnumerable<byte> input, IEnumerable<byte> from, IEnumerable<byte> to)
    {
        var fromEnumerator = from.GetEnumerator();
        fromEnumerator.MoveNext();
        int match = 0;
        foreach (var data in input)
        {
            if (data == fromEnumerator.Current)
            {
                match++;
                if (fromEnumerator.MoveNext()) { continue; }
                foreach (byte d in to) { yield return d; }
                match = 0;
                fromEnumerator.Reset();
                fromEnumerator.MoveNext();
                continue;
            }
            if (0 != match)
            {
                foreach (byte d in from.Take(match)) { yield return d; }
                match = 0;
                fromEnumerator.Reset();
                fromEnumerator.MoveNext();
            }
            yield return data;
        }
        if (0 != match)
        {
            foreach (byte d in from.Take(match)) { yield return d; }
        }
    }
}

答案 3 :(得分:1)

    public static void BinaryReplace(string sourceFile, byte[] sourceSeq, string targetFile, byte[] targetSeq)
    {
        FileStream sourceStream = File.OpenRead(sourceFile);
        FileStream targetStream = File.Create(targetFile);

        try
        {
            int b;
            long foundSeqOffset = -1;
            int searchByteCursor = 0;

            while ((b=sourceStream.ReadByte()) != -1)
            {
                if (sourceSeq[searchByteCursor] == b)
                {
                    if (searchByteCursor == sourceSeq.Length - 1)
                    {
                        targetStream.Write(targetSeq, 0, targetSeq.Length);
                        searchByteCursor = 0;
                        foundSeqOffset = -1;
                    }
                    else 
                    {
                        if (searchByteCursor == 0)
                        {
                            foundSeqOffset = sourceStream.Position - 1;
                        }

                        ++searchByteCursor;
                    }
                }
                else
                {
                    if (searchByteCursor == 0)
                    {
                        targetStream.WriteByte((byte) b);
                    }
                    else
                    {
                        targetStream.WriteByte(sourceSeq[0]);
                        sourceStream.Position = foundSeqOffset + 1;
                        searchByteCursor = 0;
                        foundSeqOffset = -1;
                    }
                }
            }
        }
        finally
        {
            sourceStream.Dispose();
            targetStream.Dispose();
        }
    }