I have this code:
import json
import lzma

def createDiff(newFile, oldFile):
    # Maps byte offset -> ["+", byte] (changed or added in the new file)
    # or ["-", byte] (present only in the old file).
    diff = {}
    with open(oldFile, "rb") as oldHandle, open(newFile, "rb") as newHandle:
        oldBytes = oldHandle.read()
        newBytes = newHandle.read()
    common = min(len(oldBytes), len(newBytes))
    # Compare the overlapping region byte by byte.
    for i in range(common):
        if oldBytes[i] != newBytes[i]:
            diff[i] = ["+", newBytes[i]]
    # Bytes that exist only at the end of the old file were removed.
    for i in range(common, len(oldBytes)):
        diff[i] = ["-", oldBytes[i]]
    # Bytes that exist only at the end of the new file were added.
    for i in range(common, len(newBytes)):
        diff[i] = ["+", newBytes[i]]
    return lzma.compress(json.dumps(diff).encode())
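This function takes two inputs (both are paths to binary files) and tries to create a diff between them. It works fine for small executables, but larger binaries take forever. Is there a faster way to iterate over the byte arrays that I haven't found yet? Or is my algorithm flawed? Should I perhaps split the byte arrays into smaller chunks and analyze each chunk? Thanks in advance for your help!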
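To make the chunk idea concrete, here is a rough sketch of what I have in mind. The name `create_diff_chunked` and the `CHUNK_SIZE` value are placeholders I made up for illustration, and I have not measured whether this is actually faster. The assumption is that comparing two whole chunks with `!=` is cheap, so the per-byte loop only runs on chunks that really differ.

```python
import json
import lzma

CHUNK_SIZE = 64 * 1024  # hypothetical block size, not tuned


def create_diff_chunked(new_path, old_path, chunk_size=CHUNK_SIZE):
    """Build the same offset -> [op, byte] diff, skipping identical chunks."""
    diff = {}
    with open(old_path, "rb") as old_handle, open(new_path, "rb") as new_handle:
        offset = 0
        while True:
            old_chunk = old_handle.read(chunk_size)
            new_chunk = new_handle.read(chunk_size)
            if not old_chunk and not new_chunk:
                break  # both files are exhausted
            if old_chunk != new_chunk:
                # Only differing chunks pay for the byte-by-byte scan.
                common = min(len(old_chunk), len(new_chunk))
                for j in range(common):
                    if old_chunk[j] != new_chunk[j]:
                        diff[offset + j] = ["+", new_chunk[j]]
                # Tail present only in the old file: removed bytes.
                for j in range(common, len(old_chunk)):
                    diff[offset + j] = ["-", old_chunk[j]]
                # Tail present only in the new file: added bytes.
                for j in range(common, len(new_chunk)):
                    diff[offset + j] = ["+", new_chunk[j]]
            offset += chunk_size
    return lzma.compress(json.dumps(diff).encode())
```

This also avoids loading both files fully into memory, but it would still be slow when most chunks differ, for example when bytes are inserted near the start of the file and everything after them shifts.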