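I am building a tool that extracts data from binary database files. I can handle fixed-width, uncompressed data, but the next file I am trying to process uses compression.
According to TrID (http://mark0.net/soft-trid-e.html), the file is a 100% match for MP3, for what that's worth. These are the first bytes of the file: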
FF FF AD 00 C0 7E AA 00 21 C0 AD 00 AE 02 00 00 00 00 00 00 0C 86 00
00 52 81 00 00 00 00 00 00 CC 01 00 00 42 54 47 42 91 00 00 00 57 01 00 40
32 00 30 00 34 12 EE 1F 00 00 01 00 00 02 01 04 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 54 1C 73 5A 00 00 49 02 FF 00
02 00 AF 89 00 00 00 00 00 00 6B 04 00 00 56 1C 73 5A B3 06 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00
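For reference, a dump like the one above can be produced with a small helper along these lines. This is only a sketch: `HeaderDump`, `ReadHeader`, and `ToHex` are names made up for illustration and are not part of my tool.

```csharp
using System;
using System.IO;

public static class HeaderDump
{
    // Reads up to `count` bytes from the start of `path` (fewer if the file is shorter).
    public static byte[] ReadHeader(string path, int count)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            byte[] buffer = new byte[count];
            int total = 0, read;
            // Stream.Read may return fewer bytes than requested, so loop until full or EOF.
            while (total < count && (read = fs.Read(buffer, total, count - total)) > 0)
                total += read;
            Array.Resize(ref buffer, total);
            return buffer;
        }
    }

    // Formats bytes as space-separated uppercase hex pairs, e.g. "FF FF AD 00".
    public static string ToHex(byte[] data)
    {
        return BitConverter.ToString(data).Replace("-", " ");
    }
}
```

Calling `HeaderDump.ToHex(HeaderDump.ReadHeader("DB.dat", 160))` would give a single line of hex pairs for the first 160 bytes.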
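My application normally grabs a block of binary data from anywhere in the file and saves it to a small file that is quick to manipulate until I have all the data aligned.
To do this with a compressed file, do I need to decompress the whole file first and then save only part of it, or can I tell it to grab the first 30k of the file, decompress what it finds, and write that out to a file? I know there will be a lot of extra Unicode/control characters, but as long as the string values are printable, I can figure out the rest.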
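To make the second option concrete, this is roughly what I have in mind. It is only a sketch under a big assumption: it treats the bytes as a raw Deflate stream starting at offset 0, which I have not confirmed for this file (the real format may use a different algorithm or begin after a header). `PartialInflate` and `InflatePrefix` are hypothetical names.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class PartialInflate
{
    // Decompresses as much of `compressed` as possible, treating it as raw Deflate data.
    // ASSUMPTION: the input is Deflate; nothing about the real file format confirms this.
    public static byte[] InflatePrefix(byte[] compressed)
    {
        using (MemoryStream output = new MemoryStream())
        {
            try
            {
                using (MemoryStream input = new MemoryStream(compressed))
                using (DeflateStream inflater = new DeflateStream(input, CompressionMode.Decompress))
                {
                    byte[] buffer = new byte[4096];
                    int read;
                    // Copy decoded bytes until the decoder runs out of input or fails.
                    while ((read = inflater.Read(buffer, 0, buffer.Length)) > 0)
                        output.Write(buffer, 0, read);
                }
            }
            catch (InvalidDataException)
            {
                // Truncated or non-Deflate input: keep whatever was decoded before the error.
            }
            return output.ToArray();
        }
    }
}
```

With the 30,000-byte `block` from my code, the call would be `byte[] decoded = PartialInflate.InflatePrefix(block);`. If the format is something other than Deflate, I would expect this to return an empty or very short array.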
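using System.IO;

// Current code (members of MainWindow): copies the first `limit` bytes of DB.dat to tempfile.dat.
public static FileStream stream = new FileStream(@"DB.dat", FileMode.Open, FileAccess.Read);
public static FileStream shortFile = null;
int limit = 30000;

public MainWindow()
{
    byte[] block = new byte[limit];
    using (FileStream fs = File.Create("tempfile.dat"))
    {
        stream.Position = 0;
        // Read may return fewer bytes than requested, so loop until the block is full or EOF.
        int total = 0, read;
        while (total < limit && (read = stream.Read(block, total, limit - total)) > 0)
            total += read;
        // Write only the bytes that were actually read.
        fs.Write(block, 0, total);
    }
    InitializeComponent();
}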
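Once `tempfile.dat` exists, this is the kind of scan I would run over it to pull out the printable strings. It is a minimal sketch that assumes the strings are single-byte ASCII (UTF-16 text would need a different scan); `PrintableStrings` and `Extract` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class PrintableStrings
{
    // Returns every run of at least `minLength` printable ASCII bytes (0x20-0x7E) in `data`.
    public static List<string> Extract(byte[] data, int minLength)
    {
        List<string> results = new List<string>();
        StringBuilder current = new StringBuilder();
        foreach (byte b in data)
        {
            if (b >= 0x20 && b <= 0x7E)
            {
                current.Append((char)b);
            }
            else
            {
                // A non-printable byte ends the current run.
                if (current.Length >= minLength)
                    results.Add(current.ToString());
                current.Clear();
            }
        }
        // Flush a run that extends to the end of the data.
        if (current.Length >= minLength)
            results.Add(current.ToString());
        return results;
    }
}
```

On the header bytes listed above, a minimum length of 4 picks up the run `42 54 47 42`, which reads as `BTGB`.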
Edit: adding the TrID identification info for a sample file, http://64.72.211.216/ZIPCODE.dat:
TrID/32 - File Identifier v2.24 - (C) 2003-16 By M.Pontello
Definitions found: 10674
Analyzing...
Collecting data from file: ZIPCODE.dat
100.0% (.MP3) MP3 audio (1000/1)
Mime type : audio/mpeg3
Definition : audio-mp3.trid.xml
Files : 34
Author : Marco Pontello
E-Mail : marcopon@gmail.com
Home Page : http://mark0.net