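Using Perl to extract a "plain text" header from a HEX file

Date: 2011-05-10 13:24:29

Tags: perl hex text-manipulation

I have a file that appears to contain a plain-text header, which I would like to extract and convert to plain text.

Using HEXedit, this is what I see partway through the file: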

3a40 - 31 65 33 38 00 00 00 00 00 00 00 00 00 00 00 00 - 1e38............
3a50 - 00 00 00 00 00 00 00 00 00 00 0a 00 74 00 65 00 - ............t.e.
3a60 - 78 00 74 00 2f 00 61 00 73 00 63 00 69 00 69 00 - x.t./.a.s.c.i.i.
3a70 - 00 00 18 00 61 00 66 00 66 00 79 00 6d 00 65 00 - ....a.f.f.y.m.e.
3a80 - 74 00 72 00 69 00 78 00 2d 00 61 00 72 00 72 00 - t.r.i.x.-.a.r.r.
3a90 - 61 00 79 00 2d 00 62 00 61 00 72 00 63 00 6f 00 - a.y.-.b.a.r.c.o.
3aa0 - 64 00 65 00 00 00 64 00 40 00 35 00 32 00 30 00 - d.e...d.@.5.2.0.
3ab0 - 38 00 32 00 36 00 30 00 30 00 39 00 31 00 30 00 - 8.2.6.0.0.9.1.0.
3ac0 - 37 00 30 00 36 00 31 00 31 00 31 00 38 00 31 00 - 7.0.6.1.1.1.8.1.
3ad0 - 31 00 34 00 31 00 32 00 31 00 33 00 34 00 35 00 - 1.4.1.2.1.3.4.5.
3ae0 - 35 00 30 00 39 00 38 00 39 00 00 00 00 00 00 00 - 5.0.9.8.9.......
3af0 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
3b00 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0a 00 - ................

This is the output I want:

text/ascii  affymetrix-array-barcode d@52082600910706111811412134550989

3 answers:

Answer 0: (score: 1)

Try the iconv command. Something like this should work:

tail -c +6 input.txt | iconv -f UTF16 -t ASCII >output.txt

Then split the result on the NUL bytes.
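The same decode-then-split idea can be sketched in Perl with the core Encode module. This is only a sketch: it assumes the text is UTF-16LE aligned on the `74 00` pair shown in the dump, and the `utf16le_fields` helper and the in-memory `$sample` are hypothetical stand-ins for reading the real file.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Decode a UTF-16LE byte string and return its printable ASCII fields.
sub utf16le_fields {
    my ($bytes) = @_;
    my $text = decode('UTF-16LE', $bytes);
    # Split on runs of NUL characters, then drop control characters
    # such as the 0x18 that precedes the second string in the dump.
    my @fields = split /\0+/, $text;
    s/[^\x20-\x7e]//g for @fields;
    return grep { length } @fields;
}

# Hypothetical sample rebuilt from the posted dump, starting at the "t".
my $sample = encode('UTF-16LE',
    "text/ascii\0\x18affymetrix-array-barcode\0"
  . "d\@52082600910706111811412134550989\0\0\0");

print join(' ', utf16le_fields($sample)), "\n";
```

Filtering out everything outside the printable ASCII range is a design choice for this sample; it would also discard any genuine non-ASCII text.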

Answer 1: (score: 1)

I can't be sure, but if all of your files look very much like the one you just posted, this does the job:

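use strict;
use warnings;

# Read the file as raw bytes.
open my $fh, '<:raw', 'file.dat' or die "Cannot open file.dat: $!";
# Skip the first 28 bytes (the offset of the first string if the file starts where the posted dump does).
seek $fh, 28, 0 or die "Cannot seek: $!";
my $buf = do { local $/; <$fh> };
close $fh;
# Fields are separated by pairs of NUL bytes; keep the first three.
my @s = split /\0\0/, $buf, 4;
# Strip the interleaved NULs and other non-printable bytes from each field.
tr/\x20-\x7e//cd for @s[0 .. 2];
print "$s[0] $s[1] $s[2]\n";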
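The fixed offset of 28 only fits a file that begins exactly where the posted dump begins (0x3a5c - 0x3a40 = 28). If the header sits elsewhere, one option is to search for a known marker instead. The sketch below assumes every header starts with the string "text/ascii" stored as ASCII characters each followed by a NUL byte. The `header_fields` helper and the in-memory `$buf` are hypothetical and only stand in for the real file contents.

```perl
use strict;
use warnings;

# Return the three header fields found at the first occurrence of $marker
# (given as plain ASCII) in the raw bytes $buf, or an empty list if absent.
sub header_fields {
    my ($buf, $marker) = @_;
    # Widen the ASCII marker: each character followed by a NUL byte.
    my $needle = join '', map { "$_\0" } split //, $marker;
    my $pos = index $buf, $needle;
    return if $pos < 0;
    my @s = split /\0\0/, substr($buf, $pos), 4;
    return if @s < 3;
    # Remove NUL padding and control bytes from the fields we keep.
    tr/\x20-\x7e//cd for @s[0 .. 2];
    return @s[0 .. 2];
}

# Hypothetical stand-in for the file contents, rebuilt from the posted dump.
my $buf = "1e38" . ("\0" x 22) . "\x0a\0"
    . join('', map { "$_\0" } split //,
        "text/ascii\0\x18affymetrix-array-barcode\0"
      . "d\@52082600910706111811412134550989")
    . ("\0" x 8);

print join(' ', header_fields($buf, 'text/ascii')), "\n";
```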

Answer 2: (score: 0)

A Perl solution may be interesting, but wouldn't the Unix strings command give you the plain-text parts of the file?
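One caveat: by default strings looks for runs of single-byte characters, so text stored with a zero byte after every character, as in the dump, may be missed unless a 16-bit encoding is selected (GNU strings has `-e l` for 16-bit little-endian). A rough Perl equivalent is sketched below. The `wide_strings` helper, its minimum run length of 4, and the sample bytes are assumptions for illustration only.

```perl
use strict;
use warnings;

# Return every run of at least $min printable ASCII characters stored as
# 16-bit little-endian units (character byte followed by a NUL byte).
sub wide_strings {
    my ($buf, $min) = @_;
    $min //= 4;
    my @found;
    while ($buf =~ /((?:[\x20-\x7e]\0){$min,})/g) {
        # Drop the NUL half of each 16-bit unit.
        (my $run = $1) =~ tr/\0//d;
        push @found, $run;
    }
    return @found;
}

# Hypothetical stand-in for the file contents, rebuilt from the posted dump.
my $data = "1e38" . ("\0" x 22) . "\x0a\0"
    . join('', map { "$_\0" } split //,
        "text/ascii\0\x18affymetrix-array-barcode\0"
      . "d\@52082600910706111811412134550989")
    . ("\0" x 8);

print join(' ', wide_strings($data)), "\n";
```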