Python re.findall: parsing the data between signatures in a hexdump file

Time: 2016-12-13 10:22:43

Tags: python

I am trying to parse data from the hexdump file 'data.dat', which contains:

0ff0a33a3aa3f00f00000000000000280ff0a33a3aa3f00f00000000010000283132333405000600070000000ff0a33a3aa3f00f000000000200002801000000020000000ff0a33a3aa3f00f00000000030000283100020033000000040000000ff0a33a3aa3f00f0000000004000028010000003200020001000000320000000ff0a33a3aa3f00f00000000000000300ff0a33a3aa3f00f00000000010000303132333405000600070000000ff0a33a3aa3f00f000000000200003001000000020000000ff0a33a3aa3f00f00000000030000303100020033000000040000000ff0a33a3aa3f00f000000000400003001000000320002000100000032000000

Here '0ff0a33a3aa3f00f' is the signature, and I need to extract the data between each pair of signatures. The result should be:

0000000000000028
0000000001000028313233340500060007000000
00000000020000280100000002000000
0000000003000028310002003300000004000000
...... and so on

import binascii
import re
fo = open ('data.dat','rb+')
content = binascii.hexlify(fo.read())
match_object = re.findall(r'0ff0a33a3aa3f00f(\w*?)0ff0a33a3aa3f00f', content,re.M|re.I)
print match_object

But this loses the data after every alternate signature.

How can I keep the trailing signature from being included in the match?

2 Answers:

Answer 0: (score: 3)

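You can simply split the content:

content.split("0ff0a33a3aa3f00f")

The result is a list of the chunks between signatures. Because the content begins with a signature, the first element is an empty string.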
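A minimal Python 3 sketch of the split approach, assuming `content` has already been decoded to `str` (in Python 3, `binascii.hexlify` returns `bytes`). The sample string is just the first three packets from the question, and the names `SIGNATURE` and `packets` are chosen here for illustration. The empty element produced by the leading signature is filtered out.

```python
SIGNATURE = "0ff0a33a3aa3f00f"

# First three packets of the question's dump, used as a small sample.
content = (
    "0ff0a33a3aa3f00f" "0000000000000028"
    "0ff0a33a3aa3f00f" "0000000001000028313233340500060007000000"
    "0ff0a33a3aa3f00f" "00000000020000280100000002000000"
)

# split() yields '' before the leading signature; drop empty chunks.
packets = [chunk for chunk in content.split(SIGNATURE) if chunk]
print(packets)
```

Unlike the regex approaches, splitting also returns the final chunk, since it does not require a closing signature.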

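Answer 1: (score: 0)

You need to use a lookahead. Otherwise the trailing signature is consumed by the match, so each match swallows two signatures and the packet opened by the second one is skipped: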

match_object = re.findall(r'0ff0a33a3aa3f00f(\w*?)(?=0ff0a33a3aa3f00f)', content,re.M|re.I)
print(match_object)

Result:

['0000000000000028', '0000000001000028313233340500060007000000',
 '00000000020000280100000002000000',
 '0000000003000028310002003300000004000000',
 '000000000400002801000000320002000100000032000000',
 '0000000000000030', '0000000001000030313233340500060007000000',
 '00000000020000300100000002000000',
 '0000000003000030310002003300000004000000']
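To see why the consuming pattern drops every other packet, it can help to compare the two patterns on a shortened sample. This is only a sketch: the names `SIG` and `sample` and the made-up payloads `aa`, `bb`, `cc`, `dd` are assumptions for illustration, not data from the question.

```python
import re

SIG = "0ff0a33a3aa3f00f"
sample = SIG + "aa" + SIG + "bb" + SIG + "cc" + SIG + "dd"

# Consuming pattern: the closing signature is part of the match, so the next
# search starts after it and the packet it opened is skipped.
consuming = re.findall(SIG + r"(\w*?)" + SIG, sample)

# Lookahead pattern: the closing signature is only checked, not consumed,
# so it is still available to open the next match.
lookahead = re.findall(SIG + r"(\w*?)(?=" + SIG + ")", sample)

print(consuming)  # ['aa', 'cc']
print(lookahead)  # ['aa', 'bb', 'cc']
```

In both cases the final payload `dd` is missing, because nothing follows it that could satisfy the closing signature.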

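Note that the last chunk of data, which is not followed by a signature, is not extracted. If you need it as well:

re.findall(r'0ff0a33a3aa3f00f(\w*?)(?=0ff0a33a3aa3f00f|$)', content,re.M|re.I)

(trailing signature or end of text)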
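Putting the pieces together for Python 3, one possible sketch is shown below. The helper name `extract_packets` is hypothetical, and the pattern uses `[0-9a-f]` and `\Z` instead of `\w` and `$` purely as a stylistic choice; it assumes the whole file fits in memory.

```python
import binascii
import re

SIGNATURE = "0ff0a33a3aa3f00f"

def extract_packets(path):
    """Return the hex payload that follows each signature in the file at `path`."""
    with open(path, "rb") as f:
        # hexlify() returns bytes in Python 3; decode so a str pattern can be used.
        content = binascii.hexlify(f.read()).decode("ascii")
    # The lookahead accepts either the next signature or the end of the text,
    # so the final packet is captured as well.
    pattern = SIGNATURE + r"([0-9a-f]*?)(?=" + SIGNATURE + r"|\Z)"
    return re.findall(pattern, content)
```

Calling `extract_packets('data.dat')` on the question's file should then return all ten payloads, including the last one.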