I have a text file containing data like the following:
ACK
DATA1 < >
ACK
DATA1 < >
NAK
ACK
DATA1 < >
DATA0 < 20 >
ACK
DATA1 < 01 01 01 00 >
ACK
ACK
DATA1 < >
DATA1 < 20 >
ACK
DATA1 < >
ACK
ACK
ACK
ACK
ACK
ACK
ACK
ACK
DATA0 < 00 00 00 00 ff ff ff ff 00 00 00 01 ff ff ff fe 00 00 00 02 ff ff ff fd 00 00 00 03 ff ff ff fc
00 00 00 08 ff ff ff f7 00 00 00 09 ff ff ff f6 00 00 00 0a ff ff ff f5 00 00 00 0b ff ff ff f4
00 00 00 10 ff ff ff ef 00 00 00 11 ff ff ff ee 00 00 00 12 ff ff ff ed 00 00 00 13 ff ff ff ec
00 00 00 18 ff ff ff e7 00 00 00 19 ff ff ff e6 00 00 00 1a ff ff ff e5 00 00 00 1b ff ff ff e4
00 00 00 20 ff ff ff df 00 00 00 21 ff ff ff de 00 00 00 22 ff ff ff dd 00 00 00 23 ff ff ff dc
00 00 00 28 ff ff ff d7 00 00 00 29 ff ff ff d6 00 00 00 2a ff ff ff d5 00 00 00 2b ff ff ff d4
00 00 00 30 ff ff ff cf 00 00 00 31 ff ff ff ce 00 00 00 32 ff ff ff cd 00 00 00 33 ff ff ff cc
00 00 00 38 ff ff ff c7 00 00 00 39 ff ff ff c6 00 00 00 3a ff ff ff c5 00 00 00 3b ff ff ff c4
00 00 00 40 ff ff ff bf 00 00 00 41 ff ff ff be 00 00 00 42 ff ff ff bd 00 00 00 43 ff ff ff bc
00 00 00 48 ff ff ff b7 00 00 00 49 ff ff ff b6 00 00 00 4a ff ff ff b5 00 00 00 4b ff ff ff b4
00 00 00 50 ff ff ff af 00 00 00 51 ff ff ff ae 00 00 00 52 ff ff ff ad 00 00 00 53 ff ff ff ac
00 00 00 58 ff ff ff a7 00 00 00 59 ff ff ff a6 00 00 00 5a ff ff ff a5 00 00 00 5b ff ff ff a4
00 00 00 60 ff ff ff 9f 00 00 00 61 ff ff ff 9e 00 00 00 62 ff ff ff 9d 00 00 00 63 ff ff ff 9c
00 00 00 68 ff ff ff 97 00 00 00 69 ff ff ff 96 00 00 00 6a ff ff ff 95 00 00 00 6b ff ff ff 94
00 00 00 70 ff ff ff 8f 00 00 00 71 ff ff ff 8e 00 00 00 72 ff ff ff 8d 00 00 00 73 ff ff ff 8c
00 00 00 78 ff ff ff 87 00 00 00 79 ff ff ff 86 00 00 00 7a ff ff ff 85 00 00 00 7b ff ff ff 84 >
DATA1 < 01 01 01 01 fe fe fe fe 00 00 01 00 ff ff fe ff 00 00 02 00 ff ff fd ff 00 00 03 00 ff ff fc ff
00 00 08 00 ff ff f7 ff 00 00 09 00 ff ff f6 ff 00 00 0a 00 ff ff f5 ff 00 00 0b 00 ff ff f4 ff
00 00 10 00 ff ff ef ff 00 00 11 00 ff ff ee ff 00 00 12 00 ff ff ed ff 00 00 13 00 ff ff ec ff
00 00 18 00 ff ff e7 ff 00 00 19 00 ff ff e6 ff 00 00 1a 00 ff ff e5 ff 00 00 1b 00 ff ff e4 ff
00 00 20 00 ff ff df ff 00 00 21 00 ff ff de ff 00 00 22 00 ff ff dd ff 00 00 23 00 ff ff dc ff
00 00 28 00 ff ff d7 ff 00 00 29 00 ff ff d6 ff 00 00 2a 00 ff ff d5 ff 00 00 2b 00 ff ff d4 ff
00 00 30 00 ff ff cf ff 00 00 31 00 ff ff ce ff 00 00 32 00 ff ff cd ff 00 00 33 00 ff ff cc ff
00 00 38 00 ff ff c7 ff 00 00 39 00 ff ff c6 ff 00 00 3a 00 ff ff c5 ff 00 00 3b 00 ff ff c4 ff
00 00 40 00 ff ff bf ff 00 00 41 00 ff ff be ff 00 00 42 00 ff ff bd ff 00 00 43 00 ff ff bc ff
00 00 48 00 ff ff b7 ff 00 00 49 00 ff ff b6 ff 00 00 4a 00 ff ff b5 ff 00 00 4b 00 ff ff b4 ff
00 00 50 00 ff ff af ff 00 00 51 00 ff ff ae ff 00 00 52 00 ff ff ad ff 00 00 53 00 ff ff ac ff
00 00 58 00 ff ff a7 ff 00 00 59 00 ff ff a6 ff 00 00 5a 00 ff ff a5 ff 00 00 5b 00 ff ff a4 ff
00 00 60 00 ff ff 9f ff 00 00 61 00 ff ff 9e ff 00 00 62 00 ff ff 9d ff 00 00 63 00 ff ff 9c ff
00 00 68 00 ff ff 97 ff 00 00 69 00 ff ff 96 ff 00 00 6a 00 ff ff 95 ff 00 00 6b 00 ff ff 94 ff
00 00 70 00 ff ff 8f ff 00 00 71 00 ff ff 8e ff 00 00 72 00 ff ff 8d ff 00 00 73 00 ff ff 8c ff
00 00 78 00 ff ff 87 ff 00 00 79 00 ff ff 86 ff 00 00 7a 00 ff ff 85 ff 00 00 7b 00 ff ff 84 ff >
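This data is part of a USB traffic log that will be used as a golden standard against which data generated on the fly by a C program is compared. Unfortunately, the golden standard changes, and I would like the flexibility to generate new structures from the traffic log.
In other words, I want to use Python to generate the structures I will use in my C program. I need to convert this data into a structure of structures, each holding the packet name converted to its equivalent hex value (ACK = 0xD2, DATA1 = 0x4B, etc.) and the data (< 01 01 01 >).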
The part I am struggling with most is when the data spans multiple lines, for example:
DATA0 < 00 00 00 00...ff ff ff fc
00 00 00 00...ff ff ff f4
....
00 00 00 00...ff ff ff 84 >
I have not found a way to concatenate these lines and put them on a single line of their own, like this:
DATA0 < 00 00 00 00...ff ff ff 84 >
Once the data is on one line, I know I can use the split() method to extract the parts of interest.
Answer 0 (score: 1)
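There may be a smoother way, but if your data is in 'data.txt', this should do it.
with open('data.txt', 'rt') as fobj:
    lines = []
    in_data_line = False
    for line in fobj:
        # Drop the newline and any leading indentation on continuation lines.
        line = line.strip()
        lines.append(line)
        # A DATA line without a closing '>' opens a multi-line block.
        if not in_data_line and line.startswith('DATA') and not line.endswith('>'):
            in_data_line = True
        # The line carrying the closing '>' ends the block.
        if in_data_line and line.endswith('>'):
            in_data_line = False
        # Inside a block, separate the pieces with a space so bytes do not
        # run together; outside a block, end the output line.
        lines.append(' ' if in_data_line else '\n')

# lines now has DATA lines joined
print(''.join(lines))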
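Once every packet sits on one line, the split() approach mentioned in the question can pull out the name and the bytes. The following is only a sketch under that assumption; the function name parse_line is made up for illustration, and blank lines are expected to be skipped by the caller.

```python
def parse_line(line):
    """Split 'DATA0 < 00 ff >' into ('DATA0', [0, 255]); 'ACK' into ('ACK', [])."""
    words = line.split()
    name = words[0]
    # Everything between '<' and '>' is a hex byte; handshake lines have neither.
    if '<' in words and '>' in words:
        start = words.index('<') + 1
        stop = words.index('>')
        payload = [int(w, 16) for w in words[start:stop]]
    else:
        payload = []
    return name, payload


print(parse_line('DATA1 < 01 01 01 00 >'))
```

Converting the bytes with int(w, 16) at this stage keeps the later formatting step free to print them in whatever form the C code needs.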
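Answer 1 (score: 0)
If I were you, this is what I would do. When you reach one of those multi-line data blocks, replace the double tab at the start of each continuation line with a space. Then concatenate (join) all of the lines.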
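One possible way to sketch this idea uses a regular expression instead of a literal double-tab replacement: every run of whitespace inside a '<' ... '>' block (newlines, tabs, spaces) is collapsed to a single space. This is a variation, not necessarily what the answer had in mind; the function name join_continuations and the sample string are hypothetical.

```python
import re


def join_continuations(text):
    """Collapse every '<' ... '>' block onto one line."""
    def collapse(match):
        # Any run of whitespace inside the block becomes one space.
        return re.sub(r'\s+', ' ', match.group(0))
    # [^>]* also matches newlines, so one match spans a whole multi-line block.
    return re.sub(r'<[^>]*>', collapse, text)


sample = "ACK\nDATA0 < 00 01\n\t\t02 03\n\t\t04 05 >\nNAK\n"
print(join_continuations(sample))
```

Because the pattern only touches text between angle brackets, lines such as ACK and NAK keep their own line breaks.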
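Answer 2 (score: 0)
Just a skeleton. It does not join lines; instead it splits the whole text into words and then rebuilds the list of data bytes found between the angle brackets. I hope the resulting data is easy to process.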
def lex(file):
    in_data = False
    with open(file) as infile:
        for line in infile:
            for word in line.split():
                if not in_data:
                    if word == '<':
                        data_list = []
                        in_data = True
                    else:
                        # process ACK, NAK, DATA, ....
                        yield word
                else:
                    if word == '>':
                        in_data = False
                        yield data_list
                    else:
                        data_list.append(int(word, 16))

print(list(lex('data.txt')))
Output (shortened):
['ACK','DATA1',[],'ACK','DATA1',[],'NAK','ACK','DATA1',[], 'DATA0',[32],'ACK','DATA1',[1,1,1,0],'ACK','ACK','DATA1', [],'DATA1',[32],'ACK','DATA1',[],'ACK','ACK','ACK','ACK', 'ACK','ACK','ACK','ACK','DATA0',[0,0,0,0,255,255,255,255, 0,0,0,1,255,255,255,254,0,0,0,2,255,255,255,253,0,0, 0,3,255,255,255,252,0,0,0,8,255,255,255,247,0,0,0,9, 255,255,255,246,0,0,0,10,255,255,255,245,0,0,0,11,255, 255,255,244,0,0,0,16,255,...... 255]]
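From a token stream like this, the C structures the question asks for could be generated roughly as follows. This is a sketch under several assumptions: the ACK and DATA1 values come from the question, while the DATA0 and NAK values are the standard USB 2.0 PID bytes; the helper names to_packets and to_c, the array name golden, and the C layout (a struct packet with a PID byte, a length, and a byte array) are all hypothetical.

```python
# ACK and DATA1 are given in the question; DATA0 and NAK are the standard
# USB 2.0 PID bytes (assumption: the log contains only these four names).
PIDS = {'ACK': 0xD2, 'NAK': 0x5A, 'DATA0': 0xC3, 'DATA1': 0x4B}


def to_packets(tokens):
    """Pair each packet name with the payload list that follows it, if any."""
    packets = []
    for tok in tokens:
        if isinstance(tok, list):
            # A list is the payload of the name seen just before it.
            packets[-1] = (packets[-1][0], tok)
        else:
            packets.append((tok, []))
    return packets


def to_c(packets, name='golden'):
    """Render packets as a C array initializer (the struct layout is hypothetical)."""
    rows = []
    for pid, data in packets:
        # Emit '0' for an empty payload so the initializer stays valid C.
        body = ', '.join('0x%02x' % b for b in data) or '0'
        rows.append('    { 0x%02X, %d, { %s } },' % (PIDS[pid], len(data), body))
    return 'static const struct packet %s[] = {\n%s\n};' % (name, '\n'.join(rows))


tokens = ['ACK', 'DATA1', [], 'DATA0', [32], 'NAK']
print(to_c(to_packets(tokens)))
```

Passing list(lex('data.txt')) in place of the sample tokens would render the whole log; an unknown packet name raises KeyError, which makes missing entries in PIDS easy to spot.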