I want my parser to return a list of strings, but it returns an empty list

Asked: 2013-08-15 15:10:00

Tags: python list parsing hex

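I have a parser that reads a long octet string, and I want it to print out smaller strings based on the parsing details. It reads in a hex string such as the following: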

01046574683001000004677265300000000266010000

The interface format contained in the hex string is as follows:

version:length_of_name:name:op_status:priority:reserved_byte

==

01:04:65746830:01:00:00

== (when the name is converted from hex)

01:04:eth0:01:00:00 

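^ This is one segment of the string, representing eth0 (I inserted the colons to make it easier to read). At the moment, however, my code returns an empty list, and I don't know why. Could someone please help?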

def octetChop(long_hexstring, from_ssh_):
    startpoint_of_interface_def=0
    # As of 14/8/13 , the network operator has not been implemented
    network_operator_implemented=False
    version_has_been_read = False
    position_of_interface=0
    chopped_octet_list = []

#This while loop moves through the string of the interface, based on the full length of the container
    try:
        while startpoint_of_interface_def < len(long_hexstring):

            if version_has_been_read == True:
                pass
            else:
                if startpoint_of_interface_def == 0:
                    startpoint_of_interface_def = startpoint_of_interface_def + 2
                    version_has_been_read = True

            endpoint_of_interface_def = startpoint_of_interface_def+2
            length_of_interface_name = long_hexstring[startpoint_of_interface_def:endpoint_of_interface_def]
            length_of_interface_name_in_bytes = int(length_of_interface_name) * 2 #multiply by 2 because its calculating bytes

            end_of_interface_name_point = endpoint_of_interface_def + length_of_interface_name_in_bytes
            hex_name = long_hexstring[endpoint_of_interface_def:end_of_interface_name_point]
            text_name = hex_name.decode("hex")

            print "the text_name is " + text_name

            operational_status_hex = long_hexstring[end_of_interface_name_point:end_of_interface_name_point+2]

            startpoint_of_priority = end_of_interface_name_point+2
            priority_hex = long_hexstring[startpoint_of_priority:startpoint_of_priority+2]

            #Skip the reserved byte
            network_operator_length_startpoint = startpoint_of_priority+4

            single_interface_string = long_hexstring[startpoint_of_interface_def:startpoint_of_priority+4]
            print single_interface_string + " is chopped from the octet string"# - keep for possible debugging

            startpoint_of_interface_def = startpoint_of_priority+4

            if network_operator_implemented == True:
                network_operator_length = long_hexstring[network_operator_length_startpoint:network_operator_length_startpoint+2]
                network_operator_length = int(network_operator_length) * 2
                network_operator_start_point = network_operator_length_startpoint+2
                network_operator_end_point = network_operator_start_point + network_operator_length
                network_operator = long_hexstring[network_operator_start_point:network_operator_end_point]
                #
                single_interface_string = long_hexstring[startpoint_of_interface_def:network_operator_end_point]

                #set the next startpoint if there is one
                startpoint_of_interface_def = network_operator_end_point+1
            else:
                self.network_operator = None

            print single_interface_string + " is chopped from the octet string"# - keep for possible debugging

            #This is where each individual interface is stored, in a list for comparison.
            chopped_octet_list.append(single_interface_string)
    finally:

        return chopped_octet_list

5 Answers:

Answer 0: (score: 1)

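I hope I understood you correctly. You have a hex string that contains several interface definitions. In each interface definition, the second octet describes the length of the interface name.

Say the string contains the interfaces eth0 and eth01 and looks like this (eth0 has length 4, eth01 has length 5):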

01046574683001000001056574683031010000

Then you can split it like this:

def splitIt (s):
    tokens = []
    while s:
        length = int (s [2:4], 16) * 2 + 10 #name length * 2 + 10 digits for rest
        tokens.append (s [:length] )
        s = s [length:]
    return tokens

This yields:

['010465746830010000', '01056574683031010000']
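The same length arithmetic can be sketched with a bounds check, so that a truncated or misaligned record is reported instead of being swallowed into the last token. This is a minimal Python 3 sketch, not part of the original answer; `split_records` and `FIXED_DIGITS` are hypothetical names, and it assumes every record carries its own version and reserved octets.

```python
FIXED_DIGITS = 10  # version, length, op_status, priority, reserved: 5 octets


def split_records(hex_string):
    """Split a hex string into per-interface records (hypothetical helper)."""
    records = []
    pos = 0
    while pos < len(hex_string):
        # The second octet of each record holds the name length in bytes.
        name_len = int(hex_string[pos + 2:pos + 4], 16)
        end = pos + name_len * 2 + FIXED_DIGITS
        if end > len(hex_string):
            raise ValueError("truncated record at offset %d" % pos)
        records.append(hex_string[pos:end])
        pos = end
    return records


print(split_records("01046574683001000001056574683031010000"))
```

Note that the sample string from the question does not fit this nine-octet-per-record layout exactly, so under these assumptions the sketch raises ValueError on it rather than returning a misleading second token.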

Answer 1: (score: 1)

Your code returns an empty list for the following reason. In these lines:

    else:
        self.network_operator = None

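self is not defined, so you get a NameError exception. This means the try block jumps straight to the finally clause without ever executing the line:

chopped_octet_list.append(single_interface_string)

As a result the list stays empty. Because the finally clause contains a return, the pending exception is also discarded silently, which is why you never see a traceback. In any case, the code is overly complicated for a task like this; I would follow one of the other answers.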
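The effect can be reproduced in isolation. The following is a minimal sketch with hypothetical function names (`chop_swallowing`, `chop_reporting`), written so that it also runs on Python 3; it is an illustration of the mechanism, not the asker's full parser.

```python
def chop_swallowing(records):
    """Mimics the question's structure: `return` inside `finally`."""
    chopped = []
    try:
        for record in records:
            self.network_operator = None  # NameError: `self` is not defined here
            chopped.append(record)        # never reached
    finally:
        return chopped  # returning here discards the pending NameError


def chop_reporting(records):
    """Same loop without the faulty line and without try/finally."""
    chopped = []
    for record in records:
        chopped.append(record)
    return chopped


print(chop_swallowing(["aa", "bb"]))  # prints [] and hides the error
print(chop_reporting(["aa", "bb"]))   # prints ['aa', 'bb']
```

Moving the return out of the finally clause (or dropping the try/finally altogether) would have let the NameError surface with a traceback pointing at the faulty line.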

Answer 2: (score: 0)

To add to Hyperboreus's answer, here is a simple way to parse an interface string once it has been split:

def parse(s):
    version = int(s[:2], 16)
    name_len = int(s[2:4], 16)
    name_end = 4 + name_len * 2
    name = s[4:name_end].decode('hex')
    op_status = int(s[name_end:name_end+2], 16)
    priority = int(s[name_end+2:name_end+4], 16)
    reserved = s[name_end+4:name_end+6]
    return version, name_len, name, op_status, priority, reserved

Here is the output:

>>> parse('010465746830010000')
(1, 4, 'eth0', 1, 0, '00')
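The `.decode('hex')` call above exists only in Python 2. On Python 3 the same step can be done with `bytes.fromhex`. Here is a sketch under that assumption; `parse_py3` is a hypothetical name, and the name field is assumed to be ASCII.

```python
def parse_py3(record):
    """Python 3 variant of parse(): decode one split interface record."""
    version = int(record[:2], 16)
    name_len = int(record[2:4], 16)
    name_end = 4 + name_len * 2
    # bytes.fromhex replaces the Python 2 only str.decode('hex')
    name = bytes.fromhex(record[4:name_end]).decode("ascii")
    op_status = int(record[name_end:name_end + 2], 16)
    priority = int(record[name_end + 2:name_end + 4], 16)
    reserved = record[name_end + 4:name_end + 6]
    return version, name_len, name, op_status, priority, reserved


print(parse_py3("010465746830010000"))  # (1, 4, 'eth0', 1, 0, '00')
```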

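Answer 3: (score: 0)

See whether the following helps. Call the parse function below, pass it the hex string stream, and iterate over the result to get the card info (I hope I understood you correctly :)). For each card, parse yields a tuple of the required information.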

>>> def getbytes(hs):
    """Returns a generator of bytes from a hex string"""
    return (int(hs[i:i+2],16) for i in range(0,len(hs)-1,2))

>>> def get_single_card_info(g):
    """Fetches a single card info from a byte generator"""
    v = g.next()
    l = g.next()
    name = "".join(chr(x) for x in map(lambda y: y.next(),[g]*l))
    return (str(v),name,g.next(),g.next(),g.next())

>>> def parse(hs):
    """Parses a hex string stream and returns a generator of card infos"""
    bs = getbytes(hs)
    while True:
        yield get_single_card_info(bs)


>>> c = 1
>>> for card in parse("01046574683001000001056574683031010000"):
    print "Card:{0} -> Version:{1}, Id:{2}, Op_stat:{3}, priority:{4}, reserved:{5} bytes".format(c,*card)
    c = c + 1


Card:1 -> Version:1, Id:eth0, Op_stat:1, priority:0, reserved:0 bytes
Card:2 -> Version:1, Id:eth01, Op_stat:1, priority:0, reserved:0 bytes
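The session above relies on Python 2 behaviour: `g.next()` and a `while True` loop that ends when StopIteration escapes the generator. On Python 3.7 and later, a StopIteration raised inside a generator is turned into a RuntimeError, so a port needs a different way to stop. One possible sketch, with the hypothetical name `parse_cards` and assuming well-formed input with ASCII names:

```python
from itertools import islice


def parse_cards(hex_string):
    """Yield (version, name, op_status, priority, reserved) per card."""
    byte_iter = iter(bytes.fromhex(hex_string))
    # The for loop ends cleanly once the byte stream is exhausted.
    for version in byte_iter:
        name_len = next(byte_iter)
        # Consume exactly name_len bytes from the shared iterator.
        name = bytes(islice(byte_iter, name_len)).decode("ascii")
        op_status = next(byte_iter)
        priority = next(byte_iter)
        reserved = next(byte_iter)
        yield (version, name, op_status, priority, reserved)


for number, card in enumerate(
        parse_cards("01046574683001000001056574683031010000"), start=1):
    print("Card:{0} -> Version:{1}, Id:{2}, Op_stat:{3}, "
          "priority:{4}, reserved:{5} bytes".format(number, *card))
```

Unlike the original, this sketch keeps the version as an integer rather than converting it to a string; a truncated record would still raise an error here rather than being skipped.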

Answer 4: (score: 0)

Pyparsing includes a built-in expression for parsing counted arrays of elements, so it handles your "name" field nicely. Here is the whole parser:

from pyparsing import Word,hexnums,countedArray

# read in 2 hex digits, convert to integer at parse time
octet = Word(hexnums,exact=2).setParseAction(lambda t:int(t[0],16))

# read in a counted array of octets, convert to string
nameExpr = countedArray(octet, intExpr=octet)
nameExpr.setParseAction(lambda t: ''.join(map(chr,t[0])))

# define record expression, with named results
recordExpr = (octet('version') + nameExpr('name') + octet('op_status') +
              octet('priority'))  # + octet('reserved')

Parsing your sample:

sample = "01046574683001000004677265300000000266010000"
for rec in recordExpr.searchString(sample):
    print rec.dump()

gives:

[1, 'eth0', 1, 0]
- name: eth0
- op_status: 1
- priority: 0
- version: 1
[0, 'gre0', 0, 0]
- name: gre0
- op_status: 0
- priority: 0
- version: 0
[0, 'f\x01', 0, 0]
- name: f
- op_status: 0
- priority: 0
- version: 0

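The dump() method shows the results names you can use to access the individually parsed bits, such as rec.name or rec.version.

(I commented out the reserved byte, otherwise the second entry would not parse correctly. Also, note that the third entry contains a name with a \x01 byte.)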