Python / Regex - using split

Time: 2017-02-10 10:41:57

Tags: python regex

My sample text looks like this:

data = """
    NAME: "Chassis", DESCR: "Nexus5548 Chassis"
    PID: N5K-C5548UP       , VID: V01 , SN: SSI1F8A204LK

    NAME: "Module 1", DESCR: "O2 32X10GE/Modular Universal Platform Supervisor"
    PID: N5K-C5548UP       , VID: V01 , SN: FOC1FS7Q2P

    NAME: "Module 2", DESCR: "O2 16X10GE Ethernet Module"
    PID: N55-M16P          , VID: V01 , SN: FOC15840LYH

    NAME: "Fan 1", DESCR: "Chassis fan module"
    PID: N5548P-FAN        , VID: N/A , SN: N/A

    NAME: "Fan 2", DESCR: "Chassis fan module"
    PID: N5548P-FAN        , VID: N/A , SN: N/A

    NAME: "Power supply 1", DESCR: "AC power supply"
    PID: N55-PAC-750W      , VID: V02 , SN: ART18790WA

    NAME: "Power supply 2", DESCR: "AC power supply"
    PID: N55-PAC-750W      , VID: V02 , SN: ART182126V2

    NAME: "Module 3", DESCR: "O2 Daughter Card with L3 ASIC"
    PID: N55-D160L3-V2     , VID: V01 , SN: FOC14952NU2
"""

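What I'm trying to achieve is to pull the description, PID and serial number of each section out into a class.

My first idea was to get everything onto single lines: join the two lines that start with NAME: and PID: so that each section sits on one line, and then extract the data from each line.

My latest attempts: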

data = ''.join(sample.splitlines())
nd = re.split(r"(\NAME:)", data)

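This puts each NAME: on its own line and the rest of the section on another line. That is close, but I would need to remove all the lines that contain only NAME: before I can iterate over the result.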
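The extra NAME: items come from the capturing group: when the pattern passed to re.split contains parentheses, the matched delimiters are returned as list items too. A minimal sketch of that behaviour, using a shortened, hypothetical two-entry string that has already been joined onto one line (the names text, with_group and records are illustrative only):

```python
import re

# Hypothetical, shortened input that is already on a single line.
text = ('NAME: "Fan 1", DESCR: "Chassis fan module" '
        'PID: N5548P-FAN , VID: N/A , SN: N/A '
        'NAME: "Fan 2", DESCR: "Chassis fan module" '
        'PID: N5548P-FAN , VID: N/A , SN: N/A')

# With a capturing group, each "NAME:" delimiter is kept as its own item.
with_group = re.split(r"(NAME:)", text)

# Without the group the delimiter is dropped, so put the label back
# and skip the empty piece that precedes the first "NAME:".
records = ["NAME: " + part.strip()
           for part in re.split(r"NAME:", text)
           if part.strip()]

for record in records:
    print(record)
```

Dropping the parentheses avoids the filtering step, at the cost of having to re-attach the label by hand.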

data = ''.join(sample.splitlines())
nd = re.split(r"(SN:\s[\w\-]+)", data)

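This one is messy; the previous attempt was closer.

Does anyone know how to get each section of the data onto one line, or a better way to do this?

Thanks

2 Answers:

Answer 0: (score: 0)

The following: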

import re

data = """
    NAME: "Chassis", DESCR: "Nexus5548 Chassis"
    PID: N5K-C5548UP       , VID: V01 , SN: SSI1F8A204LK

    NAME: "Module 1", DESCR: "O2 32X10GE/Modular Universal Platform Supervisor"
    PID: N5K-C5548UP       , VID: V01 , SN: FOC1FS7Q2P

    NAME: "Module 2", DESCR: "O2 16X10GE Ethernet Module"
    PID: N55-M16P          , VID: V01 , SN: FOC15840LYH

    NAME: "Fan 1", DESCR: "Chassis fan module"
    PID: N5548P-FAN        , VID: N/A , SN: N/A

    NAME: "Fan 2", DESCR: "Chassis fan module"
    PID: N5548P-FAN        , VID: N/A , SN: N/A

    NAME: "Power supply 1", DESCR: "AC power supply"
    PID: N55-PAC-750W      , VID: V02 , SN: ART18790WA

    NAME: "Power supply 2", DESCR: "AC power supply"
    PID: N55-PAC-750W      , VID: V02 , SN: ART182126V2

    NAME: "Module 3", DESCR: "O2 Daughter Card with L3 ASIC"
    PID: N55-D160L3-V2     , VID: V01 , SN: FOC14952NU2
"""

matches = re.findall(r'NAME: \"(.*)\",\s*'
                     r'DESCR: \"(.*)\"\s*'
                     r'PID: (\S+)\s*,\s*'
                     r'VID: (\S+)\s*,\s*'
                     r'SN: (\S+)',
                     data,
                     re.MULTILINE)

print(matches)

This will print:

[('Chassis', 'Nexus5548 Chassis', 'N5K-C5548UP', 'V01', 'SSI1F8A204LK'), ('Module 1', 'O2 32X10GE/Modular Universal Platform Supervisor', 'N5K-C5548UP', 'V01', 'FOC1FS7Q2P'), ('Module 2', 'O2 16X10GE Ethernet Module', 'N55-M16P', 'V01', 'FOC15840LYH'), ('Fan 1', 'Chassis fan module', 'N5548P-FAN', 'N/A', 'N/A'), ('Fan 2', 'Chassis fan module', 'N5548P-FAN', 'N/A', 'N/A'), ('Power supply 1', 'AC power supply', 'N55-PAC-750W', 'V02', 'ART18790WA'), ('Power supply 2', 'AC power supply', 'N55-PAC-750W', 'V02', 'ART182126V2'), ('Module 3', 'O2 Daughter Card with L3 ASIC', 'N55-D160L3-V2', 'V01', 'FOC14952NU2')]

That is, one (NAME, DESCR, PID, VID, SN) tuple per entry.
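Since the question asks for a class, the tuples can be wrapped in a small record type. A sketch using collections.namedtuple on a shortened sample; the InventoryItem name and its field names are assumptions made for illustration, not part of the original answer:

```python
import re
from collections import namedtuple

# Hypothetical record type; the field names are chosen for illustration.
InventoryItem = namedtuple("InventoryItem", "name descr pid vid sn")

# Shortened sample in the same layout as the question's data.
sample = '''
    NAME: "Fan 1", DESCR: "Chassis fan module"
    PID: N5548P-FAN        , VID: N/A , SN: N/A

    NAME: "Module 3", DESCR: "O2 Daughter Card with L3 ASIC"
    PID: N55-D160L3-V2     , VID: V01 , SN: FOC14952NU2
'''

# Same pattern as above, compiled once.
pattern = re.compile(
    r'NAME: "(.*)",\s*'
    r'DESCR: "(.*)"\s*'
    r'PID: (\S+)\s*,\s*'
    r'VID: (\S+)\s*,\s*'
    r'SN: (\S+)'
)

# findall returns one 5-tuple per entry; unpack each into a record.
items = [InventoryItem(*fields) for fields in pattern.findall(sample)]

for item in items:
    print(item.name, item.pid, item.sn)
```

Fields are then available by name (item.pid, item.sn) instead of by position.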

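Answer 1: (score: 0)

Use Python's split() function. Called without arguments, it returns a list of the whitespace-separated parts of a string. You can first split the text into lines with split("\n") and iterate over them. Code: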

for index, line in enumerate(data.split("\n")):
    # data starts with a newline, so the PID lines are at indexes 2, 5, 8, ...
    if (index - 2) % 3 == 0:
        parts = line.split()
        PID = parts[1]
        serial_number = parts[7]
        # here add some code to save the PID and SN wherever you want...

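The code above iterates over every line and acts on every third one, starting at index 2; that is what the if (index - 2) % 3 == 0: condition does. It then splits the line on whitespace, so the PID and the serial number can be picked out by position.

Pay attention to the condition on the line index. index - 2 fits the sample because data begins with an empty line; if your input has no leading blank line, index - 1 would be the right offset. You will have to adjust it to your data :)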
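If the index arithmetic feels fragile, one alternative is to select lines by their PID: prefix instead of by position. This is a sketch rather than part of the answer above; it assumes that every field on a PID line has the form KEY: value and that values contain no commas. The names sample and pairs are illustrative only:

```python
# Shortened sample in the same layout as the question's data.
sample = """
    NAME: "Chassis", DESCR: "Nexus5548 Chassis"
    PID: N5K-C5548UP       , VID: V01 , SN: SSI1F8A204LK

    NAME: "Fan 1", DESCR: "Chassis fan module"
    PID: N5548P-FAN        , VID: N/A , SN: N/A
"""

pairs = []
for line in sample.splitlines():
    line = line.strip()
    # Only the PID lines carry the fields we want.
    if not line.startswith("PID:"):
        continue
    # Split into "KEY: value" fields on commas, then on the first colon.
    fields = dict(
        (key.strip(), value.strip())
        for key, value in (part.split(":", 1) for part in line.split(","))
    )
    pairs.append((fields["PID"], fields["SN"]))

print(pairs)
```

Because lines are matched by content, extra or missing blank lines in the input do not shift the result.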