Selecting/matching sections in Python

Date: 2019-01-16 10:44:48

Tags: python regex match multiline findall

I am trying to use the following regex in Python to match each section of any link-state type in an OSPF database, as shown in CLI_Output below:

regex = r'\n\n(\s+\S+( \S+)?(.+?)\n\n)(\s+\S+( \S+)?)?'
section = re.findall(regex,_original_result, re.M)

However, I only get as far as the first row after the header line,

i.e.
                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum Link Count
10.189.7.250    10.189.7.250    1102       0x80012fa1 0x6b32   2

This is my CLI output:

CLI_Output = '''
                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum Link Count
10.189.7.250    10.189.7.250    1102       0x80012fa1 0x6b32   2
10.200.254.252  10.200.254.252  97          0x80000003 0x00501E 3

                Net Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum
10.189.254.242  10.189.254.242  1452       0x80001cf4 0xefab
10.189.0.242    10.189.0.242    1452       0x80001cf4 0xefab

                Summary Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum     Route
10.189.127.0    10.189.254.242  10         0x80001cde 0x6602     10.189.127.0/29
10.200.0.0      10.200.254.251  130        0x80000001 0x002675   10.200.0.0/16
172.18.200.1    10.200.254.251  109        0x80000001 0x00B5CB   172.18.200.1/32

                ASBR-Summary Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum
10.189.127.3    10.189.254.242  10         0x80001c30 0xc14a

                Router Link States (Area 1.1.1.1)

Link ID         ADV Router      Age        Seq#       Checksum Link Count
10.189.127.3    10.189.127.3    1707       0x80001d5e 0xa509   1
10.189.254.242  10.189.254.242  10         0x80001ce0 0x8ec2   1

                Net Link States (Area 1.1.1.1)

Link ID         ADV Router      Age        Seq#       Checksum
10.189.127.2    10.189.254.243  70         0x80001c31 0xdb72

                Summary Link States (Area 1.1.1.1)

Link ID         ADV Router      Age        Seq#       Checksum    Route
10.189.254.240  10.189.254.242  371        0x80001cda 0x8a71     10.189.254.240/29
10.189.254.240  10.189.254.243  1813       0x80001cda 0x8476     10.189.254.240/29

                ASBR-Summary Link States (Area 1.1.1.1)

Link ID         ADV Router      Age        Seq#       Checksum
10.189.7.250    10.189.254.242  1442       0x8000154f 0x165e
10.189.7.250    10.189.254.243  1242       0x8000154d 0x1461

                Router Link States (Area 2.2.2.2 [NSSA])

Link ID         ADV Router      Age        Seq#       Checksum Link Count
10.189.7.250    10.189.7.250    1102       0x80012fa1 0x6b32   2
10.189.254.243  10.189.254.243  1552       0x80001ce8 0x164e   1

                Net Link States (Area 2.2.2.2 [NSSA])

Link ID         ADV Router      Age        Seq#       Checksum
10.200.254.241  10.200.254.251  1277 80000001 ef90  0002

                Summary Link States (Area 2.2.2.2 [NSSA])

Link ID         ADV Router      Age        Seq#       Checksum     Route
0.0.0.0         10.200.254.251  1317 80000001 b7b0  0002 0.0.0.0/0
0.0.0.0         10.200.254.252  1317 80000001 b1b5  0002 0.0.0.0/0

                NSSA-external Link States (Area 2.2.2.2 [NSSA])

Link ID         ADV Router      Age  Seq#     CkSum Flag Route         Tag
10.200.1.0      172.18.200.1    365  800011cb 6f90  0031 E2 10.200.1.0/24   0
10.200.2.0      172.18.200.1    1735 800011c7 6c96  0031 E2 10.200.2.0/24   0
10.200.3.0      172.18.200.1    1775 800011c9 5da2  0031 E2 10.200.3.0/24   0

                AS External Link States

Link ID         ADV Router      Age  Seq#     CkSum Flag Route         Tag
0.0.0.0         10.189.7.250    384  800129e9 9a51  0012 E2 0.0.0.0/0       0
2.3.4.0         10.189.7.250    1154 80007a7a 1fe2  0012 E2 2.3.4.0/24      0
10.112.0.0      10.189.7.250    1084 8000d7e3 b31d  0012 E2 10.112.0.0/21   0

Can someone help me? What should my regex look like to capture the complete section?

i.e.
                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum Link Count
10.189.7.250    10.189.7.250    1102       0x80012fa1 0x6b32   2
10.200.254.252  10.200.254.252  97          0x80000003 0x00501E 3

Many thanks, Matrix154

2 answers:

Answer 0 (score: 0)

As I mentioned in the comments, I believe it is easier to read the output line by line:

record = 0
results = []

for line in CLI_Output.split("\n"):
    # skip empty lines
    if line == "" and record < 2:
        continue

    # if Router Link is in header
    if line.find('Router Link') > -1:
        record = 1
        continue

    # headers
    if record == 1:
        record = 2
        continue

    # If we are here, we are getting the data lines
    if record == 2 and line != "":
        results.append(line)
    elif line == "":
        record = 0  # use break here if you want to stop after the first chunk

print(results)

Result:

['10.189.7.250    10.189.7.250    1102       0x80012fa1 0x6b32   2',
 '10.200.254.252  10.200.254.252  97          0x80000003 0x00501E 3',
 '10.189.127.3    10.189.127.3    1707       0x80001d5e 0xa509   1',
 '10.189.254.242  10.189.254.242  10         0x80001ce0 0x8ec2   1',
 '10.189.7.250    10.189.7.250    1102       0x80012fa1 0x6b32   2',
 '10.189.254.243  10.189.254.243  1552       0x80001ce8 0x164e   1']

codepad demo
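
The same line-by-line idea can be extended to every link-state type rather than only Router. The following is a minimal sketch, not the code from the demo: the helper name `split_sections` and the shortened `sample` text are assumptions made for illustration. It groups the column header and data rows under each section title.

```python
def split_sections(text):
    """Group non-blank lines under the most recent 'Link States' title."""
    sections = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines only separate blocks
        if "Link States" in line:
            current = line.strip()  # title line starts a new section
            sections[current] = []
        elif current is not None:
            sections[current].append(line)  # column header or data row
    return sections


# Shortened sample in the same layout as the CLI output (illustrative only).
sample = """
                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum Link Count
10.189.7.250    10.189.7.250    1102       0x80012fa1 0x6b32   2
10.200.254.252  10.200.254.252  97          0x80000003 0x00501E 3

                Net Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum
10.189.254.242  10.189.254.242  1452       0x80001cf4 0xefab
"""

grouped = split_sections(sample)
for title, lines in grouped.items():
    print(title, len(lines))
```

Each value holds the column header first, followed by the data rows, so `lines[1:]` gives only the data.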


This does not mean a regex is impossible, but it is more complex. For example, the following is hard to understand if you are not used to the syntax:

^\ +(.+) Router Link States\s*(?:\([^)]+\))?\s+Link ID.+\s+((?:[^\r\n]+[\r\n]?)+)

regex101 demo
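
For the complete sections the question asks about (title line, blank line, column header and all data rows, for any link-state type), a pattern along similar lines might look like the sketch below. It is an assumption-laden example rather than the pattern from the demo: `SECTION_RE` and the shortened `sample` are made-up names, and it assumes titles are indented while table lines start in column 0, as in the CLI output shown in the question.

```python
import re

# Shortened sample in the same layout as the CLI output (illustrative only).
sample = """
                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum Link Count
10.189.7.250    10.189.7.250    1102       0x80012fa1 0x6b32   2
10.200.254.252  10.200.254.252  97          0x80000003 0x00501E 3

                Net Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum
10.189.254.242  10.189.254.242  1452       0x80001cf4 0xefab

                AS External Link States

Link ID         ADV Router      Age  Seq#     CkSum Flag Route         Tag
0.0.0.0         10.189.7.250    384  800129e9 9a51  0012 E2 0.0.0.0/0       0
"""

SECTION_RE = re.compile(
    r"^[ \t]+\S.*Link States.*\n"  # indented title line
    r"[ \t]*\n"                    # blank separator line
    r"(?:\S.*\n?)+",               # column header plus data rows (unindented, non-blank)
    re.MULTILINE,                  # let ^ match at the start of every line
)

sections = [m.group(0).rstrip("\n") for m in SECTION_RE.finditer(sample)]
for section in sections:
    print(section)
    print("-" * 40)
```

The row group stops at the first blank line because `\S` cannot match a newline, which is what keeps one section from running into the next.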

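Answer 1 (score: 0)

I have written a lot of code in Perl and PHP and am new to Python. Selecting sections is easier in Perl or PHP, so I wanted to do the same in Python, but unfortunately without success.

@Jerry: Thank you very much! I followed your suggestion and read the CLI output line by line. Here is the code, which I would like to share in case someone else needs it too.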

import re
g = globals()

def ParseText(_original_result):
  ls_types = []
  dic = {}
  (ls_name, area_id, area_type) = ('', '', '')

  for line in _original_result.split("\n"):
    if (line.strip() and not re.search(r'Link ID\s+', line)):
      regex = r'^\s+(\S+|\S+ \S+) Link States'
      ls = re.findall(regex, line, re.S)
      if ls: ls_types.append(re.sub(r'(\s|-)', '_', ls[0]))

  ls_types = list(set(ls_types))

  for line in _original_result.split("\n"):
    if (line.strip() and not re.search(r'Link ID\s+', line)):
      matcher = list(re.finditer(r'^\s+(?P<ls_name>\S+( \S+)?) Link States( \(Area (?P<area_id>\S+)( \[(?P<area_type>\S+)\])?\))?', line, re.S|re.M))
      if matcher:
        ls_name = re.sub(r'(\s|-)', '_', matcher[0]['ls_name'])
        if ls_name not in g:
          g[ls_name] = { 'area_id': [], 'area_type': [], 'link_id': [], 'adv_rtr': [], 'age': [], 'sumary': [], 'ext_route_type': [], 'ext_route': [], 'tag': [], '$_columns' : ['area_id', 'area_type', 'link_id', 'adv_rtr', 'age', 'sumary', 'ext_route_type', 'ext_route', 'tag'] }
        area_id = matcher[0]['area_id']
        if matcher[0]['area_type']:
          area_type = matcher[0]['area_type']
        else:
          if area_id == '0.0.0.0': area_type = 'backbone'
          else: area_type = 'normal'
      else:
        matcher = list(re.finditer(r'^(?P<lnk_id>\S+)\s+(?P<adv_rtr>\S+)\s+(?P<age>\d+)(\s+\S+){2}(\s+(\d+\s+)?((?P<sumary>\S+)|(?P<ext_route_type>\S+)\s+(?P<ext_route>\S+)\s+(?P<tag>\d+)))?$', line, re.M|re.S))
        if matcher:
          if ls_name != 'AS_External':
            g[ls_name]['area_id'].append(area_id)
            g[ls_name]['area_type'].append(area_type)
          g[ls_name]['link_id'].append(matcher[0]['lnk_id'])
          g[ls_name]['adv_rtr'].append(matcher[0]['adv_rtr'])
          g[ls_name]['age'].append(matcher[0]['age'])
          if matcher[0]['sumary']:
            g[ls_name]['sumary'].append(matcher[0]['sumary'])
          if matcher[0]['ext_route_type']:
            g[ls_name]['ext_route_type'].append(matcher[0]['ext_route_type'])
          if matcher[0]['ext_route']:
            g[ls_name]['ext_route'].append(matcher[0]['ext_route'])
          if matcher[0]['tag']:
            g[ls_name]['tag'].append(matcher[0]['tag'])

  for LS in g:
    if LS in ls_types:
      dic[LS] = g[LS]

  return dic

CLI_Output = '''
Forinet $get router info ospf database brief



                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum Link Count
10.189.7.250    10.189.7.250    1102       0x80012fa1 0x6b32   2
10.189.254.242  10.189.254.242  371        0x80001ce0 0x2847   1
10.189.254.243  10.189.254.243  1552       0x80001ce8 0x164e   1
10.200.254.251  10.200.254.251  93          0x80000003 0x002052 3
10.200.254.252  10.200.254.252  97          0x80000003 0x00501E 3

                Net Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum
10.189.254.242  10.189.254.242  1452       0x80001cf4 0xefab

                Summary Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum     Route
10.189.127.0    10.189.254.242  10         0x80001cde 0x6602     10.189.127.0/29
10.189.127.0    10.189.254.243  1452       0x80001cdc 0x6405     10.189.127.0/29
10.200.0.0      10.200.254.251  130        0x80000001 0x002675   10.200.0.0/16
10.200.0.0      10.200.254.252  146        0x80000001 0x00207A   10.200.0.0/16
172.18.200.1    10.200.254.251  109        0x80000001 0x00B5CB   172.18.200.1/32
172.18.200.1    10.200.254.252  108        0x80000001 0x00AFD0   172.18.200.1/32

                ASBR-Summary Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum
10.189.127.3    10.189.254.242  10         0x80001c30 0xc14a
10.189.127.3    10.189.254.243  60         0x80001c4b 0x856a

                Router Link States (Area 1.1.1.1)

Link ID         ADV Router      Age        Seq#       Checksum Link Count
10.189.127.3    10.189.127.3    1707       0x80001d5e 0xa509   1
10.189.254.242  10.189.254.242  10         0x80001ce0 0x8ec2   1
10.189.254.243  10.189.254.243  70         0x80001ce6 0x80c7   1

                Net Link States (Area 1.1.1.1)

Link ID         ADV Router      Age        Seq#       Checksum
10.189.127.2    10.189.254.243  70         0x80001c31 0xdb72

                Summary Link States (Area 1.1.1.1)

Link ID         ADV Router      Age        Seq#       Checksum    Route
10.189.254.240  10.189.254.242  371        0x80001cda 0x8a71     10.189.254.240/29
10.189.254.240  10.189.254.243  1813       0x80001cda 0x8476     10.189.254.240/29
10.200.254.250  10.189.254.242  1442       0x80001548 0x0673     10.189.254.250/32
10.200.254.250  10.189.254.243  1242       0x80001548 0xff78     10.189.254.250/32

                ASBR-Summary Link States (Area 1.1.1.1)

Link ID         ADV Router      Age        Seq#       Checksum
10.189.7.250    10.189.254.242  1442       0x8000154f 0x165e
10.189.7.250    10.189.254.243  1242       0x8000154d 0x1461

                Router Link States (Area 2.2.2.2 [NSSA])

Link ID         ADV Router      Age        Seq#       Checksum Link Count
10.189.7.250    10.189.7.250    1102       0x80012fa1 0x6b32   2
10.189.254.242  10.189.254.242  371        0x80001ce0 0x2847   1
10.189.254.243  10.189.254.243  1552       0x80001ce8 0x164e   1

                Net Link States (Area 2.2.2.2 [NSSA])

Link ID         ADV Router      Age        Seq#       Checksum
10.200.254.241  10.200.254.251  1277 80000001 ef90  0002

                Summary Link States (Area 2.2.2.2 [NSSA])

Link ID         ADV Router      Age        Seq#       Checksum     Route
0.0.0.0         10.200.254.251  1317 80000001 b7b0  0002 0.0.0.0/0
0.0.0.0         10.200.254.252  1317 80000001 b1b5  0002 0.0.0.0/0

                NSSA-external Link States (Area 2.2.2.2 [NSSA])

Link ID         ADV Router      Age  Seq#     CkSum Flag Route              Tag
10.200.1.0      172.18.200.1    365  800011cb 6f90  0031 E2 10.200.1.0/24   0
10.200.2.0      172.18.200.1    1735 800011c7 6c96  0031 E2 10.200.2.0/24   0
10.200.3.0      172.18.200.1    1775 800011c9 5da2  0031 E2 10.200.3.0/24   0
10.200.4.0      172.18.200.1    1555 800011c9 43be  0031 E2 10.200.4.0/22   0
10.200.8.0      172.18.200.1    1585 800011c8 28d3  0031 E2 10.200.8.0/24   0
10.200.234.0    172.18.200.1    1525 800011c7 6aaf  0031 E2 10.200.234.0/24 0

                AS External Link States

Link ID         ADV Router      Age  Seq#     CkSum Flag Route              Tag
0.0.0.0         10.189.7.250    384  800129e9 9a51  0012 E2 0.0.0.0/0       0
2.3.4.0         10.189.7.250    1154 80007a7a 1fe2  0012 E2 2.3.4.0/24      0
10.112.0.0      10.189.7.250    1084 8000d7e3 b31d  0012 E2 10.112.0.0/21   0
10.112.189.0    10.189.7.250    144  8000e95e 84fa  0012 E2 10.112.189.0/24 0
10.158.189.0    10.189.7.250    124  800129db 9df5  0012 E2 10.158.189.0/24 0
10.180.128.0    10.189.7.250    1264 800129da 15ad  0012 E2 10.180.128.0/21 0
10.188.0.0      10.189.7.250    1314 800129d5 2b4d  0012 E2 10.188.0.0/18   0
10.189.0.0      10.189.7.250    1344 800129d8 320a  0012 E2 10.189.0.0/21   0
10.189.8.0      10.189.7.250    1504 8000d057 0801  0012 E2 10.189.8.0/23   0
10.189.10.0     10.189.7.250    334  800129da e246  0012 E2 10.189.10.0/24  0
10.189.11.0     10.189.7.250    1534 800129da d750  0012 E2 10.189.11.0/24  0
10.189.14.0     10.189.7.250    1204 800129e5 a079  0012 E2 10.189.14.0/24  0
10.189.15.0     10.189.7.250    784  8000c59c 2ca1  0012 E2 10.189.15.0/29  0
10.189.20.0     10.189.7.250    914  800129e0 68b0  0012 E2 10.189.20.0/24  0

'''

print (ParseText(CLI_Output))

The output looks like this (items are appended in order, so there is no need to worry about reordering):

{
  'Router': {
    'area_id': ['0.0.0.0', '0.0.0.0', '0.0.0.0', '0.0.0.0', '0.0.0.0', '1.1.1.1', '1.1.1.1', '1.1.1.1', '2.2.2.2', '2.2.2.2', '2.2.2.2'],
    'area_type': ['backbone', 'backbone', 'backbone', 'backbone', 'backbone', 'normal', 'normal', 'normal', 'NSSA', 'NSSA', 'NSSA'],
    'link_id': ['10.189.7.250', '10.189.254.242', '10.189.254.243', '10.200.254.251', '10.200.254.252', '10.189.127.3', '10.189.254.242', '10.189.254.243', '10.189.7.250', '10.189.254.242', '10.189.254.243'],
    'adv_rtr': ['10.189.7.250', '10.189.254.242', '10.189.254.243', '10.200.254.251', '10.200.254.252', '10.189.127.3', '10.189.254.242', '10.189.254.243', '10.189.7.250', '10.189.254.242', '10.189.254.243'],
    'age': ['1102', '371', '1552', '93', '97', '1707', '10', '70', '1102', '371', '1552'],
    'sumary': ['2', '1', '1', '3', '3', '1', '1', '1', '2', '1', '1'],
    'ext_route_type': [],
    'ext_route': [],
    'tag': [],
    '$_columns': ['area_id', 'area_type', 'link_id', 'adv_rtr', 'age', 'sumary', 'ext_route_type', 'ext_route', 'tag']
  },
  'Net': {
    'area_id': ['0.0.0.0', '1.1.1.1', '2.2.2.2'],
    'area_type': ['backbone', 'normal', 'NSSA'], 'link_id':['10.189.254.242', '10.189.127.2', '10.200.254.241'],
    'adv_rtr': ['10.189.254.242', '10.189.254.243', '10.200.254.251'],
    'age': ['1452', '70', '1277'],
    'sumary': ['0002'],
    'ext_route_type': [],
    'ext_route': [],
    'tag': [],
    '$_columns': ['area_id', 'area_type', 'link_id', 'adv_rtr', 'age', 'sumary', 'ext_route_type', 'ext_route', 'tag']
  },
  'Summary': {
    'area_id': ['0.0.0.0', '0.0.0.0', '0.0.0.0', '0.0.0.0', '0.0.0.0', '0.0.0.0', '1.1.1.1', '1.1.1.1', '1.1.1.1', '1.1.1.1', '2.2.2.2', '2.2.2.2'],
    'area_type': ['backbone', 'backbone', 'backbone', 'backbone','backbone', 'backbone', 'normal', 'normal', 'normal', 'normal', 'NSSA', 'NSSA'],
    'link_id': ['10.189.127.0', '10.189.127.0', '10.200.0.0', '10.200.0.0', '172.18.200.1', '172.18.200.1', '10.189.254.240', '10.189.254.240', '10.200.254.250', '10.200.254.250', '0.0.0.0', '0.0.0.0'],
    'adv_rtr': ['10.189.254.242', '10.189.254.243', '10.200.254.251', '10.200.254.252', '10.200.254.251', '10.200.254.252', '10.189.254.242', '10.189.254.243', '10.189.254.242', '10.189.254.243', '10.200.254.251', '10.200.254.252'],
    'age': ['10', '1452', '130', '146', '109', '108', '371', '1813', '1442', '1242', '1317', '1317'],
    'sumary': ['10.189.127.0/29', '10.189.127.0/29', '10.200.0.0/16', '10.200.0.0/16', '172.18.200.1/32', '172.18.200.1/32', '10.189.254.240/29', '10.189.254.240/29', '10.189.254.250/32', '10.189.254.250/32', '0.0.0.0/0', '0.0.0.0/0'],
    'ext_route_type': [],
    'ext_route': [],
    'tag': [],
    '$_columns': ['area_id', 'area_type', 'link_id', 'adv_rtr', 'age', 'sumary', 'ext_route_type', 'ext_route', 'tag']
  },
  'ASBR_Summary': {
    'area_id': ['0.0.0.0', '0.0.0.0', '1.1.1.1', '1.1.1.1'],
    'area_type': ['backbone', 'backbone', 'normal', 'normal'],
    'link_id': ['10.189.127.3', '10.189.127.3', '10.189.7.250', '10.189.7.250'],
    'adv_rtr': ['10.189.254.242', '10.189.254.243', '10.189.254.242', '10.189.254.243'],
    'age': ['10', '60', '1442', '1242'],
    'sumary': [],
    'ext_route_type': [],
    'ext_route': [],
    'tag': [],
    '$_columns': ['area_id', 'area_type', 'link_id', 'adv_rtr', 'age', 'sumary', 'ext_route_type', 'ext_route', 'tag']
  },
  'NSSA_external': {
    'area_id': ['2.2.2.2', '2.2.2.2', '2.2.2.2', '2.2.2.2', '2.2.2.2', '2.2.2.2'],
    'area_type': ['NSSA', 'NSSA', 'NSSA', 'NSSA', 'NSSA', 'NSSA'],
    'link_id': ['10.200.1.0', '10.200.2.0', '10.200.3.0', '10.200.4.0', '10.200.8.0', '10.200.234.0'],
    'adv_rtr': ['172.18.200.1', '172.18.200.1', '172.18.200.1', '172.18.200.1', '172.18.200.1', '172.18.200.1'],
    'age': ['365', '1735', '1775', '1555', '1585', '1525'], 'sumary':[],
    'ext_route_type': ['E2', 'E2', 'E2', 'E2', 'E2', 'E2'],
    'ext_route': ['10.200.1.0/24', '10.200.2.0/24', '10.200.3.0/24', '10.200.4.0/22', '10.200.8.0/24', '10.200.234.0/24'], 'tag':['0', '0', '0', '0', '0', '0'],
    '$_columns': ['area_id', 'area_type', 'link_id', 'adv_rtr', 'age', 'sumary', 'ext_route_type', 'ext_route', 'tag']
  },
  'AS_External': {
    'area_id': [],
    'area_type': [],
    'link_id': ['0.0.0.0', '2.3.4.0', '10.112.0.0', '10.112.189.0', '10.158.189.0', '10.180.128.0', '10.188.0.0', '10.189.0.0', '10.189.8.0', '10.189.10.0', '10.189.11.0', '10.189.14.0', '10.189.15.0', '10.189.20.0'],
    'adv_rtr': ['10.189.7.250', '10.189.7.250', '10.189.7.250', '10.189.7.250', '10.189.7.250', '10.189.7.250', '10.189.7.250', '10.189.7.250', '10.189.7.250', '10.189.7.250', '10.189.7.250', '10.189.7.250', '10.189.7.250', '10.189.7.250'],
    'age': ['384', '1154', '1084', '144', '124', '1264', '1314', '1344', '1504', '334', '1534', '1204', '784', '914'],
    'sumary': [],
    'ext_route_type': ['E2', 'E2', 'E2', 'E2', 'E2', 'E2', 'E2', 'E2', 'E2', 'E2', 'E2', 'E2', 'E2', 'E2'],
    'ext_route': ['0.0.0.0/0', '2.3.4.0/24', '10.112.0.0/21', '10.112.189.0/24', '10.158.189.0/24', '10.180.128.0/21', '10.188.0.0/18', '10.189.0.0/21', '10.189.8.0/23', '10.189.10.0/24', '10.189.11.0/24', '10.189.14.0/24', '10.189.15.0/29', '10.189.20.0/24'],
    'tag': ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'],
    '$_columns': ['area_id', 'area_type', 'link_id', 'adv_rtr', 'age', 'sumary', 'ext_route_type', 'ext_route', 'tag']
  }
}
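
One thing to be aware of in the code above: because the per-type dictionaries live in `globals()`, calling `ParseText` a second time appends to the lists left over from the first call. A possible alternative that keeps all state local is sketched below. It is a simplified variant under my own assumptions, not a drop-in replacement: the names `TITLE_RE`, `parse_rows` and `sample` are invented for illustration, it returns one dict per data row, and it only splits off the first three columns, leaving the rest as a list because the remaining columns differ between section types.

```python
import re

# Title line, e.g. "   Router Link States (Area 2.2.2.2 [NSSA])"; the area part is optional.
TITLE_RE = re.compile(
    r"^\s+(?P<ls_name>\S+(?: \S+)?) Link States"
    r"(?: \(Area (?P<area_id>[^\s)]+)(?: \[(?P<area_type>[^\]]+)\])?\))?\s*$"
)


def parse_rows(text):
    """Return one dict per data row, tagged with its section and area."""
    rows = []
    section = None
    for line in text.splitlines():
        if not line.strip() or line.startswith("Link ID"):
            continue  # skip blank lines and column headers
        title = TITLE_RE.match(line)
        if title:
            area_id = title["area_id"]
            area_type = title["area_type"]
            if area_id and not area_type:
                area_type = "backbone" if area_id == "0.0.0.0" else "normal"
            section = {
                "ls_name": re.sub(r"[\s-]", "_", title["ls_name"]),
                "area_id": area_id,
                "area_type": area_type,
            }
            continue
        fields = line.split()
        # Ignore anything that is not a data row (e.g. the CLI prompt line).
        if section is None or len(fields) < 3 or not fields[2].isdigit():
            continue
        rows.append({**section, "link_id": fields[0], "adv_rtr": fields[1],
                     "age": fields[2], "rest": fields[3:]})
    return rows


# Shortened sample in the same layout as the CLI output (illustrative only).
sample = """
                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum Link Count
10.189.7.250    10.189.7.250    1102       0x80012fa1 0x6b32   2
10.200.254.252  10.200.254.252  97          0x80000003 0x00501E 3

                Router Link States (Area 2.2.2.2 [NSSA])

Link ID         ADV Router      Age        Seq#       Checksum Link Count
10.189.254.243  10.189.254.243  1552       0x80001ce8 0x164e   1

                AS External Link States

Link ID         ADV Router      Age  Seq#     CkSum Flag Route         Tag
0.0.0.0         10.189.7.250    384  800129e9 9a51  0012 E2 0.0.0.0/0       0
"""

rows = parse_rows(sample)
for row in rows:
    print(row)
```

Grouping by `ls_name` afterwards, if the column-oriented layout is preferred, can then be done by the caller without touching module-level state.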