Python regular expression not working

Time: 2015-01-27 21:19:53

Tags: python extraction findall

I am currently trying to extract data from a file that I read in:

with open(_args.file, 'ru') as _fpInput:
    # Remove empty lines
    lines = [l for l in _fpInput if(l[:-1])]

    # Process special characters
    clean = ''.join(lines)
    clean = ''.join(clean.split('\x08'))    # Remove Backspace
    clean = ''.join(clean.split('\x1b'))    # Remove Escape
    clean = ''.join(clean.split('\r'))      # Remove Carriage Return

    # Remove the "--More--" pager prompt that is printed between pages
    _lines = re.sub(r'\[7m--More--\[27m', '', clean)
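
For reference, the cleanup steps above can be condensed into one helper. This is only a sketch under stated assumptions: `clean_log` and `ANSI_CSI` are hypothetical names, and it assumes the pager highlighting arrives as standard ANSI CSI sequences (ESC[7m and ESC[27m), so it strips each whole sequence rather than only the ESC byte. Unlike the original filter, it also drops lines that contain only whitespace.

```python
import re

# Assumption: highlighting is sent as ANSI CSI sequences such as ESC[7m / ESC[27m.
ANSI_CSI = re.compile(r'\x1b\[[0-9;]*[A-Za-z]')

def clean_log(text):
    """Strip ANSI sequences, pager prompts, backspaces, CRs and blank lines."""
    text = ANSI_CSI.sub('', text)            # remove ESC[...m style sequences
    text = text.replace('--More--', '')      # remove the pager prompt itself
    text = text.replace('\x08', '').replace('\r', '')
    # Keep only lines that still contain something other than whitespace.
    return '\n'.join(line for line in text.split('\n') if line.strip())

raw = 'row 1\r\n\r\n\x1b[7m--More--\x1b[27m\x08\x08\x08row 2\r\n'
print(clean_log(raw).split('\n'))
```

Stripping whole sequences first means no stray `[7m` fragments are left behind for a second pattern to clean up.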

Then I run findall:

chunk = ''.join(re.findall('show[a-z \t\b]*[ \t\b][a-z \t\b]*cell[a-z \t\b]*[ \t][a-z \t\b]*umts\s+(.*?)\[ok\]', lines, re.DOTALL|re.I)).split('\n')

And also:

chunk = ''.join(re.findall('show[a-z \t\b]*[ \t\b][a-z \t\b]*cell[a-z \t\b]*[ \t][a-z \t\b]*lte\s+(.*?)\[ok\]',  lines, re.DOTALL|re.I)).split('\n')

When I run this on file 1, everything works as expected, but on file 2 it simply does not work.

File 1:

show Cell UMTS 
CellHandle  Name              RN     CID    UCID        PSC  MaxTxPwr  ModeInUse     RLs
----------  ----------------  -----  -----  ----------  ---  --------  ------------  ---
 59086  V87577153            15  59086     8578766  1     24.0dBm  UMTSNodeB       0
 59087  V87577143            14  59087     8578767  2     24.0dBm  UMTSNodeB       0
 59088  V87577133            13  59088     8578768  506   24.0dBm  UMTSNodeB       2
 59089  V87577123            12  59089     8578769  507   24.0dBm  UMTSNodeB       0
 59090  V87577113            11  59090     8578770  3     24.0dBm  UMTSNodeB       2
 59091  V87577103            10  59091     8578771  4     24.0dBm  UMTSNodeB       3
 59092  V87577093             9  59092     8578772  5     24.0dBm  UMTSNodeB       1
 59093  V87577083             8  59093     8578773  508   24.0dBm  UMTSNodeB       3
 59094  V87577073             7  59094     8578774  509   24.0dBm  UMTSNodeB       2
 59095  V87577063             6  59095     8578775  6     24.0dBm  UMTSNodeB       5
 59096  V87577053             5  59096     8578776  7     24.0dBm  UMTSNodeB       0
 59097  V87577043             4  59097     8578777  8     24.0dBm  UMTSNodeB       1
 59098  V87577033             3  59098     8578778  510   24.0dBm  UMTSNodeB       3
 59099  V87577023             2  59099     8578779  511   24.0dBm  UMTSNodeB       1
 59100  V87577013             1  59100     8578780  9     24.0dBm  UMTSNodeB       3
[ok][2014-11-12 14:52:07]
> show Cell lte  
CellHandle  Name              RN     CellID  ECI         PCI  RefSignalPwr  ModeInUse     RLs
----------  ----------------  -----  ------  ----------  ---  ------------  ------------  ---
     1  SCRN1_89_489          1      89      439385  489           -10     LTEeNodeB    5
     2  SCRN2_90_490          2      90      439386  490           -10     LTEeNodeB    3
     3  SCRN3_91_491          3      91      439387  491           -10     LTEeNodeB    2
     4  SCRN4_92_492          4      92      439388  492           -10     LTEeNodeB    2
     5  SCRN5_93_493          5      93      439389  493           -10     LTEeNodeB    0
     6  SCRN6_94_494          6      94      439390  494           -10     LTEeNodeB    3
     7  SCRN7_95_495          7      95      439391  495           -10     LTEeNodeB    2
     8  SCRN8_96_496          8      96      439392  496           -10     LTEeNodeB    2
     9  SCRN9_97_497          9      97      439393  497           -10     LTEeNodeB    5
    10  SCRN10_98_498        10      98      439394  498           -10     LTEeNodeB    7
    11  SCRN11_99_499        11      99      439395  499           -10     LTEeNodeB    1
    12  SCRN12_50_500        12      50      439346  500           -10     LTEeNodeB    7
    13  SCRN13_51_501        13      51      439347  501           -10     LTEeNodeB    1
    14  SCRN14_52_502        14      52      439348  502           -10     LTEeNodeB    3
    15  SCRN15_53_503        15      53      439349  503           -10     LTEeNodeB    0
[ok][2014-11-12 14:59:50]
> 
*** IDLE TIMEOUT ***

File 2:

=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2015.01.16 16:06:37 =~=~=~=~=~=~=~=~=~=~=~=
> show cell umCell UMTS 
CellHandle  Name              RN     CID    UCID        PSC  MaxTxPwr  ModeInUse     RLs

----------  ----------------  -----  -----  ----------  ---  --------  ------------  ---

 59086  V87577153            15  59086     8578766  1     24.0dBm  UMTSNodeB       0

 59087  V87577143            14  59087     8578767  2     24.0dBm  UMTSNodeB       0

 59088  V87577133            13  59088     8578768  506   24.0dBm  UMTSNodeB       0

 59089  V87577123            12  59089     8578769  3     24.0dBm  UMTSNodeB       0

 59090  V87577113            11  59090     8578770  4     24.0dBm  UMTSNodeB       0

 59091  V87577103            10  59091     8578771  507   24.0dBm  UMTSNodeB       0

 59092  V87577093             9  59092     8578772  508   24.0dBm  UMTSNodeB       0

 59093  V87577083             8  59093     8578773  509   24.0dBm  UMTSNodeB       0

 59094  V87577073             7  59094     8578774  5     24.0dBm  UMTSNodeB       0

 59095  V87577063             6  59095     8578775  7     24.0dBm  UMTSNodeB       0

 59096  V87577053             5  59096     8578776  8     24.0dBm  UMTSNodeB       0

 59097  V87577043             4  59097     8578777  9     24.0dBm  UMTSNodeB       0

 59098  V87577033             3  59098     8578778  510   24.0dBm  UMTSNodeB       0

 59099  V87577023             2  59099     8578779  511   24.0dBm  UMTSNodeB       0

 59100  V87577013             1  59100     8578780  10    24.0dBm  UMTSNodeB       0
[ok][2015-01-17 00:06:51]
> show cell lCell LTE 
CellHandle  Name              RN     CellID  ECI         PCI  RefSignalPwr  ModeInUse     RLs

----------  ----------------  -----  ------  ----------  ---  ------------  ------------  ---

     1  ZD87577018            1       1      439297    6           -10     LTEeNodeB    0

     2  ZD87577028            2       2      439298    1           -10     LTEeNodeB    0

     3  ZD87577038            3       3      439299   39           -10     LTEeNodeB    0

     4  ZD87577048            4       4      439300   35           -10     LTEeNodeB    0

     5  ZD87577058            5       5      439301    3           -10     LTEeNodeB    0

     6  ZD87577068            6       6      439302   46           -10     LTEeNodeB    0

     7  ZD87577078            7       7      439303   42           -10     LTEeNodeB    0

     8  ZD87577088            8       8      439304   53           -10     LTEeNodeB    0

     9  ZD87577098            9       9      439305   22           -10     LTEeNodeB    0

    10  ZD87577108           10      10      439306   10           -10     LTEeNodeB    0

    11  ZD87577118           11      11      439307   49           -10     LTEeNodeB    0

    12  ZD87577128           12      12      439308   54           -10     LTEeNodeB    0

    13  ZD87577138           13      13      439309   62           -10     LTEeNodeB    0

    14  ZD87577148           14      14      439310   38           -10     LTEeNodeB    0

    15  ZD87577158           15      15      439311   58           -10     LTEeNodeB    0

[ok][2015-01-17 00:07:02]
> show rfRFMgmt umUMTS deDetectedCells 


List Of Cells Detected By Internal Cell With Cell Handle 59086, CID 59086, And Cell ID 59086:

I don't understand why the findall regular expression works on one file but not on the other. Any pointers would be greatly appreciated.

The first extraction works correctly:

['CellHandle  Name              RN     CID    UCID        PSC  MaxTxPwr  ModeInUse     RLs', '----------  ----------------  -----  -----  ----------  ---  --------  ------------  ---', '     59086  V87577153            15  59086     8578766  1     24.0dBm  UMTSNodeB       0', '     59087  V87577143            14  59087     8578767  2     24.0dBm  UMTSNodeB       0', '     59088  V87577133            13  59088     8578768  506   24.0dBm  UMTSNodeB       0', '     59089  V87577123            12  59089     8578769  3     24.0dBm  UMTSNodeB       0', '     59090  V87577113            11  59090     8578770  4     24.0dBm  UMTSNodeB       0', '     59091  V87577103            10  59091     8578771  507   24.0dBm  UMTSNodeB       0', '     59092  V87577093             9  59092     8578772  508   24.0dBm  UMTSNodeB       0', '     59093  V87577083             8  59093     8578773  509   24.0dBm  UMTSNodeB       0', '     59094  V87577073             7  59094     8578774  5     24.0dBm  UMTSNodeB       0', '     59095  V87577063             6  59095     8578775  7     24.0dBm  UMTSNodeB       0', '     59096  V87577053             5  59096     8578776  8     24.0dBm  UMTSNodeB       0', '     59097  V87577043             4  59097     8578777  9     24.0dBm  UMTSNodeB       0', '     59098  V87577033             3  59098     8578778  510   24.0dBm  UMTSNodeB       0', '     59099  V87577023             2  59099     8578779  511   24.0dBm  UMTSNodeB       0', '     59100  V87577013             1  59100     8578780  10    24.0dBm  UMTSNodeB       0']

But in the second extraction, '[ok]' is simply not found:

['CellHandle  Name              RN     CellID  ECI         PCI  RefSignalPwr  ModeInUse     RLs', '----------  ----------------  -----  ------  ----------  ---  ------------  ------------  ---', '         1  ZD87577018            1       1      439297    6           -10     LTEeNodeB    0', '         2  ZD87577028            2       2      439298    1           -10     LTEeNodeB    0', '         3  ZD87577038            3       3      439299   39           -10     LTEeNodeB    0', '         4  ZD87577048            4       4      439300   35           -10     LTEeNodeB    0', '         5  ZD87577058            5       5      439301    3           -10     LTEeNodeB    0', '         6  ZD87577068            6       6      439302   46           -10     LTEeNodeB    0', '         7  ZD87577078            7       7      439303   42           -10     LTEeNodeB    0', '         8  ZD87577088            8       8      439304   53           -10     LTEeNodeB    0', '         9  ZD87577098            9       9      439305   22           -10     LTEeNodeB    0', '        10  ZD87577108           10      10      439306   10           -10     LTEeNodeB    0', '        11  ZD87577118           11      11      439307   49           -10     LTEeNodeB    0', '        12  ZD87577128           12      12      439308   54           -10     LTEeNodeB    0', '        13  ZD87577138           13      13      439309   62           -10     LTEeNodeB    0', '        14  ZD87577148           14      14      439310   38           -10     LTEeNodeB    0', '        15  ZD87577158           15      15      439311   58           -10     LTEeNodeB    0', 'DetectedCells ', 'List Of Cells Detected By Internal LTE Cell With Cell Handle 1 And CellID 1:', '-----------------------------------------------------------------------', 'Detected Internal LTE Cells:', '============================', 'Cell Handle  Cell ID     PCI  EUARFCN  DL Bandwidth  UL Bandwidth  RSRP*', '-----------  ----------  ---  
-------  ------------  ------------  -----', '          2      439298    1     2850           100           100    -71', '          3      439299   39     2850           100           100    -83', '          4      439300   35     2850           100           100    -87', '          5      439301    3     2850           100           100    -94', '          6      439302   46     2850           100           100   -118', '          7      439303   42     2850           100           100   -103', '          8      439304   53     2850           100           100   -102', '          9      439305   22     2850           100           100    -99', '         10      439306   10     2850           100           100   -101', '         11      439307   49     2850           100           100   -115', '         12      439308   54     2850           100           100   -123', '         13      439309   62     2850           100           100    -95', '         14      439310   38     2850           100           100    -96', '         15      439311   58     2850           100           100   -110', '* Measured When Detected Internal LTE Cell Was Transmitting At FAPService Maximum refSigPower', 'Detected External LTE Cells:', '============================', 'Cell Handle  Cell ID     PCI  EUARFCN  DL Bandwidth  UL Bandwidth  RSTxPower  RootSequence  PrachCfgIndex  RSRP', '-----------  ----------  ---  -------  ------------  ------------  ---------  ------------  -------------  ----', 'Detected UMTS Cells:', '====================', 'Cell Handle  CID    UCID        PSC  DL UARFCN  PCPICHTxPower  CPICH RSCP', '-----------  -----  ----------  ---  ---------  -------------  ----------', '      67523  41343    46309759    5      10736             28         -61', '      67525  53109    46321525    4      10712             28         -59', '      67540  59100     8578780   10      10687              0        3277', 'Detected GSM Cells:', '===================', 'Cell Handle  
Frequency Band  ARFCN  BSIC  CI     RSSI', '-----------  --------------  -----  ----  -----  ----', '      67547         GSM 900     87    49   9155   -33', '      67554         GSM 900     93    27  18836   -41', '      67557         GSM 900     89    17  21847   -38', '      67558         GSM 900     83    39   9660   -40', '      67559         GSM 900     67     8    616   -29', '      67560         GSM 900     75    50  18765   -38', '--More--        ', 'List Of Cells Detected By Internal LTE Cell With Cell Handle 2 And CellID 2:', '-----------------------------------------------------------------------', 'Detected Internal LTE Cells:', '============================', 'Cell Handle  Cell ID     PCI  EUARFCN  DL Bandwidth  UL Bandwidth  RSRP*', '-----------  ----------  ---  -------  ------------  ------------  -----', '          1      439297    6     2850           100           100    -71', '          3      439299   39     2850           100           100    -79', '          4      439300   35     2850           100           100   -104', '          5      439301    3     2850           100           100    -81', '          6      439302   46     2850           100           100    -99', '          7      439303   42     2850           100           100    -95', '          8      439304   53     2850           100           100    -91', '          9      439305   22     2850           100           100   -108', '         10      439306   10     2850           100           100    -95', '         11      439307   49     2850           100           100   -108', '         12      439308   54     2850           100           100    -95', '         13      439309   62     2850           100           100    -99', '         14      439310   38     2850           100           100   -110', '         15      439311   58     2850           100           100    -89', '* Measured When Detected Internal LTE Cell Was Transmitting At FAPService Maximum refSigPower', 
'Detected External LTE Cells:', '============================', 'Cell Handle  Cell ID     PCI  EUARFCN  DL Bandwidth  UL Bandwidth  RSTxPower  RootSequence  PrachCfgIndex  RSRP', '-----------  ----------  ---  -------  ------------  ------------  ---------  ------------  -------------  ----', '      67519       57354   65     6300            50            50          0             1              0   -61', '      67582       57372   17     2850           100           100         15             1              5  -111', 'Detected UMTS Cells:', '====================', 'Cell Handle  CID    UCID        PSC  DL UARFCN  PCPICHTxPower  CPICH RSCP', '-----------  -----  ----------  ---  ---------  -------------  ----------', '      67521  47492    46315908   69      10736             33         -61', '      67523  41343    46309759    5      10736             28         -58', '      67524  52965    46321381   68      10712             33         -64', '      67525  53109    46321525    4      10712             28         -61', '      67539  59099     8578779  511      10687              0        3277', 'Detected GSM Cells:', '===================', 'Cell Handle  Frequency Band  ARFCN  BSIC  CI     RSSI', '-----------  --------------  -----  ----  -----  ----', '      67549        DCS 1800    555    33   3756   -28', '      67554         GSM 900     93    27  18836   -39', '      67557         GSM 900     89    17  21847   -38', '--More--        Aborted: by user']
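
One thing that may be worth checking: `re.findall` returns one string per match, and `''.join(...)` glues all of them together. The `'DetectedCells '` entry in the output above would be consistent with a second command (something like `show Cell LTE DetectedCells`) also matching the pattern, but that is an assumption, since that part of the log is not shown. The sketch below uses a hypothetical, heavily shortened log and a simplified stand-in for the pattern in the question.

```python
import re

# Hypothetical log: two different commands both begin with "show Cell LTE".
log = (
    "> show Cell LTE \n"
    "row-1\n"
    "[ok][2015-01-17 00:07:02]\n"
    "> show Cell LTE DetectedCells \n"
    "detected-1\n"
    "[ok][2015-01-17 00:07:30]\n"
)

# Simplified stand-in for the question's pattern (same overall shape).
pattern = re.compile(r'show\s+cell\s+lte\s+(.*?)\[ok\]', re.DOTALL | re.I)

matches = pattern.findall(log)           # one captured group per match
joined = ''.join(matches).split('\n')    # concatenates every match

print(len(matches))
print(joined)
print(matches[0].split('\n'))            # only the first command's output
```

If this is what happens with file 2, taking `matches[0]` (or tightening the pattern so the command line must end right after `lte`) would keep the chunks separate.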

1 answer:

Answer 0 (score: 0)

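Here is how I would parse the file. parse() returns a list of parsed tables: each table is a list of dictionaries, and each dictionary represents one row of the file.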

import re

def parse(rows):
  """Return a list of tables; each table is a list of dicts, one per data row."""
  tables = []
  while True:
    r = next(rows, None)
    if r is None: break
    # find the header line
    if r.startswith("CellHandle  Name "):
      next(rows, None)  # skip the line after the header (dashes or blank)
      headers = r.split()
      table = []
      while True:
        r = next(rows, None)
        if r is None: break
        vals = r.split()
        # skip blank lines or the lines with dashes
        if len(vals) == 0 or re.match(r"\A-+\Z", vals[0]): continue
        # stop if first column is not a number
        if not re.match(r"\A\d+\Z", vals[0]): break
        item = dict(zip(headers, vals))
        table.append(item)
      tables.append(table)
  return tables

def test(path):
  # %-formatting keeps the print calls valid on both Python 2 and Python 3
  print("for file %s" % path)
  with open(path, 'r') as h:
    tables = parse(h)
    print("  - tables found: %d" % len(tables))
    for i, t in enumerate(tables):
      print("  * rows in table %d: %d" % (i + 1, len(t)))
      for item in t: print(item)

test("file1")
test("file2")
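
To see the shape of the result without the log files at hand, here is a small self-contained check. It is a sketch, not the answer's exact code: `parse_tables` is a hypothetical, condensed restatement of the same line-by-line idea, and the sample is a two-row excerpt in the style of file 2, including the blank lines between rows.

```python
import re

def parse_tables(lines):
    """Collect each table that follows a 'CellHandle  Name' header as a list of dicts."""
    rows = iter(lines)
    tables = []
    for line in rows:
        if not line.startswith("CellHandle  Name "):
            continue
        headers = line.split()
        table = []
        for line in rows:  # continues from the line after the header
            vals = line.split()
            # Skip blank lines and the dashed separator line.
            if not vals or re.fullmatch(r"-+", vals[0]):
                continue
            # A first column that is not a number ends the table.
            if not re.fullmatch(r"\d+", vals[0]):
                break
            table.append(dict(zip(headers, vals)))
        tables.append(table)
    return tables

sample = """\
> show cell lCell LTE
CellHandle  Name              RN     CellID  ECI         PCI  RefSignalPwr  ModeInUse     RLs

----------  ----------------  -----  ------  ----------  ---  ------------  ------------  ---

     1  ZD87577018            1       1      439297    6           -10     LTEeNodeB    0

     2  ZD87577028            2       2      439298    1           -10     LTEeNodeB    0

[ok][2015-01-17 00:07:02]
""".splitlines()

tables = parse_tables(sample)
print(len(tables), len(tables[0]))
print(tables[0][0]["Name"], tables[0][1]["PCI"])
```

Because the rows are read one line at a time, the extra blank lines in file 2 and the position of `[ok]` no longer matter. All values stay strings, so convert them (for example with `int(row["PCI"])`) where numbers are needed.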