Python regex with multiple groups

Time: 2016-11-07 17:15:15

Tags: regex python-3.x grouping

Todo: use a regular expression to break down the drives string

drives = "8:20-24,30,31,32,10:20-24,30,31,32"

The final output should look like this:

formatted_drives = [{8: [20,21,22,23,24,30,31,32]}, {10: [20,21,22,23,24,30,31,32]}]

This is what the regular expression currently looks like:

    regex_static_multiple_with_singles = re.match(r"""
    (?P<enc>\d{1,3}):       # Enclosure ID:
    (?P<start>\d+)          # Drive Start
    -                       # Range -
    (?P<end>\d+)            # Drive End
    (?P<singles>,\d+)+      # Drive Singles - todo resolve issue here
    """, drives, (re.IGNORECASE | re.VERBOSE))

And this is what it returns:

[DEBUG  ] All Drive Sequences: ['8:20-24,30,31,32', '10:20-24,30,31,32']
[DEBUG  ] Enclosure ID  : 8
[DEBUG  ] Drive Start   : 20
[DEBUG  ] Drive End     : 24
[DEBUG  ] Drive List    : [20, 21, 22, 23, 24]
[DEBUG  ] Drive Singles : ,32
[DEBUG  ] Enclosure ID  : 10
[DEBUG  ] Drive Start   : 20
[DEBUG  ] Drive End     : 24
[DEBUG  ] Drive List    : [20, 21, 22, 23, 24]
[DEBUG  ] Drive Singles : ,32

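The problem is that Drive Singles returns only the last group. In this case there are 3 single drives, but the number is variable. What is the best way to return all of the single drives?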
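The behavior described above can be reproduced in isolation. The sketch below uses a compact, non-verbose form of the same pattern (assumed equivalent to the one shown earlier) on a single drive sequence. Python's `re` documentation notes that a group matched multiple times keeps only its last match, which is consistent with the debug output.

```python
import re

# Compact form of the pattern from the question (written without re.VERBOSE).
pattern = r"(?P<enc>\d{1,3}):(?P<start>\d+)-(?P<end>\d+)(?P<singles>,\d+)+"
m = re.match(pattern, "8:20-24,30,31,32")

# The repeated group is overwritten on every repetition,
# so only the final ",32" is left once the match completes.
singles = m.group("singles")
print(singles)  # ,32
```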

1 Answer:

Answer 0: (score: 1)

Try this:

line = "8:20-24,30,31,32,10:21-24,30,31,32,15:11,12,13-14,16-18"
regex = r"(\d+):((?:\d+[-,]|\d+$)+)"
  

The regex above splits the line into one chunk per enclosure. Here we get 3 matches:

  1. 8:20-24,30,31,32,
  2. 10:21-24,30,31,32,
  3. 15:11,12,13-14,16-18

A second regex, regex2, then splits the drive list of each match into segments:

    regex2 = r"\d+-\d+|\d+"

For match 1, the segments are:

     a)20-24
     b)30
     c)31
     d)32
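The two stages above can be checked separately before combining them. This is a minimal sketch using the same `regex`, `regex2`, and `line` as in this answer; the variable names `matches`, `chunks`, and `segments` are only for illustration. Note that `regex2` is applied to `group(2)` (the drive list), not to the whole match, otherwise the enclosure ID would be picked up as a drive.

```python
import re

regex = r"(\d+):((?:\d+[-,]|\d+$)+)"
regex2 = r"\d+-\d+|\d+"
line = "8:20-24,30,31,32,10:21-24,30,31,32,15:11,12,13-14,16-18"

# Stage 1: one match per enclosure chunk.
matches = list(re.finditer(regex, line))
chunks = [m.group(0) for m in matches]
print(chunks)

# Stage 2: split the drive list of the first chunk into ranges and singles.
segments = re.findall(regex2, matches[0].group(2))
print(segments)  # ['20-24', '30', '31', '32']
```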
    

The rest is simple and self-explanatory in the following code:

    #!/usr/bin/env python3
    import re

    regex = r"(\d+):((?:\d+[-,]|\d+$)+)"   # one chunk per enclosure: "<id>:<drive list>"
    regex2 = r"\d+-\d+|\d+"                # one segment: a range "a-b" or a single drive
    line = "8:20-24,30,31,32,10:21-24,30,31,32,15:11,12,13-14,16-18"

    d = {}

    for match in re.finditer(regex, line, re.MULTILINE):
        key = int(match.group(1))          # enclosure ID
        drives = d.setdefault(key, [])     # create the list on first use
        for m in re.finditer(regex2, match.group(2)):
            if '-' in m.group():
                # expand an inclusive range such as "20-24"
                start, end = m.group().split('-')
                drives.extend(range(int(start), int(end) + 1))
            else:
                drives.append(int(m.group()))
    print(d)
    


Sample output:

    {8: [20, 21, 22, 23, 24, 30, 31, 32], 10: [21, 22, 23, 24, 30, 31, 32], 15: [11, 12, 13, 14, 16, 17, 18]}
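The output above is a single dict, while the question asks for a list of one-key dicts. A more compact variant producing that shape might look like the sketch below. `parse_drives`, `CHUNK_RE`, and `SEGMENT_RE` are hypothetical names introduced here for illustration; the chunk pattern is the one from this answer, and the segment pattern swaps the `'-' in ...` check for an optional second capture group.

```python
import re

# Chunk pattern taken from the answer: "<enclosure>:<drive list>".
CHUNK_RE = re.compile(r"(\d+):((?:\d+[-,]|\d+$)+)")
# Assumed alternative to regex2: a start value with an optional "-end" part.
SEGMENT_RE = re.compile(r"(\d+)(?:-(\d+))?")


def parse_drives(text):
    """Return a list of {enclosure: [drives]} dicts, one per enclosure chunk."""
    result = []
    for chunk in CHUNK_RE.finditer(text):
        enclosure = int(chunk.group(1))
        drives = []
        for seg in SEGMENT_RE.finditer(chunk.group(2)):
            start = int(seg.group(1))
            # A single drive is treated as a range of length one.
            end = int(seg.group(2)) if seg.group(2) else start
            drives.extend(range(start, end + 1))
        result.append({enclosure: drives})
    return result


formatted_drives = parse_drives("8:20-24,30,31,32,10:20-24,30,31,32")
print(formatted_drives)
```

With the `drives` string from the question, this yields one dict per enclosure in input order. It does not validate the input, so malformed sequences are skipped silently rather than reported.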