Regex to identify a range in a string

Date: 2013-12-04 06:45:18

Tags: python regex list parsing

How do I write a regular expression that gets a list of numbers from a string? Given the strings:

value = '88-94'
value = '88 to 94'
value = '88'
value = '88-94, 96-108'

the results should be:

[88, 89, 90, 91, 92, 93, 94]
[88, 89, 90, 91, 92, 93, 94]
[88]
[88, 89, 90, 91, 92, 93, 94, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108]

The programming language is Python 2.7.

Here is a working solution with Python 2.7 and the regex module, but the last case, a single value, has to be checked as a separate case:

>>> import regex
>>> m = regex.match(r"(?:(?P<digits>\d+).(?P<digits>\d+))", "88-94")
>>> a = m.captures("digits")
>>> a
['88', '94']
>>> m = regex.match(r"(?:(?P<digits>\d+).(?P<digits>\d+))", "88 94")
>>> a = m.captures("digits")
>>> a
['88', '94']
>>> range(int(a[0]), int(a[1])+1)
[88, 89, 90, 91, 92, 93, 94]
>>> 
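
One way to avoid the separate single-value check is to make the second number optional in the pattern. The following is only a sketch using the standard-library re module rather than regex; the names RANGE_RE and expand are made up for illustration, and it assumes the string holds exactly one range or one number. It is written to run on both Python 2.7 and Python 3, hence the explicit list(...).

```python
import re

# Hypothetical pattern: a start number, then optionally any run of
# non-digits ('-', ' ', ' to ') followed by a stop number.
RANGE_RE = re.compile(r"(?P<start>\d+)(?:\D+(?P<stop>\d+))?$")


def expand(value):
    """Expand '88-94', '88 to 94' or '88' into a list of ints."""
    m = RANGE_RE.match(value)
    if m is None:
        raise ValueError("not a number or a range: %r" % (value,))
    start = int(m.group("start"))
    # When the optional group did not take part, group("stop") is None,
    # so a single value expands to a one-element list.
    stop = int(m.group("stop")) if m.group("stop") is not None else start
    return list(range(start, stop + 1))
```

With this sketch, expand('88') returns [88] and expand('88 to 94') returns [88, 89, 90, 91, 92, 93, 94].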

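Here is a solution that handles the cases above, but not inputs with several ranges such as 88-94, 96-98, because it only looks at the first and last number: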

>>> import re
>>> a = map(int, re.findall(r'\d+', '88-94'))
>>> range(a[0], a[-1]+1)
[88, 89, 90, 91, 92, 93, 94]
>>> a = map(int, re.findall(r'\d+', '88 94'))
>>> range(a[0], a[-1]+1)
[88, 89, 90, 91, 92, 93, 94]
>>> a = map(int, re.findall(r'\d+', '88'))
>>> range(a[0], a[-1]+1)
[88]
>>> 

A solution that covers almost all cases:

>>> import re
>>> a = map(int, re.findall(r'\d+', '88-94, 96-108'))
>>> c = zip(a[::2], a[1::2])
>>> [m for k in [range(i,j+1) for i, j in c] for m in k]
[88, 89, 90, 91, 92, 93, 94, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108]
>>> a = map(int, re.findall(r'\d+', '88-94, 96-108, 125 129'))
>>> c = zip(a[::2], a[1::2])
>>> [m for k in [range(i,j+1) for i, j in c] for m in k]
[88, 89, 90, 91, 92, 93, 94, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 125, 126, 127, 128, 129]
>>> a = map(int, re.findall(r'\d+', '88-94, 96-108, 125 129, 132 to 136'))
>>> c = zip(a[::2], a[1::2])
>>> [m for k in [range(i,j+1) for i, j in c] for m in k]
[88, 89, 90, 91, 92, 93, 94, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 125, 126, 127, 128, 129, 132, 133, 134, 135, 136]
>>> 
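
The pairing with zip(a[::2], a[1::2]) assumes every item has two numbers, so a lone value mixed in with ranges (for example '88, 90-92') would be paired with the wrong neighbour. A possible sketch that keeps everything in one pattern is shown here; ITEM_RE and expand_all are hypothetical names, and it assumes items are separated by commas while '-', 'to', or plain whitespace joins the two ends of a range.

```python
import re

# Hypothetical pattern: one number, optionally followed by a range
# separator ('-', ' to ', or whitespace) and a second number.
ITEM_RE = re.compile(r"(\d+)(?:(?:\s*-\s*|\s+to\s+|\s+)(\d+))?")


def expand_all(text):
    """Expand every number or range found in text into one flat list."""
    result = []
    for m in ITEM_RE.finditer(text):
        start = int(m.group(1))
        # group(2) is None for a lone value, which then expands to itself.
        stop = int(m.group(2)) if m.group(2) is not None else start
        result.extend(range(start, stop + 1))
    return result
```

For example, expand_all('88, 90-92') gives [88, 90, 91, 92], and expand_all('88') gives [88], so the single-value case no longer needs special handling.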

Can anyone suggest the reason for the downvotes or close votes?

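Any help would be appreciated. Can anyone suggest how to improve the question? I am not asking for alternative solutions, since I know how to split and loop, or even use re to pull out the digits and loop. My question is how to do this in a single re statement, if that is possible. The answers may be off-topic, but the question is not.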

2 answers:

Answer 0: (score: 1)

start, stop = map(int, mystring.split("-"))  # handles only the 'A-B' form
range(start, stop + 1)  # +1 so the end value is included

No regex needed.
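
A slightly fuller split-based sketch, for comparison with the regex approaches in the question. It assumes the end value should be included, as in the question's expected output, that items are separated by commas, and that '-', 'to', or whitespace separates the two ends of a range. The name expand_ranges is made up for illustration, and the code is written to run on both Python 2.7 and Python 3.

```python
def expand_ranges(text):
    """Expand a string such as '88-94, 96-108' without using regex."""
    numbers = []
    for part in text.split(","):
        # Turn '-' and 'to' into spaces so that one whitespace split
        # yields either one token ('88') or two ('88', '94').
        tokens = part.replace("to", " ").replace("-", " ").split()
        if not tokens:
            continue  # skip empty items, e.g. after a trailing comma
        start, stop = int(tokens[0]), int(tokens[-1])
        numbers.extend(range(start, stop + 1))
    return numbers
```

Here expand_ranges('88') returns [88] and expand_ranges('88 to 94') returns [88, 89, 90, 91, 92, 93, 94].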

Answer 1: (score: 1)

import re

def get_numbers(value):
    value = re.sub(r'^(\d+)$', r'\1-\1', value) # '88' -> '88-88'
    start, stop = map(int, re.findall(r'\d+', value))
    return range(start, stop+1)

print get_numbers('88-94')
print get_numbers('88 to 94')
print get_numbers('88')

Output:

[88, 89, 90, 91, 92, 93, 94]
[88, 89, 90, 91, 92, 93, 94]
[88]