Python complex regex string expansion

Date: 2013-11-18 00:18:43

Tags: python regex string substring

Suppose I have a string of the following form:

ABCDEF_(0-100;1)(A|B)_GHIJ_(A-F)

I would like to be able to expand it to:

ABCDEF_0A_GHIJ_A
ABCDEF_1A_GHIJ_A
ABCDEF_2A_GHIJ_A
...
ABCDEF_100A_GHIJ_A

ABCDEF_0B_GHIJ_A
ABCDEF_1B_GHIJ_A
ABCDEF_2B_GHIJ_A
...
ABCDEF_100B_GHIJ_A

ABCDEF_0A_GHIJ_B
ABCDEF_1A_GHIJ_B
ABCDEF_2A_GHIJ_B
...
ABCDEF_100A_GHIJ_B

ABCDEF_0B_GHIJ_B
ABCDEF_1B_GHIJ_B
ABCDEF_2B_GHIJ_B
...
ABCDEF_100B_GHIJ_B

ABCDEF_0A_GHIJ_C
ABCDEF_1A_GHIJ_C
ABCDEF_2A_GHIJ_C
...
ABCDEF_100A_GHIJ_C

...and so on

So basically, the string is defined as:

STRING_(START-END;INC)_STRING(A OR B)_STRING(A THRU F)

However, the regex-like groups can appear anywhere in the string. That is, the string could also be:

ABCDEF_(A|B)_(0-100;1)_(A-F)_GHIJ

Here is what I have tried so far:

trend = 'ABCDEF_(0-100;1)(A|B)_GHIJ_(A-F)'

def expandDash(trend):
    dashCount = trend.count("-")
    for dC in range(0, dashCount):
        dashIndex = trend.index("-")-1
        trendRange = trend[dashIndex:]
        bareTrend = trend[0:trend.index("(")]
        beginRange = trendRange[0:trendRange.index("-")]
        endRange = trendRange[trendRange.index("-"):trendRange.index(";")]
        trendIncrement = trendRange[-1]
        expandedTrendList = []


def regexExpand(trend):

    for regexTrend in trend.split(')'):
        if "-" in regexTrend:
            print(trend)
            expandDash(regexTrend)

I'm obviously stuck here...

Is there a simple way to do this kind of string expansion with regex?

1 answer:

Answer 0 (score: 1)

You can easily parse your mini expression language with a regular expression, but you cannot use the regex itself to do the actual expansion:

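import re

# Groups: 1 = literal prefix, 2-3 = character range (A-F),
# 4-6 = numeric range (START-END;INC), 7 = alternatives (A|B), 8 = remainder.
TREND_REGEX = re.compile(r'(^.*?)(?:\((?:([^-)])-([^)])|(\d+)-(\d+);(\d+)|([^)|]+(?:\|[^)|]+)*))\)(.*))?$')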

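def expand(trend):
    # Split the string into a literal prefix, its first (...) group, and the rest.
    m = TREND_REGEX.match(trend)
    # Recursively expand whatever follows the first group.
    if m.group(8):
        suffixes = expand(m.group(8))
    else:
        suffixes = ['']
    if m.group(2):
        # Character range such as (A-F), end inclusive.
        for z in suffixes:
            for i in range(ord(m.group(2)), ord(m.group(3))+1):
                yield m.group(1) + chr(i) + z
    elif m.group(4):
        # Numeric range such as (0-100;1), end inclusive.
        for z in suffixes:
            for i in range(int(m.group(4)), int(m.group(5))+1, int(m.group(6))):
                yield m.group(1) + str(i) + z
    elif m.group(7):
        # Alternatives such as (A|B).
        for z in suffixes:
            for s in m.group(7).split('|'):
                yield m.group(1) + s + z
    else:
        # No group left: the string is purely literal.
        yield trend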
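
For comparison, here is a non-recursive sketch of the same idea, not part of the answer above. It assumes Python 3 and the three group forms shown in the question. The names GROUP_RE, expand_group and expand_product are made up for illustration. It uses re.split to separate the literal text from the groups and itertools.product to combine the choices:

```python
import itertools
import re

# Matches one parenthesised group and captures its body, e.g. "0-100;1".
# (Hypothetical helper name; assumes groups are never nested.)
GROUP_RE = re.compile(r'\(([^)]*)\)')


def expand_group(body):
    """Return the list of strings that one group body stands for."""
    m = re.fullmatch(r'(\d+)-(\d+);(\d+)', body)
    if m:  # numeric range START-END;INC, END inclusive
        start, end, step = map(int, m.groups())
        return [str(i) for i in range(start, end + 1, step)]
    m = re.fullmatch(r'([^-|])-([^-|])', body)
    if m:  # single-character range such as A-F, end inclusive
        return [chr(c) for c in range(ord(m.group(1)), ord(m.group(2)) + 1)]
    return body.split('|')  # alternatives such as A|B


def expand_product(trend):
    """Yield every expansion of trend, leftmost group varying fastest."""
    # With one capturing group, re.split alternates literal text and group bodies.
    parts = GROUP_RE.split(trend)
    literals = parts[0::2]
    choices = [expand_group(body) for body in parts[1::2]]
    # itertools.product varies its last argument fastest, so pass the groups
    # in reverse and flip each combination back to match the question's order.
    for combo in itertools.product(*reversed(choices)):
        values = combo[::-1]
        out = [literals[0]]
        for value, literal in zip(values, literals[1:]):
            out.append(value)
            out.append(literal)
        yield ''.join(out)
```

For example, list(expand_product('X_(A|B)_(0-4;2)')) gives ['X_A_0', 'X_B_0', 'X_A_2', 'X_B_2', 'X_A_4', 'X_B_4']. Keeping the per-group parsing in its own function makes it easier to add another group form later, at the cost of building each group's list of choices up front.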