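Python - split on commas, skipping content inside parentheses

Time: 2015-11-04 16:40:00

Tags: python regex replace

I need to split a string on commas, but I run into trouble with a case like this: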

TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD

I want to split it and get:

var[0] = "TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME))"
var[1] = "SECOND"
var[2] = "THIRD"

Thanks

4 answers:

Answer 0 (score: 3)

Here is a very simple parser approach that works for your example:

def top_level_split(s):
    """
    Split `s` by top-level commas only. Commas within parentheses are ignored.
    """

    # Parse the string tracking whether the current character is within
    # parentheses.
    balance = 0
    parts = []
    part = ''

    for c in s:
        part += c
        if c == '(':
            balance += 1
        elif c == ')':
            balance -= 1
        elif c == ',' and balance == 0:
            parts.append(part[:-1].strip())
            part = ''

    # Capture last part
    if len(part):
        parts.append(part.strip())

    return parts

my_list = top_level_split("TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD")
print(my_list)

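Answer 1 (score: 3)

You can use this regex, which is based on a negative lookahead:

,(?!(?:[^(]*\([^)]*\))*[^()]*\))

This regex matches a comma only when an assertion confirms that the comma is not inside parentheses. The check is a negative lookahead that first consumes any matched () pairs and then looks for a closing ). It assumes the parentheses are balanced and unescaped.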


Code:

>>> import re
>>> s = 'TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD'
>>> print(re.split(r',(?!(?:[^(]*\([^)]*\))*[^()]*\))', s))
['TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME))', ' SECOND ', ' THIRD']

Or:

>>> s = 'TEXT EXAMPLE (THIS, IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD'
>>> print(re.split(r',(?!(?:[^(]*\([^)]*\))*[^()]*\))', s))
['TEXT EXAMPLE (THIS, IS (A EXAMPLE, BUT NOT WORKS, FOR ME))', ' SECOND ', ' THIRD']
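
If the pattern is hard to read, the same regex can be spelled out with `re.VERBOSE` so that each piece carries a comment. This is only a sketch for Python 3; the names `TOP_LEVEL_COMMA` and `split_top_level` are made up for illustration and are not part of the answer above.

```python
import re

# Same pattern as in the answer, written out piece by piece.
# TOP_LEVEL_COMMA is a hypothetical name used only in this sketch.
TOP_LEVEL_COMMA = re.compile(r"""
    ,                   # a comma ...
    (?!                 # ... that is NOT followed by:
        (?:             #   zero or more chunks of the form
            [^(]*       #     text without an opening parenthesis
            \( [^)]* \) #     an opening parenthesis up to the next closing one
        )*
        [^()]*          #   then text with no parentheses at all
        \)              #   and a closing parenthesis left unmatched
    )
""", re.VERBOSE)


def split_top_level(s):
    """Split on the commas the lookahead treats as top-level, trimming spaces."""
    return [piece.strip() for piece in TOP_LEVEL_COMMA.split(s)]


print(split_top_level(
    "TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD"
))
```

The lookahead appears to cope only with inputs shaped like the question's example. For instance, `split_top_level("a, (b (c) d), e")` leaves the first comma unsplit, because the lookahead pairs up the inner `(c)` and then runs into the outer `)`. For arbitrary nesting, a character-by-character parser is likely the safer choice.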

Answer 2 (score: 1)

Thanks to jonrsharpe

import re

text = "TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD"
array = re.split(r',(?!.*\))', text)
for item in array:
    # Print each item with the surrounding spaces removed
    print(item.strip(" "))

Result:

TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME))
SECOND
THIRD
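
Note that `,(?!.*\))` splits on a comma only when no `)` appears anywhere later in the string, so it depends on all parenthesised text coming before the top-level commas. A small Python 3 check illustrates that assumption; the second input is made up for this sketch.

```python
import re

pattern = r',(?!.*\))'

# The question's example: every parenthesis comes before the last two
# commas, so the split works as intended.
text = "TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD"
first = [item.strip() for item in re.split(pattern, text)]
print(first)

# Hypothetical input with parentheses in the last field: the comma after
# "FIRST" is followed by a ")" further on, so it is not split at all.
other = "FIRST, SECOND (A, B)"
second = [item.strip() for item in re.split(pattern, other)]
print(second)
```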

Answer 3 (score: -2)

You can use rsplit:

l1 = "TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD".rsplit(",", 2)

for line in l1:
    # strip() drops the spaces left around each piece
    print(line.strip())

TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME))
SECOND
THIRD
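
This works only when the number of trailing fields is known in advance and none of them contains a comma. A short Python 3 sketch shows what happens otherwise; the `longer` input is made up for illustration.

```python
text = "TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD"

# maxsplit=2 splits at the two right-most commas only.
parts = [p.strip() for p in text.rsplit(",", 2)]
print(parts)

# Hypothetical input with one more top-level field: the same call now
# leaves "SECOND" attached to the first part.
longer = text + ", FOURTH"
longer_parts = [p.strip() for p in longer.rsplit(",", 2)]
print(longer_parts)
```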