Python: split on 'and' and 'or', but not inside parentheses

Time: 2017-08-01 05:34:21

Tags: python regex

I have the following string:

(some text) or ((other text) and (some more text)) and (still more text)

I would like a Python regular expression that splits it into:

['(some text)', '((other text) and (some more text))', '(still more text)']

I tried this, but it doesn't work:

import re

haystack = "(some text) or ((other text) and (some more text)) and (still more text)"
re.split('(or|and)(?![^(]*.\))', haystack) # no worky

Any help is appreciated.

4 answers:

Answer 0 (score: 2)

This solution works for arbitrarily nested parentheses, which a regular expression cannot handle (s is the original string):

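from pyparsing import nestedExpr

s = "(some text) or ((other text) and (some more text)) and (still more text)"

def lst_to_parens(elt):
    # Rebuild the parenthesized text from a (possibly nested) list of tokens.
    if isinstance(elt, list):
        return '(' + ' '.join(lst_to_parens(e) for e in elt) + ')'
    else:
        return elt

# Wrap s in one extra pair of parentheses so the whole string parses as a single nested expression.
split = nestedExpr('(', ')').parseString('(' + s + ')').asList()
# Keep only the top-level groups (lists); bare strings such as 'or' and 'and' are dropped.
split_lists = [elt for elt in split[0] if isinstance(elt, list)]
print([lst_to_parens(elt) for elt in split_lists])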

Output:

['(some text)', '((other text) and (some more text))', '(still more text)']

For the OP's real test case:

s = "(substringof('needle',name)) or ((role eq 'needle') and (substringof('needle',email))) or (job eq 'needle') or (office eq 'needle')"

Output:

["(substringof ('needle' ,name))", "((role eq 'needle') and (substringof ('needle' ,email)))", "(job eq 'needle')", "(office eq 'needle')"]
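
If pulling in pyparsing is not an option, the same idea (track the nesting depth and split only at depth 0) can be sketched with the standard library alone. The helper name split_top_level is hypothetical, and the sketch assumes the parentheses are balanced and that quoted values never contain parentheses or the words and/or:

```python
import re

def split_top_level(text):
    """Split text on 'and'/'or' that appear outside all parentheses."""
    parts = []
    depth = 0   # current parenthesis nesting level
    start = 0   # index where the current piece begins
    # Scan parentheses and whole-word operators in one left-to-right pass.
    for m in re.finditer(r"[()]|\b(?:and|or)\b", text):
        tok = m.group()
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
        elif depth == 0:
            # A top-level operator: close the current piece here.
            parts.append(text[start:m.start()].strip())
            start = m.end()
    parts.append(text[start:].strip())
    return parts

s = "(some text) or ((other text) and (some more text)) and (still more text)"
print(split_top_level(s))
# ['(some text)', '((other text) and (some more text))', '(still more text)']
```

Unlike the pyparsing version, this keeps each piece as a verbatim slice of the input, so no spaces are inserted around inner parentheses or commas.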

Answer 1 (score: 1)

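I would use re.findall instead of re.split. Note that this only handles a limited nesting depth: the pattern allows at most two levels of parentheses nested inside the outermost pair.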

>>> import re
>>> s = '(some text) or ((other text) and (some more text)) and (still more text)'
>>> re.findall(r'\((?:\((?:\([^()]*\)|[^()]*)*\)|[^()])*\)', s)
['(some text)', '((other text) and (some more text))', '(still more text)']
>>> 
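
One way to see the depth limit is to feed the pattern strings of increasing nesting. This is only an illustrative check; the test strings and variable names below are made up for the example:

```python
import re

# The pattern from the answer above, unchanged.
pattern = r'\((?:\((?:\([^()]*\)|[^()]*)*\)|[^()])*\)'

shallow = '(a (b (c)))'      # two levels nested inside the outer pair
deeper = '(a (b (c (d))))'   # three levels nested inside the outer pair

# Within the supported depth, the whole group comes back as one match.
print(re.findall(pattern, shallow) == [shallow])  # True

# Beyond it, the outermost group is no longer returned as a single match.
print(re.findall(pattern, deeper) == [deeper])    # False
```

Supporting one more level would mean nesting the pattern once more by hand, which is why a parser or a depth counter is the safer choice when the depth is not known in advance.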

Answer 2 (score: 1)

You can also try:

import re
s = '(some text) or ((other text) and (some more text)) and (still more text)'
find_string = re.findall(r'[(]{2}[a-z\s()]*[)]{2}|[(][a-z\s]*[)]', s)
print(find_string)

Output:

['(some text)', '((other text) and (some more text))', '(still more text)']

Edit:

find_string = re.findall(r'[(\s]{2}[a-z\s()]*[)\s]{2}|[(][a-z\s]*[)]', s)
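
Both patterns in this answer only admit lowercase letters, whitespace and parentheses between the delimiters, so they fit the sample string but not text containing quotes, digits or uppercase letters. A small check, using the first pattern and one fragment of the real test case quoted in answer 0:

```python
import re

# First pattern from this answer, unchanged.
pattern = r'[(]{2}[a-z\s()]*[)]{2}|[(][a-z\s]*[)]'

sample = '(some text) or ((other text) and (some more text)) and (still more text)'
print(re.findall(pattern, sample))
# ['(some text)', '((other text) and (some more text))', '(still more text)']

# The quote characters are outside [a-z\s], so nothing matches here.
print(re.findall(pattern, "(role eq 'needle')"))
# []
```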

Answer 3 (score: 0)

You can try this: re.split('[a-f]+', '0a3B9', flags=re.IGNORECASE)
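
For reference, this call splits on runs of the letters a to f, ignoring case, and does not take parentheses into account. A runnable version (the variable name parts is only for illustration):

```python
import re

# Split wherever one or more of the letters a-f occur, case-insensitively.
parts = re.split('[a-f]+', '0a3B9', flags=re.IGNORECASE)
print(parts)  # ['0', '3', '9']
```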