Writing a regular expression in Python

Date: 2016-09-09 02:59:35

Tags: python regex

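I am weak at writing regular expressions, so I need some help. I need a regular expression that matches SECTION 7.01 and then (a).

Basically, SECTION can be followed by any number, such as 6.1 / 7.1 / 2.1.

Example: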

SECTION 7.01. Events of Default. If any of the following events
("Events of Default") shall occur:
          (a) any Borrower shall fail to pay any principal of any Loan when and
     as the same shall become due and payable, whether at the due date thereof
     or at a date fixed for prepayment thereof or otherwise;

I am trying to write a regular expression that gives me groups containing the following:

Group 1

SECTION 7.01. Events of Default. If any of the following events
("Events of Default") shall occur:

Group 2

(a) any Borrower shall fail to pay any principal of any Loan when and
     as the same shall become due and payable, whether at the due date thereof
     or at a date fixed for prepayment thereof or otherwise;

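There can be more points after (a), such as (b), and so on.

Please help me write a regular expression.

3 answers:

Answer 0 (score: 3)

You can use the following approach, but it makes several assumptions. First, each section header must start with SECTION and end with a colon (:). Second, each subsection must start with a lowercase letter in matching parentheses, such as (a), and end with a semicolon.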

import re

def extract_groups(s):
    # Strip each line and join with single spaces so that wrapped
    # lines do not run together (e.g. "thereof" + "or").
    sanitized_string = ' '.join(line.strip() for line in s.split('\n'))
    # Section header: from SECTION up to the first colon (non-greedy).
    sections = re.findall(r'SECTION.*?:', sanitized_string)
    # Subsection: from "(a)", "(b)", ... up to the first semicolon.
    sub_sections = re.findall(r'\([a-z]\).*?;', sanitized_string)
    return sections, sub_sections

Sample output:

>>> s = """SECTION 7.01. Events of Default. If any of the following events
("Events of Default") shall occur:
          (a) Whether at the due date thereof
     or at a date fixed for prepayment thereof or otherwise;

          (b) Test;
SECTION 7.02. Second section:"""
>>> print(extract_groups(s))
(['SECTION 7.01. Events of Default. If any of the following events ("Events of Default") shall occur:', 'SECTION 7.02. Second section:'], 
['(a) Whether at the due date thereof or at a date fixed for prepayment thereof or otherwise;', '(b) Test;'])
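
The function above returns two flat lists, so it no longer tells you which subsection belongs to which section. If you need that link, here is a minimal sketch that keeps the same assumptions (header ends at the first colon, each subsection ends at a semicolon). The name group_by_section and the stricter header pattern with a number like 7.01 are my own additions, not part of the answer.

```python
import re

def group_by_section(text):
    """Return a list of (header, [subsections]) pairs (hypothetical helper)."""
    # Collapse every run of whitespace, including newlines, to one space.
    flat = ' '.join(text.split())
    # Find where each "SECTION <n>.<n>." starts, then slice between starts.
    starts = [m.start() for m in re.finditer(r'SECTION\s+\d+\.\d+\.', flat)]
    bounds = starts + [len(flat)]
    result = []
    for begin, end in zip(bounds, bounds[1:]):
        chunk = flat[begin:end]
        # Header: from SECTION up to the first colon in this chunk.
        header = re.match(r'SECTION\s+\d+\.\d+\..*?:', chunk)
        if header is None:
            continue  # no colon found, so the assumption does not hold here
        # Subsections: "(a) ...;" items in the rest of the chunk.
        subs = re.findall(r'\([a-z]\).*?;', chunk[header.end():])
        result.append((header.group(0), subs))
    return result

sample = """SECTION 7.01. Events of Default. If any of the following events
("Events of Default") shall occur:
          (a) Whether at the due date thereof
     or at a date fixed for prepayment thereof or otherwise;

          (b) Test;
SECTION 7.02. Second section:"""

result = group_by_section(sample)
for header, subs in result:
    print(header, subs)
```

Note that a cross-reference such as "SECTION 7.02." inside the body text would be treated as a new section start, so this only suits documents where that does not happen.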

Answer 1 (score: 0)

I got this working:

import re

s = """
SECTION 7.01. Events of Default. If any of the following events
("Events of Default") shall occur:
          (a) any Borrower shall fail to pay any principal of any Loan when and
     as the same shall become due and payable, whether at the due date thereof
     or at a date fixed for prepayment thereof or otherwise;
"""

# Group 1: the section header up to the colon.
# Group 2: the "(a)" clause up to the semicolon.
r = r'(SECTION 7\.01\.[\s\w\.()"]*:)[\s]*(\(a\)[\s\w,]*;)'
mo = re.search(r, s)
print('Group 1: ' + mo.group(1))
print('Group 2: ' + mo.group(2))

If you want to make it generic, so that it captures any section number and any subsection letter, you can try:

r = r'(SECTION [1-9]\.[0-9]{2}\.[\s\w\.()"]*:)[\s]*(\([a-z]\)[\s\w,]*;)'
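
One caveat: that pattern expects a single-digit number before the dot and exactly two digits after it, so it would not match a header written as SECTION 6.1. A more lenient variant might look like the sketch below. The names PATTERN and first_clause_per_section are made up for illustration, and the negated character classes are my substitution for the explicit ones above.

```python
import re

# Hypothetical, more lenient pattern: any "<digits>.<digits>" section
# number and any single lowercase letter in parentheses.
PATTERN = re.compile(
    r'(SECTION\s+\d+\.\d+\.[^:]*:)'  # group 1: header up to the first colon
    r'\s*'                           # whitespace between header and clause
    r'(\([a-z]\)[^;]*;)'             # group 2: clause up to the first semicolon
)

def first_clause_per_section(text):
    """Return (header, first clause) for every section found in text."""
    return [(m.group(1), m.group(2)) for m in PATTERN.finditer(text)]

text = 'SECTION 6.1. Title:\n  (a) first clause;\nSECTION 12.10. Other:\n (c) x;'
pairs = first_clause_per_section(text)
print(pairs)
```

Like the pattern it is based on, this only captures the clause that directly follows each header, not every later clause in the section.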

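Answer 2 (score: 0)

To help you learn, in case you have to write another set of regular expressions, I suggest you look at the documentation below: https://docs.python.org/3/howto/regex.html#regex-howto

It is the "easy" introduction to Python regular expressions. Essentially, you define a pattern, using the link above as a reference to build it up as needed. Then you call the pattern to apply it to whatever text needs processing.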
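
As a rough illustration of that define-then-apply workflow, here is a small sketch using re.compile and the compiled pattern's search method. The variable name section_number and the sample string are only examples.

```python
import re

# Step 1: define the pattern. re.VERBOSE lets you put whitespace and
# comments inside the pattern to document each part.
section_number = re.compile(r"""
    SECTION \s+      # the literal keyword, then whitespace
    (\d+) \. (\d+)   # the digits before and after the dot
""", re.VERBOSE)

# Step 2: apply the compiled pattern to the text you want to process.
match = section_number.search('SECTION 7.01. Events of Default.')
if match:
    print(match.group(1), match.group(2))
```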