I'm new to both Python and regular expressions, so please bear with me. I have some text that looks like this:
Change 421387 on 2011/09/20 by person@domain.com Some random text including line breaks Change 421388 on 2011/09/20 by person2@domain.com Some other random text including line breaks
Now I'd like to use Python and a regular expression to split it into a tuple. In the end, I want the tuple to contain two elements.
Element 0:
Change 421387 on 2011/09/20 by person@domain.com Some random text including line breaks
Element 1:
Change 421388 on 2011/09/20 by person2@domain.com Some other random text including line breaks
I realize I can use a regular expression to identify the pattern formed by:
@
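I know it could be broken down further, but I think identifying these pieces is enough for my purposes.
Once I've come up with a regular expression for that pattern, how can I use it to split the string into a tuple of strings?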
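Answer 0 (score: 4)
Use a lookahead assertion. The (?=...) part checks that a "Change <number> on <year>" header follows without consuming it, and the trailing \s+ consumes only the whitespace that separates the records.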
>>> import re
>>> re.split(r'(?=\s+Change \d+ on \d{4})\s+', ''' Change 421387 on 2011/09/20 by person@domain.com
...  Some random text including line breaks
...  Change 421388 on 2011/09/20 by person2@domain.com
...  Some other random text including line breaks''')
['', 'Change 421387 on 2011/09/20 by person@domain.com\n Some random text including line breaks', 'Change 421388 on 2011/09/20 by person2@domain.com\n Some other random text including line breaks']
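Note that re.split returns a list, and that the list starts with an empty string whenever the input begins with whitespace. To get the two-element tuple the question asks for, one option is to drop the empty pieces and convert the result. The following is only a sketch: the helper name split_changes and the sample text are made up for illustration, and the pattern is the one from the answer.

```python
import re

# Same pattern as in the answer: split on whitespace that is
# immediately followed by a "Change <number> on <year>" header.
CHANGE_BOUNDARY = re.compile(r'(?=\s+Change \d+ on \d{4})\s+')


def split_changes(text):
    """Return a tuple with one string per 'Change ...' record.

    split_changes is a hypothetical helper name, not part of the re module.
    """
    parts = CHANGE_BOUNDARY.split(text)
    # Leading whitespace in the input yields an empty first piece; skip it.
    return tuple(part for part in parts if part)


# Illustrative sample modelled on the text in the question.
sample = (
    "Change 421387 on 2011/09/20 by person@domain.com\n"
    "Some random text\nincluding line breaks\n"
    "Change 421388 on 2011/09/20 by person2@domain.com\n"
    "Some other random text\nincluding line breaks"
)

changes = split_changes(sample)
for index, change in enumerate(changes):
    print(index, repr(change))
```

One caveat with this approach: if the free text of a record happens to contain something like " Change 123 on 2011", the pattern will split there too. Tightening the lookahead (for example by also matching the full date and the "by" keyword) would reduce that risk.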