Splitting and combining text based on a delimiter in Python

Time: 2015-03-12 01:26:45

Tags: python string list text

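I have a list of lists containing strings. After heavy use of various regular expressions, I have inserted the marker I want to use as a delimiter, @@@, into my strings: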

[['@@@this is part one and here is part two and here is part three and heres more and heres more'],
 ['this is part one@@@and here is part two and here is part three and heres more and heres more'],
 ['this is part one and here is part two@@@and here is part three and heres more and heres more'],
 ['this is part one and here is part two and here is part three@@@and heres more and heres more'],
 ['this is part one and here is part two and here is part three and heres more@@@and heres more']]

Now I need to produce this:

[['this is part one'],['and here is part two'],['and here is part three'], ['and heres more'], ['and heres more']]  

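So far my attempts have been bloated, hacky, and generally ugly. I find myself splitting, combining, and matching over and over. Can anyone offer some general advice for this kind of problem, and suggest which tools to use to keep it manageable?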

EDIT: Please note! "and heres more" really does appear twice in the desired output!

1 answer:

Answer 0 (score: 1):

I think what you actually need is to capture all the characters after @@@ up to the next "and" or the end of the string.

>>> import re  # l is the list of lists from the question
>>> [[m] for x in l for m in re.findall(r'@@@(.*?)(?=\sand\b|$)', x[0])]
[['this is part one'], ['and here is part two'], ['and here is part three'], ['and heres more'], ['and heres more']]
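For anyone who wants to run this outside the interactive prompt, here is a minimal self-contained sketch of the same idea. The names rows, PART_AFTER_MARKER, and extract_parts are illustrative and not from the original post, and the sample data assumes the two commas missing from the question's list are restored.

```python
import re

# Sample data from the question (assumes the missing commas are restored).
rows = [
    ['@@@this is part one and here is part two and here is part three and heres more and heres more'],
    ['this is part one@@@and here is part two and here is part three and heres more and heres more'],
    ['this is part one and here is part two@@@and here is part three and heres more and heres more'],
    ['this is part one and here is part two and here is part three@@@and heres more and heres more'],
    ['this is part one and here is part two and here is part three and heres more@@@and heres more'],
]

# @@@            the literal marker
# (.*?)          lazily capture as few characters as possible
# (?=\sand\b|$)  stop just before whitespace + "and", or at the end of the string
PART_AFTER_MARKER = re.compile(r'@@@(.*?)(?=\sand\b|$)')


def extract_parts(rows):
    """Return one single-item list per marked segment, in input order."""
    return [
        [part]
        for row in rows
        for part in PART_AFTER_MARKER.findall(row[0])
    ]


print(extract_parts(rows))
```

One caveat with this approach: the lookahead treats every standalone "and" as a boundary, so a segment that itself contains " and " (for example "salt and pepper") would be cut short. If that can happen in your data, you would need a different end-of-segment rule.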