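I have a list of lists containing strings. After a lot of work with various regular expressions, I have inserted what I want to use as a delimiter, @@@, into my strings: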
[['@@@this is part one and here is part two and here is part three and heres more and heres more'],
['this is part one@@@and here is part two and here is part three and heres more and heres more'],
['this is part one and here is part two@@@and here is part three and heres more and heres more'],
['this is part one and here is part two and here is part three@@@and heres more and heres more'],
['this is part one and here is part two and here is part three and heres more@@@and heres more']]
Now I need to come up with this:
[['this is part one'],['and here is part two'],['and here is part three'], ['and heres more'], ['and heres more']]
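So far my attempts have been bloated, hacky, and generally ugly. I find myself splitting, combining, and matching. Can anyone offer some general advice for this kind of problem, and suggest which tools to use to keep it manageable?
EDIT: Please note that "and heres more" really does appear twice in the ideal output!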
Answer 0 (score: 1)
I think what you actually need is to grab all the characters after @@@ up to the next "and" or the end of the string.
>>> import re
>>> # l is the list of lists from the question
>>> [[m] for x in l for m in re.findall(r'@@@(.*?)(?=\sand\b|$)', x[0])]
[['this is part one'], ['and here is part two'], ['and here is part three'], ['and heres more'], ['and heres more']]
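For anyone who wants to run this outside the interpreter, here is a self-contained sketch of the same idea. The names `rows`, `PATTERN`, and `extract_parts` are made up for illustration (the one-liner above calls the list `l`), and the data is copied from the question with the missing commas added.

```python
import re

# Data from the question, bound to a hypothetical name `rows`.
rows = [
    ['@@@this is part one and here is part two and here is part three and heres more and heres more'],
    ['this is part one@@@and here is part two and here is part three and heres more and heres more'],
    ['this is part one and here is part two@@@and here is part three and heres more and heres more'],
    ['this is part one and here is part two and here is part three@@@and heres more and heres more'],
    ['this is part one and here is part two and here is part three and heres more@@@and heres more'],
]

# After the delimiter, capture lazily and stop just before
# "whitespace + and" (lookahead, not consumed) or at the end of the string.
PATTERN = re.compile(r'@@@(.*?)(?=\sand\b|$)')


def extract_parts(rows):
    """Return one single-item list per delimited part, in input order."""
    return [[match] for row in rows for match in PATTERN.findall(row[0])]


parts = extract_parts(rows)
print(parts)
```

This relies on every part ending right before the next " and". If a part could itself contain the word "and", the lazy match would stop early, so you would probably need a second marker at the end of each part instead.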