How can I use a regex to parse a multi-line text block into a dict?

Time: 2017-04-16 14:22:46

Tags: python regex parsing dictionary

I have this multi-line text:

1. fef w fwe fwe
fewfa 2. fwa f
fwefwfw gw
2 2f 23. f
g gegwg
32. gre34 g3 1. gr
egsg

I want to use the number at the start of each line as the key (using . as the separator character).
The result must be a dict.

1 answer:

Answer 0: (score: 2)

You can use this regex:

/^(\d+)\.?\s+(.*?)(?=(?:^\d+\.?)|\Z)/gms

 ^                                       assert start of line
    ^                                    capture 1 or more digits
       ^                                 optional literal . 
           ^                             one or more spaces
               ^                         every character including \n  
                    ^                    lookahead to next block start or end                                 
                                     ^   flags M for multiline and S to have 
                                         dot match all     
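
As a rough sketch of how the annotated pieces fit together in Python, the same pattern can be written with re.VERBOSE so that each part carries its own comment. The name BLOCK_RE and the sample string are assumptions made for illustration; the sample is inferred from the question and the output shown in this answer.

```python
import re

# Same pattern as above, spelled out with re.VERBOSE (re.X) so that
# whitespace and "#" comments inside the pattern are ignored.
BLOCK_RE = re.compile(
    r"""
    ^(\d+)                # group 1: one or more digits at the start of a line
    \.?                   # optional literal dot
    \s+                   # one or more whitespace characters
    (.*?)                 # group 2: block body, lazy, may span lines
    (?=(?:^\d+\.?)|\Z)    # stop before the next numbered line or at end of text
    """,
    re.M | re.S | re.X,
)

# Sample text inferred from the question (assumed line breaks).
sample = (
    "1. fef w fwe fwe\n"
    "fewfa 2. fwa f\n"
    "fwefwfw gw\n"
    "2 2f 23. f\n"
    "g gegwg\n"
    "32. gre34 g3 1. gr\n"
    "egsg"
)

for m in BLOCK_RE.finditer(sample):
    # group(1) is the key, group(2) is the raw block text
    print(m.group(1), repr(m.group(2)))
```

Note that "2. fwa f" and "1. gr" are not treated as new blocks, because the digits there do not sit at the start of a line.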

Then you can build the dict like this:

>>> import re
>>> # s holds the multi-line text from the question
>>> dict(re.findall(r'^(\d+)\.?\s+(.*?)(?=(?:^\d+\.?)|\Z)', s, re.M|re.S))
{'1': 'fef w fwe fwe\nfewfa 2. fwa f\nfwefwfw gw\n', '32': 'gre34 g3 1. gr\negsg', '2': '2f 23. f\ng gegwg\n'}
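
If integer keys and values without the trailing newline are preferable, a small wrapper along these lines might do. This is only a sketch: parse_numbered_blocks is a hypothetical helper name, and the sample text is inferred from the output above rather than taken verbatim from the question.

```python
import re


def parse_numbered_blocks(text):
    """Hypothetical helper: map each leading number to its block of text."""
    pairs = re.findall(
        r'^(\d+)\.?\s+(.*?)(?=(?:^\d+\.?)|\Z)', text, re.M | re.S
    )
    # Convert keys to int and strip surrounding whitespace from each block.
    # If a number appears at the start of a line twice, the later block
    # overwrites the earlier one.
    return {int(num): body.strip() for num, body in pairs}


# Sample text inferred from the output above (assumed line breaks).
s = """1. fef w fwe fwe
fewfa 2. fwa f
fwefwfw gw
2 2f 23. f
g gegwg
32. gre34 g3 1. gr
egsg"""

result = parse_numbered_blocks(s)
print(result)
```

Whether to strip the values or keep the keys as strings depends on how the dict is used later; the one-liner above keeps both exactly as matched.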