Find text between two string patterns

Time: 2018-06-18 07:36:15

Tags: python regex

I have log text like this.

data = '''
================================================================================
Annotation file:
/b/cd-builder/sandboxes/sb_DEV_COMMON_BRANCH-_act-builder-4977/annofile-4977TB53661_cd-builder.xml 

Build ID: 27951 on Host bng-emake-5a.juniper.net, Cluster Manager: bng-ea-cm-02:8030
Start Time: Sun Jun 17 23:57:30 2018


Job ID: J00002adde8677420 , Exit Value 1
CWD: /b/cd-builder/sandboxes/sb_DEV_COMMON_BRANCH-_act-builder-4977/obj/bsd11/amd64/junos/usr.sbin/rpd/bgp/lib/proto
Node: bng-ea-agent-14a-3  Start: 2018-06-18 00:20:55.488609 (1405.488609)
                          End:   2018-06-18 00:20:56.653945 (1406.653945)

Command:

export BMAKELOCATION=/b/cd-builder/sandboxes/sb_DEV_COMMON_BRANCH-_act-builder-4977/src/build/mk/jnx.sym_check.mk:37; CURDIR=/b/cd-builder/sandboxes/sb_DEV_COMMON_BRANCH-_act-builder-4977/src/junos/usr.sbin/rpd/bgp/lib/proto  OBJDUMP=/volume/hab/Linux/Ubuntu-12.04/x86_64/llvm/3.7/current/bin/amd64-unknown-freebsd11.0-objdump  BSS_SYMS=bss_syms.clang.amd64,bsdx  TLS_SYMS=tls_syms.clang.amd64,bsdx  ACCEPT_CMD='mk --machine amd64,bsd11 -C junos/usr.sbin/rpd/bgp/lib/proto accept-syms'  /volume/hab/Linux/Ubuntu-12.04/x86_64/bsd-tools/current/bin/sh /b/cd-builder/sandboxes/sb_DEV_COMMON_BRANCH-_act-builder-4977/src/build/scripts/sym_check.sh librpd-proto-bgp.a

------------------------------ Output ------------------------------
ERROR: /b/cd-builder/sandboxes/sb_DEV_COMMON_BRANCH-_act-builder-4977/src/junos/usr.sbin/rpd/bgp/lib/proto/tls_syms.clang.amd64,bsdx: changed global or static TLS variables
261a262
> bgp_inetsrte_incolor_update._xinfo

If correct, use:

    mk --machine amd64,bsd11 -C junos/usr.sbin/rpd/bgp/lib/proto accept-syms

make[1]: *** [check-syms] Error 1
--------------------------------------------------------------------

Operations:

================================================================================
+ echo -e '\e[31m Displaying Production.log.errs\e[0m'
 Displaying Production.log.errs
+ echo 'End Meta Error logs'
End Meta Error logs
'''

From this log text, I need to extract the text between "------------------------------ Output ------------------------------" and "--------------------------------------------------------------------", so the output would look like this.

output = '''
ERROR: /b/cd-builder/sandboxes/sb_DEV_COMMON_BRANCH-_act-builder-4977/src/junos/usr.sbin/rpd/bgp/lib/proto/tls_syms.clang.amd64,bsdx: changed global or static TLS variables
261a262
> bgp_inetsrte_incolor_update._xinfo

If correct, use:

    mk --machine amd64,bsd11 -C junos/usr.sbin/rpd/bgp/lib/proto accept-syms

make[1]: *** [check-syms] Error 1
'''

I'm not sure why my code below doesn't work.

for result in re.findall(r'------------------------------ Output ------------------------------(.*?) '
                         '--------------------------------------------------------------------', data, re.S):
    output += result

5 answers:

Answer 0 (score: 2)

This might help. It uses a regex lookbehind and lookahead.

Demo:

import re
for result in re.findall(r'(?<=------------------------------ Output ------------------------------)(.*?)(?=--------------------------------------------------------------------)', data, re.S):    
    print(result)
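
As a minimal sketch of how the lookaround version relates to a plain capture group, the snippet below runs both on a hypothetical miniature log (`sample`, `start`, and `end` are illustrative names, and the dash counts are assumed rather than taken from the real file). Python's `re` module requires a lookbehind to be fixed-width, which holds here because the start marker is a literal string.

```python
import re

# Illustrative markers; the dash counts are assumptions for this sketch.
start = "-" * 30 + " Output " + "-" * 30
end = "-" * 68
sample = start + "\nERROR: something failed\n" + end + "\n"

# Lookaround version: the markers are asserted but not consumed.
lookaround = re.compile(
    "(?<=" + re.escape(start) + ")(.*?)(?=" + re.escape(end) + ")", re.S
)
# Capture-group version: the markers are consumed, only the group is returned.
capture = re.compile(re.escape(start) + "(.*?)" + re.escape(end), re.S)

a = lookaround.findall(sample)
b = capture.findall(sample)

# Both return the same text, including the newlines next to the markers,
# so strip() is applied to tidy the result.
print([s.strip() for s in a])
print(a == b)
```

Either form should work for this log; the capture group is arguably easier to read, and the lookarounds only matter when the markers must not be consumed by the match.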

Answer 1 (score: 1)

You don't need a regular expression for this. You can use find instead.

start = "------------------------------ Output ------------------------------"
end = "--------------------------------------------------------------------"

extracted_text = data[data.find(start)+len(start):data.rfind(end)] 

The output will be:

ERROR: /b/cd-builder/sandboxes/sb_DEV_COMMON_BRANCH-_act-builder-4977/src/junos/usr.sbin/rpd/bgp/lib/proto/tls_syms.clang.amd64,bsdx: changed global or static TLS variables
261a262
> bgp_inetsrte_incolor_update._xinfo

If correct, use:

    mk --machine amd64,bsd11 -C junos/usr.sbin/rpd/bgp/lib/proto accept-syms

make[1]: *** [check-syms] Error 1
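
One caveat with the one-liner above is that `str.find` returns -1 when a marker is missing, which silently produces a wrong slice. A hedged sketch of a more defensive variant follows; `extract_between` is a hypothetical helper name, the sample text is made up, and it searches for the end marker forward from the start marker rather than using `rfind`.

```python
def extract_between(text, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    begin = text.find(start)
    if begin == -1:
        return None  # start marker not present
    begin += len(start)
    finish = text.find(end, begin)  # search only after the start marker
    if finish == -1:
        return None  # end marker not present
    return text[begin:finish].strip()


# Illustrative markers and sample; dash counts are assumptions for this sketch.
start = "-" * 30 + " Output " + "-" * 30
end = "-" * 68
sample = "header\n" + start + "\nERROR: something failed\n" + end + "\nfooter\n"

print(extract_between(sample, start, end))
print(extract_between("no markers here", start, end))
```

Searching forward from the start marker means the first matching block is returned if the log ever contains more than one closing line, whereas `rfind` would extend the slice to the last one.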

Answer 2 (score: 1)

A regex is overkill here. Your start/stop markers are unique, so the following will work:

start = "------------------------------ Output ------------------------------"
stop = "--------------------------------------------------------------------"
output = data.split(start)[1].split(stop)[0]

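Answer 3 (score: 1)

import re

output = ""
for result in re.findall(r'------------------------------ Output ------------------------------(.*?)'
                         '--------------------------------------------------------------------', data, re.S):
    output += result

print(output)

The regex in the question matches nothing because of the space after (.*?): the pattern then requires a literal space immediately before the closing dashes, but in the log the dashes are preceded by a newline. Removing that space, as in the code here, fixes it.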
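
To make the effect of that stray space concrete, here is a small sketch comparing the two patterns on a hypothetical miniature log (`sample`, `START`, and `END` are illustrative names, and the dash counts are assumed).

```python
import re

# Illustrative markers; the dash counts are assumptions for this sketch.
START = "-" * 30 + " Output " + "-" * 30
END = "-" * 68

# The closing dashes follow a newline, just as in the real log.
sample = START + "\nmake[1]: *** [check-syms] Error 1\n" + END + "\n"

# Pattern shaped like the question's: a literal space must precede the closing dashes.
with_space = re.escape(START) + r"(.*?) " + re.escape(END)
# Corrected pattern: the stray space is removed.
without_space = re.escape(START) + r"(.*?)" + re.escape(END)

print(re.findall(with_space, sample, re.S))     # []
print(re.findall(without_space, sample, re.S))  # one match, with surrounding newlines
```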

Answer 4 (score: -1)

How about this?

import re

reg = re.compile(r"-{10,} Output -{10,}(.*?)-{10,}", re.S)
rslt = reg.findall(data)
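
Because the quantified pattern does not depend on an exact dash count, it may also suit logs that contain several Output blocks with rulers of different lengths. The sketch below assumes such a log (the sample text is invented) and adds `\n` around the group, which is a small variation on the pattern above, so that the captured text comes back without the newlines next to the markers.

```python
import re

# Variation on the pattern above: newlines next to the markers stay outside the group.
block = re.compile(r"-{10,} Output -{10,}\n(.*?)\n-{10,}", re.S)

# Invented sample with two blocks and differing ruler lengths.
sample = "\n".join([
    "-" * 30 + " Output " + "-" * 30,
    "first failure",
    "-" * 68,
    "unrelated line",
    "-" * 20 + " Output " + "-" * 20,
    "second failure",
    "-" * 50,
])

results = block.findall(sample)
print(results)  # ['first failure', 'second failure']
```

The trade-off is that any run of ten or more dashes inside a block's output would end the match early, so this looser pattern only fits if the output itself never contains such a line.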