I have these two kinds of strings to match and group:
<133>[S=88121248] [SID:1073710562] ( lgr_psbrdif)(72811810 ) #38:OpenChannel:on Trunk 0 BChannel:9 CID=38 with VoiceCoder: g711Alaw64k20 VbdCoder: InvalidCoder255 DetectorSide: 0 FaxModemDet NO_FAX_MODEM_DETECTED
and
<133>[S=88209541] ( sip_stack)(73281971 ) TcpTransportObject#430::DispatchQueueEvent(EVENT_RECEIVER_DISCONNECT) - Closing connection
I need to match both and capture specific groups. I am using this pattern:
<(.*)>\[S=(.*)\] (\[SID:(.*?)\])?(.*)
What I get:
Match0: <133>[S=88121248] [SID:1073710562] ......the full line
Group1: 133
Group2: 88121248] [SID:1073710562
Group3:
Group4:
Group5: ......the full line
Match1: <133>[S=88209541] ......the full line
Group1: 133
Group2: 88209541
Group3:
Group4:
Group5: ......the full line
What I need:
Match0: <133>[S=88121248] [SID:1073710562] ......the full line
Group1: 133
Group2: 88121248
Group3: 1073710562
Group4:
Group5: ......the full line
Match1: <133>[S=88209541] ......the full line
Group1: 133
Group2: 88209541
Group3:
Group4:
Group5: ......the full line
Both strings match, but the grouping is wrong. The second string matches and groups correctly; the first one does not.
Answer 0 (score: 2)
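You made the classic mistake of using the greedy star .*, which matches more than you want. In the first string, the (.*) after [S= runs on to the last "] " in the line, so it swallows the [SID:...] part as well.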
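To match anything between two delimiters, it is better to use a negated character class, for example <([^>]*)> to capture whatever lies between < and >.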
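The difference can be seen on the S= field alone. The following is a minimal sketch that assumes Python's re module (the question does not say which regex engine is in use); the sample line is a shortened copy of the first string, and the variable names are only illustrative.

```python
import re

# Shortened copy of the first log line from the question.
line = "<133>[S=88121248] [SID:1073710562] ( lgr_psbrdif)(72811810 ) #38:OpenChannel:on Trunk 0"

# Greedy: (.*) runs to the end of the line, then backtracks only
# as far as the LAST "] ", which is the one that closes [SID:...].
greedy = re.match(r"<(.*)>\[S=(.*)\] ", line)
print(greedy.group(2))    # 88121248] [SID:1073710562

# Negated class: [^\]]* cannot step over a "]", so it stops at the first one.
negated = re.match(r"<([^>]*)>\[S=([^\]]*)\] ", line)
print(negated.group(2))   # 88121248
```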
So this will work:
^<([^>]*)>\[S=([^\]]*)\]\s+(?:\[SID:([^\]]*)\]\s+)?(.*)
Breakdown:
^<([^>]*)> # something between < and > at the start of the line
\[S=([^\]]*)\]\s+ # something between "[S=" and "]"
(?:\[SID:([^\]]*)\]\s+)? # something between "[SID:" and "]", optional
(.*) # rest of the string
Note the non-capturing parentheses (?:...), which keep unused groups out of the result.
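As a rough check, the commented pattern can be run against both sample strings. This sketch again assumes Python's re module, using re.VERBOSE so the pattern can keep its comments; LOG_LINE, lines and results are hypothetical names chosen for illustration.

```python
import re

# The pattern from the answer, written in verbose mode so the comments can stay.
LOG_LINE = re.compile(r"""
    ^<([^>]*)>                  # group 1: text between < and >
    \[S=([^\]]*)\]\s+           # group 2: text between "[S=" and "]"
    (?:\[SID:([^\]]*)\]\s+)?    # group 3: text between "[SID:" and "]", optional
    (.*)                        # group 4: rest of the line
""", re.VERBOSE)

lines = [
    "<133>[S=88121248] [SID:1073710562] ( lgr_psbrdif)(72811810 ) #38:OpenChannel:on Trunk 0 "
    "BChannel:9 CID=38 with VoiceCoder: g711Alaw64k20 VbdCoder: InvalidCoder255 "
    "DetectorSide: 0 FaxModemDet NO_FAX_MODEM_DETECTED",
    "<133>[S=88209541] ( sip_stack)(73281971 ) "
    "TcpTransportObject#430::DispatchQueueEvent(EVENT_RECEIVER_DISCONNECT) - Closing connection",
]

# Each entry is a 4-tuple; group 3 is None when the [SID:...] part is absent.
results = [LOG_LINE.match(line).groups() for line in lines]
for groups in results:
    print(groups)
```

In Python an optional group that did not take part in the match is reported as None, which corresponds to the n/a entry in the match listing.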
Matches
MATCH 1
1. [1-4] `133`
2. [8-16] `88121248`
3. [23-33] `1073710562`
4. [35-218] `( lgr_psbrdif)(72811810 ) #38:OpenChannel:on Trunk 0 BChannel:9 CID=38 with VoiceCoder: g711Alaw64k20 VbdCoder: InvalidCoder255 DetectorSide: 0 FaxModemDet NO_FAX_MODEM_DETECTED `
MATCH 2
1. [220-223] `133`
2. [227-235] `88209541`
3. n/a
4. [237-360] `( sip_stack)(73281971 ) TcpTransportObject#430::DispatchQueueEvent(EVENT_RECEIVER_DISCONNECT) - Closing connection `