RegEx: Multiple Key=Value pairs with escaped values

Date: 2017-05-10 10:47:49

Tags: regex

I need help building a key-value parser.

Thanks to @VMRuiz, who suggested this RegEx in the post "RegEx: extract Key=Value pairs with Escape \=":

\s*(\w+)\s*=\s*(\w+|<.*?>|\w+\s*\\=\s*\w+)\s*

It helps, but I found a few scenarios where that regex does not work:

app=tcp/444
# Catches only Key:app Value:tcp > should catch Key:app Value:tcp/444

catdt=Network-based 
# Current result: 
#   catdt:'Network'
#   
# Should be:
#   catdt:'Network-based' 

eventId=123123 externalId=11111
# Current result:
#   eventId:'123123 externalId=11111'
#
# Should catch
#  eventId: '123123'
#  externalId: '11111'

src=2.3.4.5
# Current result:
#   src:'2'
#
# Should catch
#  src: '2.3.4.5'

eventAnnotationEndTime=1493293598\=aaa00
# Should be:
#   eventAnnotationEndTime: '1493293598\=aaa00'

eventAnnotationEndTimeA=1493293598A\=aaa01 eventAnnotationEndTimeB=1493293598\=aaa02
# Should be:
#   eventAnnotationEndTimeA: '1493293598\=aaa01'
#   eventAnnotationEndTimeB: '1493293598\=aaa02'


sourceTranslatedZoneURI=/All Zones/ArcSight System/Private Address Space Zones/RFC1918: 172.3.0.0-172.3.255.255
# Should be:
#   sourceTranslatedZoneURI: '/All Zones/ArcSight System'

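Sometimes the input contains \=, which is an escaped equals sign and must not be treated as a key-value separator (see the examples). Sometimes there are several key-value pairs on the same line.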

Here is a list of sample lines from which I need to extract the key-value pairs:

eventId=47539272657 externalId=19260037
mrt=124412421
app=tcp/444
proto=TCP
in=51485
out=3125
catdt=Network-based 
modelConfidence=0
severity=0 relevance=10 assetCriticality=0
priority=3
art=124
cat=traffic:forward
deviceSeverity=3
rt=234124
shost=bzq-194et
src=1.1.1.227
sourceZoneID=Mokee5CcBABCGKZ5Updd27g\=\=
sourceZoneURI=/All Zones/ArcSight System/Public Address Space Zones/RIPE NCC/193.0.0.0-195.255.255.255 (RIPE NCC)
sourceTranslatedAddress=12.6.4.5
sourceTranslatedZoneID=Mbp432AABABCDUVpYAT3UdQ\=\=
sourceTranslatedZoneURI=/All Zones/ArcSight System/Private Address Space Zones/RFC1918: 172.3.0.0-172.3.255.255
sourceTranslatedZoneExternalID=RFC1918: 172.3.0.0-172.3.255.255
spt=17743
sourceTranslatedPort=87878
dst=1.1.3.5
destinationZoneID=Mbp432AABABCDUVp77YAT3UdQ\=\=
destinationZoneURI=/All Zones/ArcSight System/Private Address Space Zones/RFC1918: 172.3.0.0-172.31.3.255
destinationZoneExternalID=RFC1918: 172.16.0.0-172.31.255.255
dpt=444
cs1=forward
cs5=close
locality=1
cs1Label=SubType
cs2Label=Attribute
cs3Label=User
cs4Label=Path
cs5Label=Action
ahost=arc-77
agt=1.3.4.3
av=5.3.5.5973.0
atz=Asia/778
aid=DvLMkV77rYkaWDEA\=\=
at=sup7nt
dvchost=FWAZURE-B
dtz=Asia/778
deviceInboundInterface=port1
deviceOutboundInterface=port2
eventAnnotationStageID=R9MHiNfoAAxxcBCASAsxbPIxG0g\=\=
eventAnnotationStageURI=/All Stages/Queued
eventAnnotationStageUpdateTime=123123123
eventAnnotationModificationTime=11123123
eventAnnotationAuditTrail=1,1491s9,root,Queued,,,,\n
eventAnnotationVersion=1
eventAnnotationEventId=44423124
eventAnnotationFlags=0
eventAnnotationEndTime=1212312
eventAnnotationManagerReceiptTime=32323532
_cefVer=0.1 ad.
arcSightEventPath=3xZdnIloBABDH14iZHcPHvw\=\=

2 Answers:

Answer 0: (score: 0)

Something like this may be what you are looking for:

\s*(\w+)\s*=\s*((?:\\.|[\w.,\/:()-]|\s(?!\w+\s*=))*)

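It matches and captures the key, then matches the following =. After that it captures the value by repeatedly taking one of:

  • an escaped character (\ followed by any character), or
  • any character in the character class, or finally
  • a whitespace character, as long as it is not followed by a new key (a word followed by =).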

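(You haven't tagged a regex flavor or language, so I'm assuming a PCRE-compatible one.)

It does not handle the "comments" (the lines starting with #), so those have to be filtered out first.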

See it here at regex101
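
As a rough illustration of how the pattern above might be applied, here is a minimal sketch in Python. The question does not name a language, so Python is an assumption, and the names `PAIR_RE` and `parse_pairs` are hypothetical. Comment lines are skipped before matching, as the answer requires, and trailing whitespace is stripped from each value.

```python
import re

# The pattern from this answer, unchanged. Python's re module accepts it as is.
PAIR_RE = re.compile(r"\s*(\w+)\s*=\s*((?:\\.|[\w.,\/:()-]|\s(?!\w+\s*=))*)")


def parse_pairs(line):
    """Return a dict of the key/value pairs found on a single line.

    Lines whose first non-blank character is '#' are treated as comments
    and yield an empty dict (hypothetical helper, not part of the answer).
    """
    if line.lstrip().startswith("#"):
        return {}
    # The pattern may swallow trailing whitespace into the value, so strip it.
    return {m.group(1): m.group(2).strip() for m in PAIR_RE.finditer(line)}


if __name__ == "__main__":
    samples = [
        "app=tcp/444",
        "eventId=123123 externalId=11111",
        r"eventAnnotationEndTime=1493293598\=aaa00",
        "# Current result:",
    ]
    for sample in samples:
        print(sample, "->", parse_pairs(sample))
```

Note that any value character outside the character class (for example `+` or `@`) ends the value early, so the class may need extending for other inputs.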

Answer 1: (score: 0)

This one should work: ^(?P<KEY>[^=#\s]+)=(?P<VALUE>.*)$

Or without named groups: ^([^=#\s]+)=(.*)$

Demo
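
A minimal sketch of how this anchored pattern could be used, assuming Python (whose `re` module supports the `(?P<NAME>...)` syntax) and multi-line mode so that `^` and `$` match at each line boundary. The names `LINE_RE` and `parse_lines` are hypothetical. Because `VALUE` is `.*`, everything after the first `=` on a line becomes the value, so a line holding several pairs comes back as a single pair.

```python
import re

# The pattern from this answer; re.MULTILINE makes ^ and $ work per line.
LINE_RE = re.compile(r"^(?P<KEY>[^=#\s]+)=(?P<VALUE>.*)$", re.MULTILINE)


def parse_lines(text):
    """Return a list of (key, value) tuples, one per matching line.

    Lines starting with '#' or whitespace never match, because the key
    must begin at the start of the line and may not contain those characters.
    """
    return [(m.group("KEY"), m.group("VALUE")) for m in LINE_RE.finditer(text)]


if __name__ == "__main__":
    text = "\n".join([
        "app=tcp/444",
        "# a comment line",
        "severity=0 relevance=10",
        r"aid=DvLMkV77rYkaWDEA\=\=",
    ])
    for key, value in parse_lines(text):
        print(repr(key), repr(value))
```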