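I am trying to match RegEx named groups (preArgs, apm1Args, midArgs, apm2Args, postArgs) that can appear in random order. I can match test string 1, but not test string 2 below:
I need to satisfy the following requirements:
1. Each group may be present one or more times (because of leftover junk), or be absent entirely.
2. Besides the unique javaagent jar, each of apm1Args and apm2Args always comes with one or more -D switches.
I tried some OR (|) alternatives and (?=) positive lookahead, but had no luck and got lost in the maze. My attempt:
RegEx (available at: RegEx listed at regex101.com)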
^(?P<preArgs>.*)(?P<apm1Args>-javaagent:.+\/agent1\.jar\s+(?:-Dvendor1\.agent1\.\S+\s*)*)(?P<midArgs>.*)(?P<apm2Args>-javaagent:.+\/agent2\.jar\s+(?:-Dvendor2\.agent2\.\S+\s*)*)(?P<postArgs>.*)$
Test string 1
-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/path2/to/profiles/agent2.profile -Dvendor2.agent2.customValue1=myValue2
Test string 2 (available at: same RegEx with a different regex101.com link)
-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/path2/to/profiles/agent2.profile -Dvendor2.agent2.customValue1=myValue2 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon
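The pattern above fixes the order of the groups: apm1Args (agent1.jar) must come before apm2Args (agent2.jar), so test string 2, where agent2 comes first, cannot match. One possible way around this, sketched here under assumptions, is to put both agent blocks into a single alternation and scan with `re.finditer`, so the order no longer matters. The name `APM_BLOCK`, the helper `find_apm_blocks`, and the shortened test string are hypothetical. Like the original pattern, the sketch accepts zero or more -D switches after each jar.

```python
import re

# Hypothetical pattern: one alternation, each branch a named group for one agent block.
# Each branch is the agent jar followed by zero or more of that vendor's -D switches.
APM_BLOCK = re.compile(
    r'(?P<apm1Args>-javaagent:\S+/agent1\.jar(?:\s+-Dvendor1\.agent1\.\S+)*)'
    r'|(?P<apm2Args>-javaagent:\S+/agent2\.jar(?:\s+-Dvendor2\.agent2\.\S+)*)'
)

def find_apm_blocks(args):
    """Return (group_name, matched_text) pairs in the order the blocks appear."""
    return [(m.lastgroup, m.group(0)) for m in APM_BLOCK.finditer(args)]

# Shortened variant of test string 2: agent2 appears before agent1.
test2 = ('-Xdebug -javaagent:/path2/to/vendor2/agent2.jar '
         '-Dvendor2.agent2.customValue1=myValue2 '
         '-javaagent:/path1/to/agent1.jar '
         '-Dvendor1.agent1.tierName=myTier1 -Xgcpolicy:gencon')

for name, text in find_apm_blocks(test2):
    print(name, '->', text)
```

Everything the scan does not match (the preArgs, midArgs and postArgs text) is simply what lies between the reported blocks, so it does not need named groups of its own.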
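Update
I ended up using a loop in Python to strip out the 'apmArgs' groups, which may appear in random order or not at all. Here is my code snippet (it can also be tested at repl.it):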
import re
# one pattern per agent; raw strings keep the regex backslashes intact
regExArr = [
    r'(?P<preArgs>.*)(?P<apmArgs>-javaagent:\s*\/\S+agent1\.jar\s+(?:-Dvendor1\.agent1\.\S+\s*)*)(?P<postArgs>.*)',
    r'(?P<preArgs>.*)(?P<apmArgs>-javaagent:\s*\/\S+agent2\.jar\s+(?:-Dvendor2\.agent2\.\S+\s*)*)(?P<postArgs>.*)',
]
testStrList=[
'-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/path2/to/profiles/agent2.profile -Dvendor2.agent2.customValue1=myValue2'
,'-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1'
,'-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/metlife/runtime/installed/apm/profiles/csa.profile -Dvendor2.agent2.customValue1=myValue2 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName= -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon'
,'-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -Xgcpolicy:gencon'
]
newApmArgs='-javaagent:/path3/to/agent3.jar -Dvendor3.agent3.applicationName=app1234 -Dvendor3.agent3.tierName=myTier13 -Dvendor3.agent3.nodeName=myNode13 -Dvendor3.agent3.uniqueHostId=myHost13'
for i, testStr in enumerate(testStrList):
    for regEx in regExArr:
        matchedArgs = re.search(regEx, testStr)
        while matchedArgs:
            print("matchedArgs found count:", len(matchedArgs.groups()))
            print("matchedArgs found:")
            print(matchedArgs.groups())
            # drop the <apmArgs> group and join the remaining groups
            testStr = (matchedArgs.group('preArgs').strip() + ' ' + matchedArgs.group('postArgs').strip()).strip()
            # look for any leftover <apmArgs> and repeat the clean-up
            matchedArgs = re.search(regEx, testStr)
    # append the new (third-type) APM args to the cleaned-up string
    testStrList[i] = testStr + ' ' + newApmArgs
print("cleaned up list testStrList that had Random groups of APM Args Text (now appended with 3rd type APM Args) is:")
print(testStrList)
Output:
matchedArgs found count: 3
matchedArgs found:
('-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1 ', '-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1 ', '-Xgcpolicy:gencon -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/path2/to/profiles/agent2.profile -Dvendor2.agent2.customValue1=myValue2')
matchedArgs found count: 3
matchedArgs found:
('', '-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1 ', '-Xgcpolicy:gencon -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/path2/to/profiles/agent2.profile -Dvendor2.agent2.customValue1=myValue2')
matchedArgs found count: 3
matchedArgs found:
('-Xgcpolicy:gencon ', '-javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/path2/to/profiles/agent2.profile -Dvendor2.agent2.customValue1=myValue2', '')
matchedArgs found count: 3
matchedArgs found:
('-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 ', '-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1', '')
matchedArgs found count: 3
matchedArgs found:
('-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/metlife/runtime/installed/apm/profiles/csa.profile -Dvendor2.agent2.customValue1=myValue2 ', '-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName= -Dvendor1.agent1.uniqueHostId=myHost1 ', '-Xgcpolicy:gencon')
matchedArgs found count: 3
matchedArgs found:
('-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 ', '-javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/metlife/runtime/installed/apm/profiles/csa.profile -Dvendor2.agent2.customValue1=myValue2 ', '-Xgcpolicy:gencon')
cleaned up list testStrList that had Random groups of APM Args Text (now appended with 3rd type APM Args) is:
['-Xgcpolicy:gencon -javaagent:/path3/to/agent3.jar -Dvendor3.agent3.applicationName=app1234 -Dvendor3.agent3.tierName=myTier13 -Dvendor3.agent3.nodeName=myNode13 -Dvendor3.agent3.uniqueHostId=myHost13', '-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path3/to/agent3.jar -Dvendor3.agent3.applicationName=app1234 -Dvendor3.agent3.tierName=myTier13 -Dvendor3.agent3.nodeName=myNode13 -Dvendor3.agent3.uniqueHostId=myHost13', '-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -Xgcpolicy:gencon -javaagent:/path3/to/agent3.jar -Dvendor3.agent3.applicationName=app1234 -Dvendor3.agent3.tierName=myTier13 -Dvendor3.agent3.nodeName=myNode13 -Dvendor3.agent3.uniqueHostId=myHost13', '-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -Xgcpolicy:gencon -javaagent:/path3/to/agent3.jar -Dvendor3.agent3.applicationName=app1234 -Dvendor3.agent3.tierName=myTier13 -Dvendor3.agent3.nodeName=myNode13 -Dvendor3.agent3.uniqueHostId=myHost13']
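The clean-up loop above could probably be collapsed into a single substitution. The following is a minimal sketch, not the code from the update: it assumes that the agent number in the jar name (1 or 2) is the same number used in the `-Dvendor<n>.agent<n>.` prefix, and the names `APM_ARGS` and `replace_apm_args` are made up for illustration.

```python
import re

# Assumed pattern: either agent jar, followed by zero or more -D switches
# whose vendor/agent number matches the jar's number (via the backreference n).
APM_ARGS = re.compile(
    r'-javaagent:\s*/\S*agent(?P<n>[12])\.jar'
    r'(?:\s+-Dvendor(?P=n)\.agent(?P=n)\.\S+)*'
)

def replace_apm_args(args, new_apm_args):
    """Remove every agent1/agent2 block from args, then append new_apm_args."""
    # Replace each block with a space, then normalise runs of whitespace.
    remaining = ' '.join(APM_ARGS.sub(' ', args).split())
    return (remaining + ' ' + new_apm_args).strip()

line = ('-Xdebug -javaagent:/path2/to/vendor2/agent2.jar '
        '-Dvendor2.agent2.customValue1=myValue2 '
        '-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.nodeName= '
        '-Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon')

print(replace_apm_args(line, '-javaagent:/path3/to/agent3.jar'))
```

Because `re.sub` replaces every non-overlapping match, repeated blocks and blocks in either order are handled in one pass, and a string with no agent blocks just gets the new arguments appended.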
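Answer 0 (score: 0)
You may find that a pyparsing approach gets you there faster than wrestling with regular expressions. Here is a parser that handles both test strings: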
import pyparsing as pp
# just some punctuation
COLON,EQ = map(pp.Suppress, ':=')
# expressions for key=value,... switches
subkey = pp.Word(pp.alphas)
subvalue = pp.pyparsing_common.integer | pp.Word(pp.printables, excludeChars=',')
key_value_list = pp.Dict(pp.delimitedList(pp.Group(subkey + EQ + subvalue)))
# parse switches
switch_key = pp.Word('-', pp.alphas).setParseAction(lambda t: t[0][1:].lower())
switch_value = key_value_list | subvalue
switch = switch_key + pp.Optional(COLON + switch_value)
# -D definitions
java_path_name = pp.delimitedList(pp.pyparsing_common.identifier, delim='.', combine=True)
defn = (pp.Suppress("-D") + java_path_name.leaveWhitespace()
        + EQ.leaveWhitespace()
        + pp.Optional(subvalue().leaveWhitespace()))
# define parser for the entire line - use Dict class to define dynamic key-value structures instead of just 2-tuples
parser = pp.Dict(pp.OneOrMore(pp.Group(defn | switch)))
tests = """\
-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName= -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/metlife/runtime/installed/apm/profiles/csa.profile -Dvendor2.agent2.customValue1=myValue2
-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/metlife/runtime/installed/apm/profiles/csa.profile -Dvendor2.agent2.customValue1=myValue2 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName= -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon
"""
parser.runTests(tests)
Prints:
-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName= -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/metlife/runtime/installed/apm/profiles/csa.profile -Dvendor2.agent2.customValue1=myValue2
[['xdebug'], ['xnoagent'], ['xrunjdwp', ['transport', 'dt_socket'], ['server', 'y'], ['suspend', 'y'], ['address', 7777]], ['javaagent', '/path1/to/agent1.jar'], ['vendor1.agent1.applicationName', 'app123'], ['vendor1.agent1.tierName', 'myTier1'], ['vendor1.agent1.nodeName'], ['vendor1.agent1.uniqueHostId', 'myHost1'], ['xgcpolicy', 'gencon'], ['javaagent', '/path2/to/vendor2/agent2.jar'], ['vendor2.agent2.agentProfile', '/metlife/runtime/installed/apm/profiles/csa.profile'], ['vendor2.agent2.customValue1', 'myValue2']]
- javaagent: '/path2/to/vendor2/agent2.jar'
- vendor1.agent1.applicationName: 'app123'
- vendor1.agent1.nodeName: ''
- vendor1.agent1.tierName: 'myTier1'
- vendor1.agent1.uniqueHostId: 'myHost1'
- vendor2.agent2.agentProfile: '/metlife/runtime/installed/apm/profiles/csa.profile'
- vendor2.agent2.customValue1: 'myValue2'
- xdebug: ''
- xgcpolicy: 'gencon'
- xnoagent: ''
- xrunjdwp: [['transport', 'dt_socket'], ['server', 'y'], ['suspend', 'y'], ['address', 7777]]
  - address: 7777
  - server: 'y'
  - suspend: 'y'
  - transport: 'dt_socket'
-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/metlife/runtime/installed/apm/profiles/csa.profile -Dvendor2.agent2.customValue1=myValue2 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName= -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon
[['xdebug'], ['xnoagent'], ['xrunjdwp', ['transport', 'dt_socket'], ['server', 'y'], ['suspend', 'y'], ['address', 7777]], ['javaagent', '/path2/to/vendor2/agent2.jar'], ['vendor2.agent2.agentProfile', '/metlife/runtime/installed/apm/profiles/csa.profile'], ['vendor2.agent2.customValue1', 'myValue2'], ['javaagent', '/path1/to/agent1.jar'], ['vendor1.agent1.applicationName', 'app123'], ['vendor1.agent1.tierName', 'myTier1'], ['vendor1.agent1.nodeName'], ['vendor1.agent1.uniqueHostId', 'myHost1'], ['xgcpolicy', 'gencon']]
- javaagent: '/path1/to/agent1.jar'
- vendor1.agent1.applicationName: 'app123'
- vendor1.agent1.nodeName: ''
- vendor1.agent1.tierName: 'myTier1'
- vendor1.agent1.uniqueHostId: 'myHost1'
- vendor2.agent2.agentProfile: '/metlife/runtime/installed/apm/profiles/csa.profile'
- vendor2.agent2.customValue1: 'myValue2'
- xdebug: ''
- xgcpolicy: 'gencon'
- xnoagent: ''
- xrunjdwp: [['transport', 'dt_socket'], ['server', 'y'], ['suspend', 'y'], ['address', 7777]]
  - address: 7777
  - server: 'y'
  - suspend: 'y'
  - transport: 'dt_socket'
Here is sample code for accessing the parsed fields:
t0 = tests.splitlines()[0]
result = parser.parseString(t0)
print(result.xrunjdwp.address)
print(result['vendor1.agent1.applicationName'])
Prints:
7777
app123
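If pulling in pyparsing is not an option, the same token-by-token idea can be roughed out with the standard library alone. This is only a sketch and not part of the parser above: it assumes arguments contain no quoted whitespace, it attaches every -D switch to the -javaagent that precedes it regardless of the vendor prefix, and the function name `group_jvm_args` is hypothetical.

```python
def group_jvm_args(arg_line):
    """Split a JVM argument line into agent blocks and all other switches."""
    agents, others = [], []
    current = None  # the agent block currently collecting -D switches
    for token in arg_line.split():
        if token.startswith('-javaagent:'):
            # start a new agent block for this jar
            current = {'jar': token[len('-javaagent:'):], 'defs': {}}
            agents.append(current)
        elif token.startswith('-D') and current is not None:
            # -Dkey=value directly after an agent belongs to that agent
            key, _, value = token[2:].partition('=')
            current['defs'][key] = value
        else:
            # any other switch ends the current agent block
            others.append(token)
            current = None
    return agents, others

line = ('-Xdebug -javaagent:/path1/to/agent1.jar '
        '-Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.nodeName= '
        '-Xgcpolicy:gencon -javaagent:/path2/to/vendor2/agent2.jar '
        '-Dvendor2.agent2.customValue1=myValue2')

agents, others = group_jvm_args(line)
print(agents[0]['defs']['vendor1.agent1.applicationName'])
print(others)
```

Keeping the agent blocks in a list preserves their order of appearance, so both javaagent entries survive, whereas a single dictionary keyed by switch name keeps only one value per key.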