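I'm doing some regular expression gymnastics. I set myself the task of searching C# code for uses of the as operator that are not followed by a null check within a reasonable distance. I don't want to parse the C# code, though. For example, I want to capture code snippets such as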
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
if(x1.a == y1.a)
but not capture
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
if(x1 == null)
nor, for that matter,
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
if(somethingunrelated == null) {...}
if(x1.a == y1.a)
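So any random null check counts as a "good check", and the snippet should not be reported.
The question is: how do I match something while making sure that something else is not found in its surroundings?
I tried the naive approach: look for 'as', then do a negative lookahead within the next 150 characters.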
\bas\b.{1,150}(?!\b==\s*null\b)
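The regex above matches all of the examples above. My gut feeling is that consuming up to 150 characters first and only then doing the negative lookahead leaves the engine plenty of positions where the lookahead does not see '== null'.
Negating the whole expression doesn't help either, because it then matches most C# code.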
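The failure is easy to reproduce. The sketch below uses Python, which is an assumption (the question does not name a regex engine), and the names NAIVE and snippets are made up for illustration. It runs the naive pattern against the three snippets and reports a match for each one: the lookahead is only tested at the position where .{1,150} happens to stop, and \b can never match between a space and ==.

```python
import re

# The naive pattern from the question: consume text first, look ahead afterwards.
NAIVE = re.compile(r'\bas\b.{1,150}(?!\b==\s*null\b)')

snippets = {
    "no null check": (
        "var x1 = x as SimpleRes;\n"
        "var y1 = y as SimpleRes;\n"
        "if(x1.a == y1.a)"
    ),
    "null check": (
        "var x1 = x as SimpleRes;\n"
        "var y1 = y as SimpleRes;\n"
        "if(x1 == null)"
    ),
    "unrelated null check": (
        "var x1 = x as SimpleRes;\n"
        "var y1 = y as SimpleRes;\n"
        "if(somethingunrelated == null) {...}\n"
        "if(x1.a == y1.a)"
    ),
}

# Record, per snippet, whether the naive pattern finds anything.
results = {name: NAIVE.search(code) is not None for name, code in snippets.items()}
for name, found in results.items():
    print(f"{name}: {'match' if found else 'no match'}")
```

All three snippets are reported, including the two that contain a null check.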
Answer 0 (score: 11)
I love regex gymnastics! Here is a commented PHP regex:
$re = '/# Find every "as" that is not followed by an "== null" check.
    \bas\b                 # Match "as"
    (?=                    # but only if...
      (?:                  # there are from 1 to 150
        [\S\s]             # chars, each of which
        (?!==\s*null)      # is NOT directly followed by "== null"
      ){1,150}?            # (consumed lazily).
      (?:                  # We are done when either
        (?=                # we have reached
          ==\s*(?!null)    # a comparison that is not against null
        )                  #
      | $                  # or the end of the string.
      )
    )/ix';
And here it is in JavaScript flavor:
re = /\bas\b(?=(?:[\S\s](?!==\s*null)){1,150}?(?:(?===\s*(?!null))|$))/ig;
This one really made my head hurt...
Here is the test data I'm using:
text = r""" var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
if(x1.a == y1.a)
however, not capture
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
if(x1 == null)
nor for that matter
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
if(somethingunrelated == null) {...}
if(x1.a == y1.a)"""
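For readers who want to try the pattern outside PHP or JavaScript, here is a minimal Python sketch. It assumes Python's re module accepts the one-line form unchanged; the names UNCHECKED_AS and count_unchecked are illustrative, and the three snippets are tested separately rather than as one concatenated string.

```python
import re

# One-line form of the pattern above, compiled case-insensitively.
UNCHECKED_AS = re.compile(
    r'\bas\b(?=(?:[\S\s](?!==\s*null)){1,150}?(?:(?===\s*(?!null))|$))',
    re.IGNORECASE,
)

def count_unchecked(code):
    """Count the "as" keywords that the pattern reports in a snippet."""
    return len(UNCHECKED_AS.findall(code))

no_check = (
    "var x1 = x as SimpleRes;\n"
    "var y1 = y as SimpleRes;\n"
    "if(x1.a == y1.a)"
)
own_check = (
    "var x1 = x as SimpleRes;\n"
    "var y1 = y as SimpleRes;\n"
    "if(x1 == null)"
)
unrelated_check = (
    "var x1 = x as SimpleRes;\n"
    "var y1 = y as SimpleRes;\n"
    "if(somethingunrelated == null) {...}\n"
    "if(x1.a == y1.a)"
)

for label, code in (("no check", no_check),
                    ("own check", own_check),
                    ("unrelated check", unrelated_check)):
    print(label, count_unchecked(code))
```

Only the first snippet is reported, once per "as". In the other two, the lazy scan is stopped by the character just before "== null" and never reaches a non-null comparison.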
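Answer 1 (score: 2)
Put the .{1,150} inside the lookahead, and replace . with [\s\S] (normally, . does not match newlines). Also, the \b next to == may be misleading: it only matches if the character before == is a word character, so it fails when a space precedes ==.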
\bas\b(?![\s\S]{1,150}==\s*null\b)
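A quick way to check this pattern against the snippets from the question is the Python sketch below. The helper name unchecked_lines is made up for illustration, and Python is only one possible engine.

```python
import re

# The pattern from this answer: the 150-character window lives inside the lookahead.
AS_WITHOUT_NULL_CHECK = re.compile(r'\bas\b(?![\s\S]{1,150}==\s*null\b)')

def unchecked_lines(code):
    """Return the source lines whose "as" has no "== null" in the next 150 chars."""
    lines = []
    for m in AS_WITHOUT_NULL_CHECK.finditer(code):
        start = code.rfind('\n', 0, m.start()) + 1   # start of the matching line
        end = code.find('\n', m.end())               # end of the matching line
        lines.append(code[start:end if end != -1 else len(code)])
    return lines

no_check = (
    "var x1 = x as SimpleRes;\n"
    "var y1 = y as SimpleRes;\n"
    "if(x1.a == y1.a)"
)
own_check = (
    "var x1 = x as SimpleRes;\n"
    "var y1 = y as SimpleRes;\n"
    "if(x1 == null)"
)
unrelated_check = (
    "var x1 = x as SimpleRes;\n"
    "var y1 = y as SimpleRes;\n"
    "if(somethingunrelated == null) {...}\n"
    "if(x1.a == y1.a)"
)

print(unchecked_lines(no_check))
print(unchecked_lines(own_check))
print(unchecked_lines(unrelated_check))
```

Because the window is a fixed 150 characters, a null check that belongs to a later, unrelated block can still suppress a report, so the snippets are tested one at a time here.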
Answer 2 (score: 2)
I think it would help to put the variable name in (), so that it can be used as a backreference. Something like this:
\b(\w+)\b\W*=\W*\w*\W*\bas\b[\s\S]{1,150}(?!\b\1\b\W*==\W*\bnull\b)
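The backreference idea can be sketched in Python as follows. This is not the regex above: as an assumption, the window is moved inside the negative lookahead so that the engine cannot step around the null check, and \W* is tightened to \s*. The names UNCHECKED_VAR and unchecked_variables are illustrative. Note that this is stricter than what the question asks for, since an unrelated null check no longer counts.

```python
import re

UNCHECKED_VAR = re.compile(
    r'\b(\w+)\s*=\s*\w+\s+as\b'               # "<name> = <expr> as", name captured
    r'(?![\s\S]{0,150}?\b\1\s*==\s*null\b)'   # no "<name> == null" in the window
)

def unchecked_variables(code):
    """Return the assigned variable names that have no null check of their own."""
    return UNCHECKED_VAR.findall(code)

no_check = (
    "var x1 = x as SimpleRes;\n"
    "var y1 = y as SimpleRes;\n"
    "if(x1.a == y1.a)"
)
own_check = (
    "var x1 = x as SimpleRes;\n"
    "var y1 = y as SimpleRes;\n"
    "if(x1 == null)"
)
unrelated_check = (
    "var x1 = x as SimpleRes;\n"
    "var y1 = y as SimpleRes;\n"
    "if(somethingunrelated == null) {...}\n"
    "if(x1.a == y1.a)"
)

print(unchecked_variables(no_check))
print(unchecked_variables(own_check))
print(unchecked_variables(unrelated_check))
```

In the second snippet only y1 is reported, because x1 has its own null check; in the third, both variables are reported even though some other variable is compared with null.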
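Answer 3 (score: 2)
The question is unclear. What exactly do you want? I'm sorry, but even after reading the question and the comments many times, I still don't understand.
Does the code have to be in C#? In Python? Something else? There is no indication of this.
Do you want a match only when the if(... == ...) line immediately follows the var ... = ... lines?
Or may there be unrelated lines between that block and the if(... == ...) line without stopping the match?
My code assumes the second option.
Should an if(... == null) line after the if(... == ...) line stop the match?
Since I couldn't tell whether the answer is yes, I defined two regexes to cover both options.
I hope my code is clear enough and answers your concern.
It is in Python: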
import re
ch1 ='''kutgdfxfovuyfuuff
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
if(x1.a == y1.a)
1618987987849891
'''
ch2 ='''kutgdfxfovuyfuuff
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
uydtdrdutdutrr
if(x1.a == y1.a)
3213546878'''
ch3='''kutgdfxfovuyfuuff
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
if(x1 == null)
165478964654456454'''
ch4='''kutgdfxfovuyfuuff
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
hgyrtdduihudgug
if(x1 == null)
165489746+54646544'''
ch5='''kutgdfxfovuyfuuff
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
if(somethingunrelated == null ) {...}
if(x1.a == y1.a)
1354687897'''
ch6='''kutgdfxfovuyfuuff
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
ifughobviudyhogiuvyhoiuhoiv
if(somethingunrelated == null ) {...}
if(x1.a == y1.a)
2468748874897498749874897'''
ch7 = '''kutgdfxfovuyfuuff
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
if(x1.a == y1.a)
iufxresguygo
liygcygfuihoiuguyg
if(somethingunrelated == null ) {...}
oufxsyrtuy
'''
ch8 = '''kutgdfxfovuyfuuff
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
tfsezfuytfyfy
if(x1.a == y1.a)
iufxresguygo
liygcygfuihoiuguyg
if(somethingunrelated == null ) {...}
oufxsyrtuy
'''
ch9 = '''kutgdfxfovuyfuuff
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
tfsezfuytfyfy
if(x1.a == y1.a)
if(somethingunrelated == null ) {...}
oufxsyrtuy
'''
# pat1: the "var ... as ..." lines, then characters none of which is directly
# followed by "== null", then an "if(... == ...)" line not comparing with null.
pat1 = re.compile(r'('
                  r'(^var +\S+ *= *\S+ +as .+[\r\n]+)+?'
                  r'([\s\S](?!==\s*null\b))*?'
                  r'^if *\( *[^\s=]+ *==(?!\s*null).+$'
                  r')',
                  re.MULTILINE)
# pat2: same as pat1, but it also refuses to match when another "==" appears
# within the 150 characters that follow the if line.
pat2 = re.compile(r'('
                  r'(^var +\S+ *= *\S+ +as .+[\r\n]+)+?'
                  r'([\s\S](?!==\s*null\b))*?'
                  r'^if *\( *[^\s=]+ *==(?!\s*null).+$'
                  r')'
                  r'(?![\s\S]{0,150}==)',
                  re.MULTILINE)
for ch in (ch1, ch2, ch3, ch4, ch5, ch6, ch7, ch8, ch9):
    m1 = pat1.search(ch)
    print(m1.group() if m1 else None)
    print()
    m2 = pat2.search(ch)
    print(m2.group() if m2 else None)
    print('-----------------------------------------')
Result:
>>>
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
if(x1.a == y1.a)
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
if(x1.a == y1.a)
-----------------------------------------
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
uydtdrdutdutrr
if(x1.a == y1.a)
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
uydtdrdutdutrr
if(x1.a == y1.a)
-----------------------------------------
None
None
-----------------------------------------
None
None
-----------------------------------------
None
None
-----------------------------------------
None
None
-----------------------------------------
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
if(x1.a == y1.a)
None
-----------------------------------------
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
tfsezfuytfyfy
if(x1.a == y1.a)
None
-----------------------------------------
var x1 = x as SimpleRes;
var y1 = y as SimpleRes;
tfsezfuytfyfy
if(x1.a == y1.a)
None
-----------------------------------------
>>>
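Answer 4 (score: 2)
Let me try to restate your problem:
If there is an if (... == null) within 150 characters after the as, do not match.
If there is no if (... == null) within 150 characters, match.
Because of the negative lookahead, your expression \bas\b.{1,150}(?!\b==\s*null\b) will not work. The regex engine can always consume one character more or fewer to dodge the negative lookahead, so you end up with a match even when an if (... == null) is present.
Regexes are really not good at not matching things. In this case you are better off matching an "as" assignment that is followed by an "== null" check within 150 characters: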
\bas\b.{1,150}\b==\s*null\b
and then negating the check: if (!regex.match(text)) ...
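The match-then-negate idea might look like this in Python. Two adjustments are assumptions of this sketch and not part of the answer: . is replaced by [\s\S] so the window can span lines, and the \b in front of == is dropped because it cannot match after a space. The names NULL_CHECKED_AS and lacks_null_check are illustrative.

```python
import re

# Positive pattern: an "as" followed by "== null" within 150 characters.
NULL_CHECKED_AS = re.compile(r'\bas\b[\s\S]{1,150}?==\s*null\b')

def lacks_null_check(code):
    """True if no "as" in the snippet is followed by "== null" within the window."""
    return NULL_CHECKED_AS.search(code) is None

no_check = (
    "var x1 = x as SimpleRes;\n"
    "var y1 = y as SimpleRes;\n"
    "if(x1.a == y1.a)"
)
own_check = (
    "var x1 = x as SimpleRes;\n"
    "var y1 = y as SimpleRes;\n"
    "if(x1 == null)"
)
unrelated_check = (
    "var x1 = x as SimpleRes;\n"
    "var y1 = y as SimpleRes;\n"
    "if(somethingunrelated == null) {...}\n"
    "if(x1.a == y1.a)"
)

for code in (no_check, own_check, unrelated_check):
    print(lacks_null_check(code))
```

This answers the question for a whole snippet at once, so it suits text that has already been split into blocks; a single checked "as" makes the entire snippet pass.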
Answer 5 (score: 1)
(?s:\s+as\s+(?!.{0,150}==\s*null\b))
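I'm using ?s: to enable the SingleLine option. You can put it in the regex options instead if you prefer. I should add that I put \s around as because I think only whitespace is "legal" around as. You could use \b instead, as in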
(?s:\bas\b(?!.{0,150}==\s*null\b))
Note that \s may match whitespace that is not "valid whitespace". It is defined as [\f\n\r\t\v\x85\p{Z}], where \p{Z} is the Unicode characters in the 'Separator, Space' category plus those in the 'Separator, Line' category plus those in the 'Separator, Paragraph' category.
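The same caveat can be observed in other engines. As a small illustration, assuming Python 3's re module (whose exact \s character set is not claimed to be identical to the definition quoted above), a no-break space is matched by \s unless the pattern is restricted to ASCII:

```python
import re

NBSP = '\u00a0'  # NO-BREAK SPACE, Unicode category Zs ("Separator, Space")

unicode_ws = re.compile(r'\s')           # str patterns are Unicode-aware by default
ascii_ws = re.compile(r'\s', re.ASCII)   # restricts \s to [ \t\n\r\f\v]

matches_unicode = unicode_ws.fullmatch(NBSP) is not None
matches_ascii = ascii_ws.fullmatch(NBSP) is not None
print(matches_unicode, matches_ascii)
```

If only ordinary source-code whitespace should count around as, an explicit class such as [ \t\r\n] is the more predictable choice.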