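Is there a technique using regular expressions to ensure that if one pattern occurs n times, a different pattern occurs n times elsewhere?
For example, say I have strings that look like this: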
ppppjksdfjlsdkjfnnnn
pppppjksdfjlsdkjfnnnnn
ppppjksdfjlsdkjfnnn
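My regex looks like p*.*n*, but I want the first two strings to match and the last one not to, because in the first and second the number of p's equals the number of n's.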
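Edit:
It should be noted that there is no theoretical upper limit on the number of p's and n's, but in practice it is about 50-100. Also, I'm trying to put this into a style checker, so a pure regex is necessary. I understand how you would do this algorithmically.
Something else that might help: in the real application the p's and n's share an attribute, so it looks more like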
p(m1)p(m2)p(m3)jfkds(d1)f(m)ljsdn(m1)n(m2)n(m3)
The closest I've come is using a negative lookahead to match on failure:
p\((m[0-9])\).*(?!n\(\1\))
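It detects a mismatch, but only when the number of n's is less than the number of p's. It is also very inefficient and can cause a stack overflow on large inputs.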
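Answer 0: (score: 1)
Since this is Java, I could see you building the regex with a script, depending on how complex the source data is. At a simple level, the regex would look like this: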
^(p(?!p)(.+?)(?<!n)n|pp(?!p)(.+?)(?<!n)nn|ppp(?!p)(.+?)(?<!n)nnn|pppp(?!p)(.+?)(?<!n)nnnn|ppppp(?!p)(.+?)(?<!n)nnnnn)$
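Here the regex uses a simple alternation to enforce the matching letter counts, and requires a non-empty middle section that does not start with p and does not end with n.
You can replace the (.+?) construct with (.*?) to also allow strings that consist only of leading p's and trailing n's, with no middle substring.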
NODE                     EXPLANATION
--------------------------------------------------------------------------------
^                        the beginning of the string
(                        group and capture to \1:
  p                      'p'
  (?!                    look ahead to see if there is not:
    p                    'p'
  )                      end of look-ahead
  (                      group and capture to \2:
    .+?                  any character except \n (1 or more times (matching the least amount possible))
  )                      end of \2
  (?<!                   look behind to see if there is not:
    n                    'n'
  )                      end of look-behind
  n                      'n'
 |                       OR
  pp                     'pp'
  (?!                    look ahead to see if there is not:
    p                    'p'
  )                      end of look-ahead
  (                      group and capture to \3:
    .+?                  any character except \n (1 or more times (matching the least amount possible))
  )                      end of \3
  (?<!                   look behind to see if there is not:
    n                    'n'
  )                      end of look-behind
  nn                     'nn'
 |                       OR
  ppp                    'ppp'
  (?!                    look ahead to see if there is not:
    p                    'p'
  )                      end of look-ahead
  (                      group and capture to \4:
    .+?                  any character except \n (1 or more times (matching the least amount possible))
  )                      end of \4
  (?<!                   look behind to see if there is not:
    n                    'n'
  )                      end of look-behind
  nnn                    'nnn'
 |                       OR
  pppp                   'pppp'
  (?!                    look ahead to see if there is not:
    p                    'p'
  )                      end of look-ahead
  (                      group and capture to \5:
    .+?                  any character except \n (1 or more times (matching the least amount possible))
  )                      end of \5
  (?<!                   look behind to see if there is not:
    n                    'n'
  )                      end of look-behind
  nnnn                   'nnnn'
 |                       OR
  ppppp                  'ppppp'
  (?!                    look ahead to see if there is not:
    p                    'p'
  )                      end of look-ahead
  (                      group and capture to \6:
    .+?                  any character except \n (1 or more times (matching the least amount possible))
  )                      end of \6
  (?<!                   look behind to see if there is not:
    n                    'n'
  )                      end of look-behind
  nnnnn                  'nnnnn'
)                        end of \1
$                        before an optional \n, and the end of the string
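The idea of building this regex with a script could be sketched in Java roughly as follows. This is only an illustration under assumptions: the class name BalancedRegexBuilder, the method build, and the parameter maxCount are hypothetical names that do not come from the answer, and String.repeat assumes Java 11 or later. For the 50-100 repetitions mentioned in the question, maxCount would be set to about 100.

```java
import java.util.StringJoiner;
import java.util.regex.Pattern;

class BalancedRegexBuilder {
    // Builds ^(p(?!p)(.+?)(?<!n)n|pp(?!p)(.+?)(?<!n)nn|...)$
    // with one alternative per count from 1 to maxCount (hypothetical parameter).
    static String build(int maxCount) {
        StringJoiner alternatives = new StringJoiner("|", "^(", ")$");
        for (int k = 1; k <= maxCount; k++) {
            alternatives.add("p".repeat(k) + "(?!p)(.+?)(?<!n)" + "n".repeat(k));
        }
        return alternatives.toString();
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(build(5));
        String[] samples = {
            "ppppjksdfjlsdkjfnnnn",
            "pppppjksdfjlsdkjfnnnnn",
            "ppppjksdfjlsdkjfnnn"
        };
        for (String s : samples) {
            // Prints each sample string followed by whether the generated regex matches it.
            System.out.println(s + ": " + pattern.matcher(s).matches());
        }
    }
}
```

Calling build(5) should reproduce the five-alternative pattern shown above. The pattern text grows quadratically with maxCount, so it is probably better to generate it once and paste the result into the style checker than to rebuild it on every run.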
Answer 1: (score: 0)
I would rather solve it with a simpler approach:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class CountCheck {
        public static void main(String[] args) {
            final String input1 = "ppppjksdfjlnnsdkjfnnnn";
            final String input2 = "pppppjksdfppnjlsdkjfnnnnn";
            final String input3 = "ppppjksdfjnlsdkjfnnn";
            System.out.println(input1 + ": " + getAnswer(input1));
            System.out.println(input2 + ": " + getAnswer(input2));
            System.out.println(input3 + ": " + getAnswer(input3));
        }

        // Returns the length of the first match of the pattern in the input, or 0 if there is none.
        public static int findAll(String pattern, String input) {
            Pattern p = Pattern.compile(pattern);
            Matcher m = p.matcher(input);
            return m.find() ? m.group(0).length() : 0;
        }

        // True when the leading run of p's is as long as the trailing run of n's.
        public static boolean getAnswer(String input) {
            int number_p = findAll("^p*", input);
            int number_n = findAll("n*$", input);
            return number_p == number_n;
        }
    }
Here is the output:
ppppjksdfjlnnsdkjfnnnn: true
pppppjksdfppnjlsdkjfnnnnn: true
ppppjksdfjnlsdkjfnnn: false
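The same counting idea might be extended to the attribute form from the question, p(m1)p(m2)p(m3)...n(m1)n(m2)n(m3). The sketch below is an assumption-laden illustration, not part of the original answer: the class AttributeRunCheck and its methods are hypothetical names, attributes are assumed to contain no parentheses, and the n attributes are assumed to be required in the same order as the p attributes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AttributeRunCheck {
    // Attributes of the unbroken run of tokens such as p(m1)p(m2) at the start of the input.
    static List<String> leadingAttributes(String input) {
        Matcher m = Pattern.compile("p\\(([^()]*)\\)").matcher(input);
        List<String> attrs = new ArrayList<>();
        int pos = 0;
        // lookingAt() succeeds only if a token starts exactly at the region start.
        while (m.region(pos, input.length()).lookingAt()) {
            attrs.add(m.group(1));
            pos = m.end();
        }
        return attrs;
    }

    // Attributes of the unbroken run of tokens such as n(m1)n(m2) at the end of the input.
    static List<String> trailingAttributes(String input) {
        List<String> attrs = new ArrayList<>();
        Matcher run = Pattern.compile("(?:n\\([^()]*\\))+\\z").matcher(input);
        if (run.find()) {
            Matcher token = Pattern.compile("n\\(([^()]*)\\)").matcher(run.group());
            while (token.find()) {
                attrs.add(token.group(1));
            }
        }
        return attrs;
    }

    // True when both runs carry the same attributes in the same order (assumed requirement).
    static boolean balanced(String input) {
        return leadingAttributes(input).equals(trailingAttributes(input));
    }

    public static void main(String[] args) {
        String sample = "p(m1)p(m2)p(m3)jfkds(d1)f(m)ljsdn(m1)n(m2)n(m3)";
        System.out.println(sample + ": " + balanced(sample));
    }
}
```

Like the code above, this treats an input with no leading p tokens and no trailing n tokens as balanced, since both lists are empty.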
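Answer 2: (score: 0)
Your problem calls for recursion, which is not supported in the Java flavor of regex.
I usually see this problem posed with balanced opening and closing parentheses. If you were using the PCRE flavor, a recursive regex would look like this: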
p((?>[^()]+)|(?R))*n
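A quick way to confirm that java.util.regex does not accept the recursion construct is to try compiling the pattern and catch the resulting exception. The class RecursionSupportCheck and the helper compiles below are hypothetical names used only for this sketch.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class RecursionSupportCheck {
    // Returns true if java.util.regex can compile the given pattern.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The PCRE-style recursive pattern from this answer.
        System.out.println("recursive pattern compiles: " + compiles("p((?>[^()]+)|(?R))*n"));
        // The non-recursive pattern from the question, for comparison.
        System.out.println("plain pattern compiles: " + compiles("p*.*n*"));
    }
}
```

A check like this only shows whether the syntax is accepted. It says nothing about whether a style checker built on another regex engine would behave the same way.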