Regex 1-to-1 matching

Time: 2017-09-29 20:12:54

Tags: java regex

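Is there a technique using regex to ensure that if one pattern appears n times, a different pattern appears n times elsewhere?

For example, let's say I have strings that look like this: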

ppppjksdfjlsdkjfnnnn
pppppjksdfjlsdkjfnnnnn
ppppjksdfjlsdkjfnnn

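My regex looks like p*.*n*, but I want the first two strings to match and the last one not to, because in the first and second the number of p's equals the number of n's.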
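To make the problem concrete, here is a minimal Java sketch showing that p*.*n* accepts all three sample strings, since each quantifier counts independently. The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

public class NaivePatternDemo {
    // The pattern from the question: nothing ties the number of p's to the number of n's.
    static final Pattern NAIVE = Pattern.compile("p*.*n*");

    static boolean matchesNaive(String input) {
        return NAIVE.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "ppppjksdfjlsdkjfnnnn",
            "pppppjksdfjlsdkjfnnnnn",
            "ppppjksdfjlsdkjfnnn"
        };
        for (String s : samples) {
            // Prints "true" for every sample, including the unbalanced third one.
            System.out.println(s + ": " + matchesNaive(s));
        }
    }
}
```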

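Edit:

I should note that there is no theoretical upper limit on the number of p's and n's, but in practice it is around 50-100. Also, I am trying to put this into a stylechecker, so a pure regex is necessary. I understand how you would do this algorithmically.

Something else that may be useful: in the real application the p's and n's share a property, so the input looks more like p(m1)p(m2)p(m3)jfkds(d1)f(m)ljsdn(m1)n(m2)n(m3)

The closest I have come is using a negative lookahead to match on failure: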

p\((m[0-9])\).*(?!n\(\1\))

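This fails when the regex doesn't match, but only if the number of n's is less than the number of p's. It is also very inefficient and could lead to stack overflows on large inputs.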

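3 Answers:

Answer 0: (score: 1)

Since this is Java, I can see you building the regex with a script, depending on how complex your source data is. At a simple level, the regex would look like this: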

^(p(?!p)(.+?)(?<!n)n|pp(?!p)(.+?)(?<!n)nn|ppp(?!p)(.+?)(?<!n)nnn|pppp(?!p)(.+?)(?<!n)nnnn|ppppp(?!p)(.+?)(?<!n)nnnnn)$


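Here the regex uses simple alternation to enforce the number of letters you are looking for, and it requires a middle section that does not start with p and does not end with n.

You can replace the (.+?) construct with (.*?) to also allow strings that consist only of the leading p's and trailing n's, with no center substring.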
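The "build it with a script" idea could be sketched in Java roughly as follows. This is an illustration under assumptions, not the answerer's actual script: the class name, the method name and the maxCount parameter are hypothetical, and the sketch writes each alternative with counted quantifiers (p{k} ... n{k}) and non-capturing groups instead of the literal repeats and capturing groups shown above.

```java
import java.util.StringJoiner;
import java.util.regex.Pattern;

public class AlternationBuilder {
    // Builds ^(?:p{1}(?!p).+?(?<!n)n{1}|p{2}(?!p).+?(?<!n)n{2}|...)$ for k = 1..maxCount.
    static Pattern buildPattern(int maxCount) {
        StringJoiner alternatives = new StringJoiner("|", "^(?:", ")$");
        for (int k = 1; k <= maxCount; k++) {
            // k leading p's not followed by another p, a non-empty middle,
            // then k trailing n's not preceded by another n.
            alternatives.add("p{" + k + "}(?!p).+?(?<!n)n{" + k + "}");
        }
        return Pattern.compile(alternatives.toString());
    }

    public static void main(String[] args) {
        // 100 covers the practical range mentioned in the question.
        Pattern pattern = buildPattern(100);
        System.out.println(pattern.matcher("ppppjksdfjlsdkjfnnnn").matches());   // true
        System.out.println(pattern.matcher("pppppjksdfjlsdkjfnnnnn").matches()); // true
        System.out.println(pattern.matcher("ppppjksdfjlsdkjfnnn").matches());    // false
    }
}
```

Since the stylechecker needs a plain pattern string, the generated text (pattern.pattern()) could be printed once and pasted into its configuration. Strings with more than maxCount p's are rejected rather than checked.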

Regex explanation

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    p                        'p'
--------------------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
      p                        'p'
--------------------------------------------------------------------------------
    )                        end of look-ahead
--------------------------------------------------------------------------------
    (                        group and capture to \2:
--------------------------------------------------------------------------------
      .+?                      any character except \n (1 or more
                               times (matching the least amount
                               possible))
--------------------------------------------------------------------------------
    )                        end of \2
--------------------------------------------------------------------------------
    (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
      n                        'n'
--------------------------------------------------------------------------------
    )                        end of look-behind
--------------------------------------------------------------------------------
    n                        'n'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    pp                       'pp'
--------------------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
      p                        'p'
--------------------------------------------------------------------------------
    )                        end of look-ahead
--------------------------------------------------------------------------------
    (                        group and capture to \3:
--------------------------------------------------------------------------------
      .+?                      any character except \n (1 or more
                               times (matching the least amount
                               possible))
--------------------------------------------------------------------------------
    )                        end of \3
--------------------------------------------------------------------------------
    (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
      n                        'n'
--------------------------------------------------------------------------------
    )                        end of look-behind
--------------------------------------------------------------------------------
    nn                       'nn'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    ppp                      'ppp'
--------------------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
      p                        'p'
--------------------------------------------------------------------------------
    )                        end of look-ahead
--------------------------------------------------------------------------------
    (                        group and capture to \4:
--------------------------------------------------------------------------------
      .+?                      any character except \n (1 or more
                               times (matching the least amount
                               possible))
--------------------------------------------------------------------------------
    )                        end of \4
--------------------------------------------------------------------------------
    (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
      n                        'n'
--------------------------------------------------------------------------------
    )                        end of look-behind
--------------------------------------------------------------------------------
    nnn                      'nnn'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    pppp                     'pppp'
--------------------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
      p                        'p'
--------------------------------------------------------------------------------
    )                        end of look-ahead
--------------------------------------------------------------------------------
    (                        group and capture to \5:
--------------------------------------------------------------------------------
      .+?                      any character except \n (1 or more
                               times (matching the least amount
                               possible))
--------------------------------------------------------------------------------
    )                        end of \5
--------------------------------------------------------------------------------
    (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
      n                        'n'
--------------------------------------------------------------------------------
    )                        end of look-behind
--------------------------------------------------------------------------------
    nnnn                     'nnnn'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    ppppp                    'ppppp'
--------------------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
      p                        'p'
--------------------------------------------------------------------------------
    )                        end of look-ahead
--------------------------------------------------------------------------------
    (                        group and capture to \6:
--------------------------------------------------------------------------------
      .+?                      any character except \n (1 or more
                               times (matching the least amount
                               possible))
--------------------------------------------------------------------------------
    )                        end of \6
--------------------------------------------------------------------------------
    (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
      n                        'n'
--------------------------------------------------------------------------------
    )                        end of look-behind
--------------------------------------------------------------------------------
    nnnnn                    'nnnnn'
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string

Answer 1: (score: 0)

I would rather solve it with a simpler approach:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        final String input1 = "ppppjksdfjlnnsdkjfnnnn";
        final String input2 = "pppppjksdfppnjlsdkjfnnnnn";
        final String input3 = "ppppjksdfjnlsdkjfnnn";
        System.out.println(input1 + ": " + getAnswer(input1));
        System.out.println(input2 + ": " + getAnswer(input2));
        System.out.println(input3 + ": " + getAnswer(input3));
    }

    // Returns the length of the first match of pattern in input, or 0 if there is none.
    public static int findAll(String pattern, String input) {
        Matcher m = Pattern.compile(pattern).matcher(input);
        return m.find() ? m.group(0).length() : 0;
    }

    // True when the leading run of p's is as long as the trailing run of n's.
    public static boolean getAnswer(String input) {
        int number_p = findAll("^p*", input);
        int number_n = findAll("n*$", input);
        return number_p == number_n;
    }
}

Here is the output:

ppppjksdfjlnnsdkjfnnnn: true
pppppjksdfppnjlsdkjfnnnnn: true
ppppjksdfjnlsdkjfnnn: false

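Answer 2: (score: 0)

Your question is asking for recursion, which is not supported in the Java flavor of regex.

I usually see this problem posed with balanced opening and closing parentheses. If you were using the PCRE flavor, a recursive regex would look like this: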

p((?>[^()]+)|(?R))*n
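For the Java flavor specifically, one known workaround that avoids recursion is a lookahead containing a capture group that refers to itself. Each p that is consumed extends the captured run of n's by one, and the capture must then make up the whole tail of the string. The sketch below is not taken from any of the answers. It assumes the simplified p...n form of the question, and it assumes the middle section contains no p or n. The class and method names are made up.

```java
import java.util.regex.Pattern;

public class SelfReferenceDemo {
    // For every leading p, the lookahead skips the remaining p's and the middle,
    // then captures the previously captured n's plus one more n into group 1.
    // \1?+ is possessive: once group 1 is set, it must be reused in full.
    // After the loop, the string must end with the middle followed by exactly group 1.
    static final Pattern BALANCED =
            Pattern.compile("^(?:p(?=p*[^pn]*(\\1?+n)))+[^pn]*\\1$");

    static boolean isBalanced(String input) {
        return BALANCED.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("ppppjksdfjlsdkjfnnnn"));   // true
        System.out.println(isBalanced("pppppjksdfjlsdkjfnnnnn")); // true
        System.out.println(isBalanced("ppppjksdfjlsdkjfnnn"));    // false
    }
}
```

Unlike a generated alternation, this pattern has no built-in upper bound on the count. Adapting it to the p(m1)...n(m1) form from the question would take further work that is not attempted here.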