Question

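Could you please explain why the Perl regular expression

$text = '150 45.5 678,68767 15.10.14';
$text =~ m/[0-9]+[.,]?[0-9]+[^.]/;

captures
150
45.5
678,68767
15.10
14?

I want to exclude 15.10.14, which is why I added [^.], but it does not work the way I expected. I guess the expression is being interpreted as:
[0-9]+ => 15.1
[^.] => 0
But I don't know how to rewrite it so that it only matches numbers with a single . or , and excludes dates, which contain two dots. Could you help me, please? Thank you very much.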

Answer 1

Because of the [^.], your regex matches 15.10 as a result of backtracking.

The reason it matches 15.10 in 15.10.14 is:

[0-9]+[.,]?[0-9]+[^.]
  ^Matches the 15

[0-9]+[.,]?[0-9]+[^.]
        ^Matches the .

[0-9]+[.,]?[0-9]+[^.]
             ^Matches the 10

[0-9]+[.,]?[0-9]+[^.]
                   ^ Causes the backtracking because of the . at position .14

Backtracking: the second [0-9]+ gives back the 0, so only 15.1 has been consumed so far.

[0-9]+[.,]?[0-9]+[^.]
             ^ Now matches the 1

[0-9]+[.,]?[0-9]+[^.]
                  ^ Now matches the 0

Match found!
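
The steps above can be checked directly. This is a minimal sketch, assuming capture groups are wrapped around each piece of the original pattern purely for inspection; the groups and the variable names `$date` and `@parts` are not part of the asker's code.

```perl
use strict;
use warnings;

my $date = '15.10.14';

# Same pattern as in the question, with capture groups added only
# so we can see which piece consumed which characters.
my @parts = $date =~ /([0-9]+)([.,]?)([0-9]+)([^.])/;

# After backtracking, the second [0-9]+ keeps only "1" and [^.] takes "0".
print join('|', @parts), "\n";   # prints 15|.|1|0
```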

You can use an atomic group:

(?>[0-9]+[.,]?[0-9]+)[^.]
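
A short sketch of how the atomic group behaves on the sample text from the question. The names `$re`, `$text`, `$trailing`, `@found` and `@tail` are introduced here for illustration only.

```perl
use strict;
use warnings;

# The atomic group (?>...) discards its backtracking positions once it
# has matched, so [0-9]+ can no longer give back the trailing 0.
my $re = qr/(?>[0-9]+[.,]?[0-9]+)[^.]/;

my $text  = '150 45.5 678,68767 15.10.14';
my @found = $text =~ /$re/g;

# Each match also includes the character consumed by [^.] (a space here).
print "[$_]\n" for @found;   # prints [150 ] [45.5 ] [678,68767 ], one per line

# Caveat: when a non-dot character follows the date, the engine can
# still start a new match in the middle of it.
my $trailing = '15.10.14 ';
my @tail     = $trailing =~ /$re/g;
print "[$_]\n" for @tail;    # prints [10.14 ]
```

With the date at the very end of the string nothing from it is matched, but the second case suggests that the atomic group alone may not be enough if more text follows the date.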

Answer 2

I think you are trying to match valid numbers.

(?<!\S)\d+(?:,\d+)?(?:\.\d+)?(?!\S)

OR

(?<!\S)(?!\d+\.\d+\.)[0-9]+[.,]?[0-9]+(?!\S)
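
Both patterns can be tried against the sample text from the question. This is a sketch only; the variable names `$text`, `@first` and `@second` are introduced here for illustration.

```perl
use strict;
use warnings;

my $text = '150 45.5 678,68767 15.10.14';

# First pattern: digits, an optional ",digits" part, an optional ".digits"
# part, bounded on both sides by whitespace or the edge of the string.
my @first = $text =~ /(?<!\S)\d+(?:,\d+)?(?:\.\d+)?(?!\S)/g;

# Second pattern: the negative lookahead rejects anything that starts
# with digits.digits. before the original number pattern is tried.
my @second = $text =~ /(?<!\S)(?!\d+\.\d+\.)[0-9]+[.,]?[0-9]+(?!\S)/g;

print "first:  @first\n";    # first:  150 45.5 678,68767
print "second: @second\n";   # second: 150 45.5 678,68767
```

The two are not equivalent in general: the second pattern needs at least two digits, so a lone 5 would not match, while the first one accepts it.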

Answer 3

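Your interpretation of the match is correct. A regex must find a match if one is at all possible, so the regex engine will give back characters from [0-9]+ in order to try to match [^.].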

use strict; 
use warnings; 
use 5.016;
use Data::Dumper;

my $text = '150 45.5 678,68767 15.10.14';

my @words = split " ", $text;

for my $word (@words) {
    my $dot_count = () = $word =~ /[.]/gxms;  # `goatse` operator: list assignment in scalar context yields the number of matches

    if($dot_count < 2) {
        say $word;
    }
}

--output:--
150
45.5
678,68767

Alternatively, this also works:

while ($text =~ /([^ ]+)/gxms) { 
    my $word = $1;
    my $dot_count = () = $word =~ /[.]/gxms;

    if($dot_count < 2) {
        say $word;
    }
}
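
The dot count in the two snippets above could also be taken with tr///, which returns the number of characters it matched. A minimal sketch, assuming the same sample text; `@numbers` is a name introduced here and this variant is not from the original answer.

```perl
use strict;
use warnings;

my $text = '150 45.5 678,68767 15.10.14';

# tr/.// with an empty replacement list counts the dots and leaves
# the string unchanged; keep only words with fewer than two dots.
my @numbers = grep { ($_ =~ tr/.//) < 2 } split ' ', $text;

print "$_\n" for @numbers;   # prints 150, 45.5 and 678,68767, one per line
```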

Regex: exclude numbers with two dots
