Question

I'm editing a Perl file, but I don't understand this regular expression comparison. Can someone explain it to me?

if ($lines =~ m/(.*?):(.*?)$/g) { } ..

What is happening here? $lines is a line from a text file.

Answer 1

Breaking it into parts:

$lines =~ m/ (.*?)      # Match any character (except newlines)
                        # zero or more times, not greedily, and
                        # stick the results in $1.
             :          # Match a colon.
             (.*?)      # Match any character (except newlines)
                        # zero or more times, not greedily, and
                        # stick the results in $2.
             $          # Match the end of the line.
           /gx;

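So this will match strings like ":" (it matches zero characters, then a colon, then zero characters before the end of the line; $1 and $2 are empty strings), or "abc:" ($1 = "abc", $2 is an empty string), or "abc:def:ghi" ($1 = "abc" and $2 = "def:ghi").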

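If you pass in a line that doesn't match (which looks like it would happen only when the string contains no colon), the code inside the braces is not run. If it does match, the code inside the braces can use the special $1 and $2 variables (at least until the next regular expression match, if there is one inside the braces).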
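To check the three cases above, here is a minimal sketch. The sample strings (including the one with no colon) are made up for illustration, and the g modifier is left off because each string is only tested once.

```perl
use strict;
use warnings;

# Hypothetical sample inputs; only the first three come from the explanation above.
my @results;
for my $lines (":", "abc:", "abc:def:ghi", "no colon here") {
    if ($lines =~ m/(.*?):(.*?)$/) {
        # $1 is everything before the first colon, $2 is everything after it.
        push @results, join('|', $1, $2);
    }
    else {
        push @results, "no match";
    }
}
print "$_\n" for @results;
```

The captures are joined with "|" only so that empty strings are visible in the output.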

Answer 2

There is a tool that helps with understanding regular expressions: YAPE::Regex::Explain.

Ignoring the g modifier, which is not needed here:

use strict;
use warnings;
use YAPE::Regex::Explain;

my $re = qr/(.*?):(.*?)$/;
print YAPE::Regex::Explain->new($re)->explain();

__END__

The regular expression:

(?-imsx:(.*?):(.*?)$)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  :                        ':'
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

See also perldoc perlre.

Answer 3

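It was written by someone who either knows too much about regular expressions or not enough about the $' and $` variables.

This could have been written as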

if ($lines =~ /:/) {
    ... # use $` ($PREMATCH)  instead of $1
    ... # use $' ($POSTMATCH) instead of $2
}

or

if ( ($var1,$var2) = split /:/, $lines, 2 and defined($var2) ) {
    ... # use $var1, $var2 instead of $1,$2
}
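As a rough comparison of the split version with the original regex, here is a small sketch. The sample line and the names $re1 and $re2 are my own, not from the question.

```perl
use strict;
use warnings;

my $lines = "abc:def:ghi";    # made-up sample line, no trailing newline

# Original approach: capture both sides of the first colon.
my ($re1, $re2);
if ($lines =~ /(.*?):(.*?)$/) {
    ($re1, $re2) = ($1, $2);
}

# split with a limit of 2 stops after the first colon,
# so later colons stay in the second field.
my ($var1, $var2) = split /:/, $lines, 2;

print "regex: $re1 | $re2\n";
print "split: $var1 | $var2\n";
```

One difference to watch for: if the line still ends in a newline, $2 from the regex excludes it (because . does not match "\n" and $ can match just before a final newline), while the second field from split keeps it. Calling chomp on the line first makes the two agree.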

Answer 4

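(.*?) captures any characters, but as few as possible.

So it looks for patterns like <something>:<somethingelse><end of line>, and if there are multiple : in the string, the first one is used as the separator between <something> and <somethingelse>.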

Answer 5

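That line says to perform a regex match on $lines with the regex m/(.*?):(.*?)$/g. It effectively returns true if a match can be found in $lines and false if one cannot.

An explanation of the =~ operator:

Binary "=~" binds a scalar expression to a pattern match. Certain operations search or modify the string $_ by default. This operator makes that kind of operation work on some other string. The right argument is a search pattern, substitution, or transliteration. The left argument is what is supposed to be searched, substituted, or transliterated instead of the default $_. When used in scalar context, the return value generally indicates the success of the operation.

The regex itself is: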

m/    #Perform a "match" operation
(.*?) #Match zero or more repetitions of any characters, but match as few as possible (ungreedy)
:     #Match a literal colon character
(.*?) #Match zero or more repetitions of any characters, but match as few as possible (ungreedy)
$     #Match the end of string
/g    #Match globally; in the scalar context of this if, that means "find the next match in $lines"

So if $lines matches that regex, it will go into the conditional block; otherwise the condition is false and the block is skipped.
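One side effect of the g modifier in an if condition is worth seeing. In scalar context, a /g match remembers where it stopped (readable with pos) and the next /g match on the same variable resumes from there. The sketch below uses a made-up sample string and my own variable names to show that.

```perl
use strict;
use warnings;

my $lines = "abc:def";    # made-up sample line

# First /g match in scalar context succeeds and records where it ended.
my $first = ($lines =~ m/(.*?):(.*?)$/g) ? 1 : 0;
my $pos   = pos($lines);

# Second /g match on the same string resumes at that position,
# where no colon is left, so it fails.
my $second = ($lines =~ m/(.*?):(.*?)$/g) ? 1 : 0;

print "first=$first pos=$pos second=$second\n";
```

If $lines is read fresh for every line of the file this does not matter, which fits the remark in Answer 2 that the g modifier is not needed here.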
