Question

#!/usr/bin/perl
@lines = `perldoc -u -f atan2`;
foreach (@lines) {
  s/\w<([^>]+)>/\U$1/g;
  print;
}

How does the expression s/\w<([^>]+)>/\U$1/g; work?

Answer 1

The substitution breaks down like this:

s/
    \w<         # look for a single word character (letter, digit, or _) followed by <
    ([^>]+)     # capture one or more characters that are not >
    >           # followed by a >
/               ### replace with
   \U           # change following text to uppercase
   $1           # the captured string from above
/gx             # /g means do this as many times as possible per line

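I added the /x modifier so that I could annotate the regular expression. Note that /x only affects the pattern half, so the comments shown in the replacement half are for illustration and would not run as written. The character class [^>] is negated, as indicated by the ^ right after the [, and means "any character other than >".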

For example, in the output of the perldoc command,

X<atan2> X<arctangent> X<tan> X<tangent>

becomes

ATAN2 ARCTANGENT TAN TANGENT
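
To try this without running perldoc, here is a minimal sketch that applies the same substitution to the sample line above. The variable names are mine, and the line is hard-coded rather than read from perldoc. It also shows a form of the annotated regex that does run: the comments live only in the pattern half, where /x applies.

```perl
use strict;
use warnings;

# Sample line copied from the example above (hard-coded, not read from perldoc).
my $line = 'X<atan2> X<arctangent> X<tan> X<tangent>';

# Copy first, then substitute on the copy, so $line stays untouched.
(my $result = $line) =~ s{
    \w<         # one word character followed by <
    ([^>]+)     # capture everything up to the next >
    >           # the closing >
}{\U$1}gx;      # /x applies to the pattern only; the replacement stays compact

print "$result\n";   # prints: ATAN2 ARCTANGENT TAN TANGENT
```

Using s{...}{...} instead of s/.../.../ is just a delimiter choice that keeps the two halves visually separate.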

Answer 2

Here is another way to figure out what it is doing: use the YAPE::Regex::Explain module from CPAN.

Use it like this (this covers only the match half of the search-and-replace):

use strict;
use YAPE::Regex::Explain;

print YAPE::Regex::Explain->new(qr/\w<([^>]+)>/)->explain();

which produces this output:

The regular expression:

(?-imsx:\w<([^>]+)>)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  \w                       word characters (a-z, A-Z, 0-9, _)
----------------------------------------------------------------------
  <                        '<'
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [^>]+                    any character except: '>' (1 or more
                             times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  >                        '>'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

The replacement half of the expression says that whatever was matched between "group and capture to \1" and "end of \1" should be converted to uppercase.
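
If the \U escape looks opaque, one way to see what it does is to compare it with an explicit call to uc() under the /e modifier. This is only a sketch on a made-up sample string; the variable names are hypothetical and not part of the original script.

```perl
use strict;
use warnings;

my $text = 'L<perlfunc> and C<atan2>';   # hypothetical sample string

# \U uppercases whatever follows it in the replacement text.
(my $with_escape = $text) =~ s/\w<([^>]+)>/\U$1/g;

# /e evaluates the replacement as Perl code, so uc($1) does the same job.
(my $with_uc = $text) =~ s/\w<([^>]+)>/uc($1)/ge;

print "$with_escape\n";
print "$with_uc\n";
```

Both lines should come out identical, with each X<...> style marker replaced by its uppercased contents.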

Answer 3

A Perl loop looks like this:

foreach $item (@array)
{
   # Code in here. ($item takes a new value from array each iteration)
}

But Perl lets you leave things out almost everywhere. When you do, the special variable $_ is used.

So in your case:

foreach (@lines) 
{
}

is exactly the same as:

foreach $_ (@lines) 
{
}

Now, inside the body you have the following code:

s/\w<([^>]+)>/\U$1/g;

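The same thing is going on here. The substitution always operates on a variable; when you do not specify one, Perl defaults to $_.

So it is equivalent to: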

$_ =~ s/\w<([^>]+)>/\U$1/g;

Combining the two:

foreach (@lines) {
  s/\w<([^>]+)>/\U$1/g;
  print;
}

is also equivalent to:

foreach $item (@lines)
{
    $item =~ s/\w<([^>]+)>/\U$1/g;
    print $item;
}

I use $item only for readability. It plays exactly the role that $_ played in the original loop.
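
A quick way to convince yourself of the equivalence is to run both loop forms on copies of the same data and compare the results. The sketch below uses made-up sample lines and my own array names. It relies on the fact that a foreach loop variable (whether $_ or a named one) is an alias for the array element, so the substitution changes the array in place.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the perldoc output.
my @lines = ("X<tan> X<tangent>\n", "see C<sin>\n");

# Implicit form: the loop variable and the s/// target both default to $_.
my @implicit = @lines;
foreach (@implicit) {
    s/\w<([^>]+)>/\U$1/g;
}

# Explicit form: a named loop variable bound with =~.
my @explicit = @lines;
foreach my $item (@explicit) {
    $item =~ s/\w<([^>]+)>/\U$1/g;
}

print "both forms agree\n" if join('', @implicit) eq join('', @explicit);
print @implicit;
```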

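A lot of Perl code uses this kind of shortcut. Personally, I think it makes the code harder to read, even for experienced Perl programmers (it is one of the reasons Perl has a reputation for being unreadable). So I always try to name my variables explicitly, although my usage is not typical Perl.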

How for loops work in Perl
