Question

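I'm teaching myself Perl, so I'm new to the language. I've read about regular expressions over and over, but I can't work out how to apply them correctly. I want to do the following:

Say I have a file named "testfile" that contains three lines: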

test this is the first line
test: this is the first line
test; this is the third line

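How can I read and print only the third line, and only what comes after the ; (without the space)? So basically "this is the third line".

This is what I was thinking of doing: $string =~ m/this is the third/

Edit: I made a mistake. In the first and second lines there should be a space before "test"; in the third line there should not. So I want to skip the lines that start with whitespace.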

Answer 1

Reading from STDIN, it might look like this:

while ( <> ) { 
   print $1 if /^test; (.*\n)/;
}
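
As a minimal sketch of why the ^ anchor matters here: assuming the first two lines really do start with a space, as the question's edit describes, only the third line matches. The extract helper and the hard-coded sample lines are made up for illustration, and the sketch captures (.*) rather than (.*\n) so the newline stays out of the capture.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text after "test; " when the line
# starts with exactly that prefix, otherwise undef.
sub extract {
    my ($line) = @_;
    return $line =~ /^test; (.*)/ ? $1 : undef;
}

# Sample data, assuming the leading spaces described in the question's edit.
my @lines = (
    " test this is the first line\n",
    " test: this is the first line\n",
    "test; this is the third line\n",
);

for my $line (@lines) {
    my $rest = extract($line);
    print "$rest\n" if defined $rest;   # only the third line gets here
}
```

Because . does not match a newline by default, the capture stops at the end of the line.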

Answer 2

If you only want the third line, just count the lines and, when you reach it, run:

s/.*;\s*//;

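This removes everything up to the ; and any whitespace after it. Note, however, that .* is greedy, so if the third line contains another ';' you will run into trouble: everything up to the last ';' is removed. If that is a possibility, but there is no chance of an earlier ';' that you want to keep, do this instead: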

s/[^;]*;\s*//;

which removes only up to the first ';' (plus the whitespace that follows it).
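
The difference between the two substitutions can be sketched as follows. The sample line with two semicolons is hypothetical, chosen only to make the greedy behaviour visible.

```perl
use strict;
use warnings;

# Hypothetical sample line containing two semicolons.
my $line = "test; third line; with more";

# Copy the line, then modify the copy in place.
(my $greedy  = $line) =~ s/.*;\s*//;      # .* is greedy: strips up to the LAST ';'
(my $negated = $line) =~ s/[^;]*;\s*//;   # [^;]* cannot cross a ';': stops at the FIRST one

print "$greedy\n";    # with more
print "$negated\n";   # third line; with more
```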

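However, I suspect that in the long run you will want to match every line that has a certain format, and it won't always be "just the third one". If that's the case, then: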

while(<>) {
   if (/;\s*(.*)/) {
       print "$1\n";
   }
}

will get you closer to your ultimate goal.
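
For the line-counting approach mentioned at the start of this answer, one possible sketch uses Perl's built-in $. variable, which holds the current input line number. The in-memory filehandle and sample data are assumptions made to keep the example self-contained; with a real file you would open "testfile" instead.

```perl
use strict;
use warnings;

# Sample data standing in for the contents of "testfile".
my $data = " test this is the first line\n"
         . " test: this is the first line\n"
         . "test; this is the third line\n";

# Open the string as an in-memory file.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $result;
while (my $line = <$fh>) {
    next unless $. == 3;                   # skip everything but line 3
    ($result = $line) =~ s/[^;]*;\s*//;    # drop up to the first ';' and following whitespace
    last;
}
close $fh;

print $result if defined $result;
```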

Answer 3

You might find YAPE::Regex::Explain a handy tool.

Using Axeman's regular expression:

#!/usr/bin/env perl
use strict;
use warnings;
use YAPE::Regex::Explain;
my $expr = q(/^test; (.*\n)/);
print YAPE::Regex::Explain->new( $expr )->explain;

The regular expression:

(?-imsx:/^test; (.*\n)/)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  /                        '/'
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  test;                    'test; '
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
    \n                       '\n' (newline)
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  /                        '/'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

Answer 4

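Another way to do this is to try to remove everything up to the first ; (and any whitespace after it), and to print the line only if something was actually removed.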

s/.*?;\s*//;

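This line basically says: "match any characters (but as few as possible), followed by a semicolon, followed by any whitespace, and replace all of it with nothing".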
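
A small sketch of that behaviour, wrapped in a hypothetical after_semicolon helper so the result is easy to inspect. It relies on s/// returning a false value when nothing matched.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text after the first ';' (and any
# whitespace following it), or undef if the line contains no ';'.
sub after_semicolon {
    my ($line) = @_;
    # s/// returns the number of substitutions made, which is false
    # when the pattern did not match.
    return $line =~ s/.*?;\s*// ? $line : undef;
}

for my $line ("test this is the first line\n", "test; this is the third line\n") {
    my $rest = after_semicolon($line);
    print $rest if defined $rest;   # only the line containing ';' is printed
}
```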

You can then write a program that reads from STDIN:

while (<>) {
  print if s/.*?;\s*//;
}

You can also turn it into a nice one-liner on the command line:

perl -ne 'print if s/.*?;\s*//;'

Read a string of alphanumeric characters after a ;
