Question

I'm a Perl beginner. Can anyone help me extract the data correctly with the script below?

#####################################################################
#! /usr/bin/perl
$text = "Name: Anne Lorrence Name: Burkart Name: Claire Name: Dan" ;
$match = 0 ;
while ($text =~ /Name: \b(\S+)\s+(\S+)\b/g || /Name: \b(\S+)\b/g) {
    ++ $match ;
    print "Match number $match is $1 $2\n" ;
}
######################################################################

I want my output to look like this:

Match number 1 is Anne Lorrence
Match number 2 is Burkart
Match number 3 is Claire
Match number 4 is Dan

But my script actually gives me this:

Match number 1 is Anne Lorrence
Match number 2 is Burkart Name

What is going wrong?

Answer 1

$text = "Name: Anne Lorrence Name: Burkart Name: Claire Name: Dan" ;
$match = 0 ;
while ($text =~ /Name: (.+?)(?= Name:|$)/g) {
    ++ $match ;
    print "Match number $match is $1\n" ;
}

It uses a non-greedy capture and a zero-width positive lookahead to delimit the fields.

Match number 1 is Anne Lorrence
Match number 2 is Burkart
Match number 3 is Claire
Match number 4 is Dan

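The |$) part is an alternation. A simpler example is (ABC|DEF), which means "match 'ABC' or 'DEF'". The $ is the end-of-line anchor; since this string contains no newline, it matches at the end of the string.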
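To make the alternation concrete, here is a minimal sketch. The helper name which_alternative and the sample strings are made up for illustration. Note that | binds more loosely than anything else inside a group, so in (?= Name:|$) the two alternatives are " Name:" and "$".

```perl
use strict;
use warnings;

# Hypothetical helper: return which alternative of (ABC|DEF) matched,
# or undef if neither one occurs in the input.
sub which_alternative {
    my ($input) = @_;
    return $input =~ /(ABC|DEF)/ ? $1 : undef;
}

for my $sample ('xxABCxx', 'xxDEFxx', 'xxGHIxx') {
    my $hit = which_alternative($sample);
    print "$sample: ", defined $hit ? "matched $hit" : "no match", "\n";
}
```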

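Zero-width positive lookahead is explained in the perlre documentation, but I'll summarize it here. It belongs to a class of patterns called "Look-Around Assertions", which is a very accurate name: picture the regex engine "looking around" in the string. The one used here "looks ahead" in the string for a positive match. It is called zero-width because it does not consume any of the string during matching.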
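One way to see what "zero-width" means is to compare where the next /g match would resume with and without the lookahead. This is only a sketch: resume_position is a hypothetical helper, and the second pattern is a deliberately consuming variant written for comparison, not part of the answer's solution.

```perl
use strict;
use warnings;

# Hypothetical helper: match $pattern once with /g and return the offset
# where the next match attempt would start (undef if nothing matched).
sub resume_position {
    my ($string, $pattern) = @_;
    return $string =~ /$pattern/g ? pos($string) : undef;
}

my $text = "Name: Claire Name: Dan";

# Lookahead: " Name:" is only checked, so the match ends right after "Claire".
print resume_position($text, qr/Name: (.+?)(?= Name:|$)/), "\n";   # prints 12

# Consuming group: " Name:" becomes part of the match and is used up.
print resume_position($text, qr/Name: (.+?)(?: Name:|$)/), "\n";   # prints 18
```

With the consuming variant, the label in front of "Dan" has already been eaten, so a following /g iteration would find no further "Name: " to start from.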

So the pattern /Name: (.+?)(?= Name:|$)/ means:

Match "Name: "
Match and capture as little as possible
Until the characters that follow are " Name:" or EOL
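The three steps above can also be written into the pattern itself with the /x modifier, which allows whitespace and comments inside a regex. The sketch below is the same pattern in a different layout; extract_names is a hypothetical wrapper, and it relies on m//g returning all captures at once in list context.

```perl
use strict;
use warnings;

# Hypothetical wrapper: return every captured name, in order.
sub extract_names {
    my ($text) = @_;
    my @names = $text =~ /
        Name:[ ]        # match the literal label; a space is written as [ ] under x
        (.+?)           # capture as few characters as possible ...
        (?=[ ]Name:|$)  # ... until the next label or the end of the string follows
    /gx;
    return @names;
}

my $text  = "Name: Anne Lorrence Name: Burkart Name: Claire Name: Dan";
my $match = 0;
for my $name (extract_names($text)) {
    ++$match;
    print "Match number $match is $name\n";
}
```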

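There may be better ways to solve your task, but this one is short and gives you a look at a less commonly used part of the regex language. Look-arounds are very useful and well worth learning.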
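As one possible alternative, and not the approach used in this answer, the string could be split on the "Name:" label instead of matched field by field. The sketch below assumes the label never appears inside a name; names_by_split is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: split on the "Name:" label and any whitespace around it.
# The label at the very start of the string produces an empty leading field,
# which grep { length } filters out.
sub names_by_split {
    my ($text) = @_;
    return grep { length } split /\s*Name:\s*/, $text;
}

my $text  = "Name: Anne Lorrence Name: Burkart Name: Claire Name: Dan";
my $match = 0;
for my $name (names_by_split($text)) {
    ++$match;
    print "Match number $match is $name\n";
}
```

This avoids look-arounds entirely, at the cost of treating every occurrence of "Name:" as a separator.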
