RegEx用于捕获Perl中的vcard组

时间:2019-05-15 18:17:29

标签: regex regex-lookarounds regex-group vala regex-greedy

这个学期我在大学里学习语法和语义,而正则表达式经常在其中发挥作用。作为锻炼的一种方式,我发现了可以应用正则表达式的不同方案。考虑到VCard就是其中之一,我一直无法指定一些东西来将BEGIN:VCARDEND:VCARD

之间的所有内容分组

请注意,.vcf文件使用行分隔

为此,我最好的模式如下:(尽管我尝试了许多变体

BEGIN:VCARD\n([^(END:VCARD)\n]*END:VCARD

所以想法是:“从开始vcard读取所有不是END:VCARD的内容,并且以换行符结束,直到遇到结束vcard为止”

我正在使用perl变体,但是使用了vala编程语言。

我意识到问题出在我的模式上,但是经过长时间的阅读和反复试验后,我仍然不确定测试人员为什么认为它不起作用。

测试数据:

BEGIN:VCARD
VERSION:3.0
N:Doe;John;;;
FN:John Doe
ORG:Example.com Inc.;
TITLE:Imaginary test person
EMAIL;type=INTERNET;type=WORK;type=pref:johnDoe@example.org
TEL;type=WORK;type=pref:+1 617 555 1212
TEL;type=WORK:+1 (617) 555-1234
TEL;type=CELL:+1 781 555 1212
TEL;type=HOME:+1 202 555 1212
NOTE:John Doe has a long and varied history\, being documented on more police files that anyone else. Reports of his death are alas numerous.
CATEGORIES:Work,Test group
X-ABUID:5AD380FD-B2DE-4261-BA99-DE1D1DB52FBE\:ABPerson
END:VCARD
BEGIN:VCARD
VERSION:3.0
N:Doe;Jane;;;
FN:Jane Doe
ORG:Example.com Inc.;
TITLE:Another Imaginary test person
EMAIL;type=INTERNET;type=WORK;type=pref:johnDoe@example.org
TEL;type=WORK;type=pref:+1 617 555 1213
TEL;type=WORK:+1 (617) 555-1233
TEL;type=CELL:+1 781 555 1213
TEL;type=HOME:+1 202 555 1213
NOTE:Jane Doe has a long and varied history\, being documented on more police files that anyone else. Reports of her death are alas numerous.
CATEGORIES:Work,Test group
X-ABUID:5AD380FD-B2DE-4261-BA99-DE1D1DB52FBE\:ABPerson
END:VCARD

在我最成功的测试中,它标记了从第一个BEGIN:VCARDEND:VCARD之前一行的所有内容

1 个答案:

答案 0 :(得分:3)

此表达式可以帮助您做到这一点:

(BEGIN:VCARD([\s\S]*?)END:VCARD)

Perl测试:

use strict;

my $str = 'BEGIN:VCARD
VERSION:3.0
N:Doe;John;;;
FN:John Doe
ORG:Example.com Inc.;
TITLE:Imaginary test person
EMAIL;type=INTERNET;type=WORK;type=pref:johnDoe@example.org
TEL;type=WORK;type=pref:+1 617 555 1212
TEL;type=WORK:+1 (617) 555-1234
TEL;type=CELL:+1 781 555 1212
TEL;type=HOME:+1 202 555 1212
NOTE:John Doe has a long and varied history\\, being documented on more police files that anyone else. Reports of his death are alas numerous.
CATEGORIES:Work,Test group
X-ABUID:5AD380FD-B2DE-4261-BA99-DE1D1DB52FBE\\:ABPerson
END:VCARD
BEGIN:VCARD
VERSION:3.0
N:Doe;Jane;;;
FN:Jane Doe
ORG:Example.com Inc.;
TITLE:Another Imaginary test person
EMAIL;type=INTERNET;type=WORK;type=pref:johnDoe@example.org
TEL;type=WORK;type=pref:+1 617 555 1213
TEL;type=WORK:+1 (617) 555-1233
TEL;type=CELL:+1 781 555 1213
TEL;type=HOME:+1 202 555 1213
NOTE:Jane Doe has a long and varied history\\, being documented on more police files that anyone else. Reports of her death are alas numerous.
CATEGORIES:Work,Test group
X-ABUID:5AD380FD-B2DE-4261-BA99-DE1D1DB52FBE\\:ABPerson
END:VCARD';
my $regex = qr/(BEGIN:VCARD([\s\S]*?)END:VCARD)/mp;

if ( $str =~ /$regex/g ) {
  print "Whole match is ${^MATCH} and its start/end positions can be obtained via \$-[0] and \$+[0]\n";
  # print "Capture Group 1 is $1 and its start/end positions can be obtained via \$-[1] and \$+[1]\n";
  # print "Capture Group 2 is $2 ... and so on\n";
}

# ${^POSTMATCH} and ${^PREMATCH} are also available with the use of '/p'
# Named capture groups can be called via $+{name}

RegEx

如果这不是您想要的表达式,则可以在https://docs.docker.com/engine/reference/commandline/network_create/中修改/更改表达式。

RegEx电路

您还可以在regex101.com中可视化您的表达式:

jex.im