Regex: how to capture everything before the next matching pattern (or end of file)

Time: 2016-11-01 09:30:58

Tags: regex font-awesome pcre

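I'm trying to process icons.yml (from the FontAwesome project) with a regular expression. (The language is Dyalog APL, which uses the PCRE library; I have set the "case-insensitive" and "dot matches newline" flags there.) Given the following input: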

  - name:       Glass
    id:         glass
    unicode:    f000
    created:    1.0
    categories:
      - Web Application Icons
      - Test1
      - Test2

  - name:       Music
    id:         music
    unicode:    f001
    created:    1.0
    categories:
      - Web Application Icons

  - name:       Search
    id:         search
    unicode:    f002
    created:    1.0
    categories:
      - Web Application Icons

I'm looking for a regex that gives me "name", "id", "unicode", "created" and, finally, the contents of "categories" (I need everything up to the start of the next "- name" or EOF).

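I have managed to put together an expression that returns the first four, but it fails on "categories". Somehow this "EOF or not '- name'" condition gives me a mental overflow ;-)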

.*-\sname:\s*([a-z\-]*)\s*id:\s*([a-z\-]*)\s*unicode:\s*([0-9a-f]{4})\s*created:\s*([0-9\.]*)\s*categories:\s*((?!-\sname:))

1 Answer:

Answer 0: (score: 2)

You can try this:

name:(.*?)id:(.*?)unicode:(.*?)created:(.*?)categories:(.*?)(?=- name|$)

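Explanation: each (.*?) is a lazy group that captures as little as possible up to the next literal key, and the lookahead (?=- name|$) ends the last group just before the next "- name" or at the end of the input, without consuming it. This relies on the "dot matches newline" flag being set, and on multiline mode being off so that $ only matches at the very end of the input (or just before a trailing newline).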
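To see what the five groups actually contain, here is a minimal sketch in Python rather than APL or Perl. It assumes that Python's `re` module treats the lazy groups and the lookahead the same way PCRE does for this pattern; `SAMPLE`, `PATTERN` and `parse_icons` are names made up for the illustration, and the input is the sample from the question.

```python
import re

# Sample input copied from the question (three records).
SAMPLE = """\
  - name:       Glass
    id:         glass
    unicode:    f000
    created:    1.0
    categories:
      - Web Application Icons
      - Test1
      - Test2

  - name:       Music
    id:         music
    unicode:    f001
    created:    1.0
    categories:
      - Web Application Icons

  - name:       Search
    id:         search
    unicode:    f002
    created:    1.0
    categories:
      - Web Application Icons
"""

# Same pattern as in the answer. DOTALL corresponds to "dot matches newline",
# IGNORECASE to the "case-insensitive" flag mentioned in the question.
PATTERN = re.compile(
    r"name:(.*?)id:(.*?)unicode:(.*?)created:(.*?)categories:(.*?)(?=- name|$)",
    re.DOTALL | re.IGNORECASE,
)


def parse_icons(text):
    """Return one tuple of five whitespace-trimmed fields per record."""
    return [tuple(g.strip() for g in m.groups()) for m in PATTERN.finditer(text)]


records = parse_icons(SAMPLE)
for name, icon_id, code, created, categories in records:
    print(name, icon_id, code, created, repr(categories))
```

The groups come back with the surrounding whitespace and newlines still attached, which is why the sketch trims them afterwards; the fifth group is the raw block of category lines, not yet a list.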

Perl example:

#!/usr/bin/perl

use strict;
use warnings;

my $str = '- name:      Glass
id:         glass
unicode:    f000
created:    1.0
categories:
  - Web Application Icons
  - Test1
  - Test2

- name:       Music
id:         music
unicode:    f001
created:    1.0
categories:
  - Web Application Icons

- name:       Search
id:         search
unicode:    f002
created:    1.0
categories:
  - Web Application Icons1
';
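# /s lets "." match newlines; /p makes ${^MATCH} available after a match.
my $regex = qr/name:(.*?)id:(.*?)unicode:(.*?)created:(.*?)categories:(.*?)(?=- name|$)/sp;

# Walk through the input one record at a time.
while ( $str =~ /$regex/g ) {
  print "Whole match is ${^MATCH}\n";
}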
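The expression from the question can probably be repaired as well. Its last group, ((?!-\sname:)), contains only a zero-width lookahead and therefore captures nothing, and with "dot matches newline" set the leading .* is greedy enough to skip ahead to the last record. One possible fix, sketched here in Python under the assumption that `re` behaves like PCRE for these constructs, is to drop the leading .* and repeat "any character that does not start '- name:'" inside the final group. The names `RECORD` and `parse_icons` are hypothetical, and the character classes are kept as written in the question, so they only fit names like those in the sample.

```python
import re

# Two records from the question's sample are enough to show the idea.
SAMPLE = """\
  - name:       Glass
    id:         glass
    unicode:    f000
    created:    1.0
    categories:
      - Web Application Icons
      - Test1
      - Test2

  - name:       Music
    id:         music
    unicode:    f001
    created:    1.0
    categories:
      - Web Application Icons
"""

# The question's expression without the leading ".*"; the last group now
# consumes one character at a time as long as "- name:" does not start there.
RECORD = re.compile(
    r"-\sname:\s*([a-z\-]*)\s*id:\s*([a-z\-]*)\s*unicode:\s*([0-9a-f]{4})"
    r"\s*created:\s*([0-9.]*)\s*categories:\s*((?:(?!-\sname:).)*)",
    re.DOTALL | re.IGNORECASE,
)


def parse_icons(text):
    """Return one dict per record, with the categories split into a list."""
    icons = []
    for m in RECORD.finditer(text):
        name, icon_id, code, created, raw = m.groups()
        # Each category sits on its own line and starts with "- ".
        categories = [
            line.strip()[1:].strip()
            for line in raw.splitlines()
            if line.strip().startswith("-")
        ]
        icons.append({"name": name, "id": icon_id, "unicode": code,
                      "created": created, "categories": categories})
    return icons


icons = parse_icons(SAMPLE)
for icon in icons:
    print(icon)
```

Because the last group simply runs until the next "- name:" or until the input is exhausted, no explicit end-of-file alternative is needed in this variant.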
