
Time: 2019-09-23 20:19:20

Tags: regex perl


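  • Fields are delimited by alphanumeric tags that begin with a special character ('%' in the example below)
  • The tag text ends with a space
  • The field's content ends with ','
  • Field contents never contain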



%a astuff,%b bstuff,%t this,%u that,%v this,%t that,%x the other,%xx only once,%q the other,%z the other,%c cstuff
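
To make the field format above concrete, here is a minimal sketch that splits such a line into tag/content pairs. The helper name parse_fields is hypothetical, and the sketch assumes that the marker is '%' and that content never contains ','.

```perl
use strict;
use warnings;

# Hypothetical helper: split a tagged line into [tag, content] pairs.
# Assumptions: the marker defaults to '%', tags are alphanumeric,
# and field content never contains ','.
sub parse_fields {
    my ($line, $marker) = @_;
    $marker //= '%';
    my @fields;
    # marker, tag, one space, then content up to the next ',' or end of string
    while ($line =~ /\Q$marker\E([A-Za-z0-9]+)\ ([^,]*)(?:,|\z)/g) {
        push @fields, [$1, $2];
    }
    return @fields;
}

my @fields = parse_fields('%a astuff,%b bstuff,%t this,%xx only once,%c cstuff');
printf "%-3s => %s\n", @$_ for @fields;
```

Each pair keeps the tag without the marker and the content without the trailing ','.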





def get_next_smallest(data,default=0):
    """
        returns the discounted value for all items in a list
        discounted value is the next smaller item in the list, e.g.:
        for any n, the next smallest item is the first item in data[n+1:] < data[n]
        provides O(n) complexity solution.
    """
    discounts=[default for i in data] # stores the corresponding next smaller value
    stack = [] # initialize our empty stack
    for i, this in enumerate(data):
        # every pending index whose value is larger than `this` has found its next smaller item
        while len(stack) > 0 and this < data[stack[-1]]:
            discounts[stack.pop()] = this
        stack.append(i) # this item still awaits its own next smaller value
    return discounts

def get_total(data):
    init_total = sum(data)
    default = 0  # should be a value that will NOT be present in the data, like 0 or -1
    discounts = get_next_smallest(data, default)
    full = [i for i,v in enumerate(discounts) if v == default]
    total = init_total - sum(discounts)
    return total, full


my $tagmrkr='%';
my $line='%a astuff,%b bstuff,%t this,%u that,%v this,%t that,%x the other,%xx only once,%q the other,%z the other,%c cstuff';

my $searchtags = qr/t|u|v|w|x|xx|y|z/; # excludes q

print qq/The line:$line\n\n/;
for ($line =~ m/
    $tagmrkr$searchtags\ ([^\,]*,)
    .*?   # lazily skip any fields in between
    $tagmrkr$searchtags\ \1
    /gx) {
        print qq/First field contents:$1\n/;
        print qq/Entire match:$&\n/;
        print qq/\n/;
}


The line:%a astuff,%b bstuff,%t this,%u that,%v this,%t that,%x the other,%xx only once,%q the other,%z the other,%c cstuff

First field contents:this,
Entire match:%t this,%u that,%v this,

First field contents:the other,
Entire match:%x the other,%xx only once,%q the other,%z the other,

The output above is what I expected, but the script actually prints:

The line:%a astuff,%b bstuff,%t this,%u that,%v this,%t that,%x the other,%xx only once,%q the other,%z the other,%c cstuff

First field contents:the other,
Entire match:%x the other,%xx only once,%q the other,%z the other,

First field contents:the other,
Entire match:%x the other,%xx only once,%q the other,%z the other,

Why is $1 from the first match replaced with the value from the second match?





2 answers:

Answer 0 (score: 3)


use warnings;
use strict;
use feature 'say';

my $s = q(%a astuff,%b bstuff,%t this,%u that,%v this,%t that,)
      . q(%x the other,%xx only once,%q the other,%z the other,%c cstuff);

my $m = qr/%/;
my $t = qr/(?:t|u|v|w|x|xx|y|z)/;

# The lookahead captures the rest of the match without consuming it,
# so the next search can start inside the previous match.
while ($s =~ / $m$t \s ([^,]+) , (?=(.*?$m$t\s\g{1},?)) /gx) {
    say "capture: $1";
    say "  whole: $1,$2";
}


capture: this
  whole: this,%u that,%v this,
capture: that
  whole: that,%v this,%t that,
capture: the other
  whole: the other,%xx only once,%q the other,%z the other,
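
As for why the question's for loop printed the second match twice, here is a minimal sketch using a simplified, made-up pattern and sample string (not the ones from the question). In list context, which is what for supplies, m//g performs every match before the loop body runs, so $1 and $& only reflect the last successful match. In scalar context, such as a while condition, each iteration performs exactly one match.

```perl
use strict;
use warnings;

my $text = 'k1=a k2=b k3=c';   # hypothetical sample data
my (@for_seen, @while_seen);

# List context: all matches run first; the loop then iterates over the
# returned captures, while $1 stays at the value from the last match.
for my $capture ($text =~ /(\w+)=\w/g) {
    push @for_seen, "$capture/$1";
}

# Scalar context: one match per iteration, so $1 is current each time.
while ($text =~ /(\w+)=\w/g) {
    push @while_seen, $1;
}

print "for:   @for_seen\n";
print "while: @while_seen\n";
```

Inside the for loop the loop variable holds the right capture for each iteration, but $1 is the same on every pass, which is the behavior seen in the question.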

Answer 1 (score: 0)


You can get all three matches by resetting pos $line on each iteration, for example with the following approach:

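while ($line =~ m/
      $tagmrkr$searchtags\ ([^\,]*,)
      .*?   # lazily skip any fields in between
      $tagmrkr$searchtags\ \1
   /gx) {
    pos $line = $-[0] + 1;   # restart the search one character past this match's start
    print qq/First field contents:$1\n/;
    print qq/Entire match:$&\n/;
    print qq/\n/;
}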


The line:%a astuff,%b bstuff,%t this,%u that,%v this,%t that,%x the other,%xx only once,%q the other,%z the other,%c cstuff

First field contents:this,
Entire match:%t this,%u that,%v this,

First field contents:that,
Entire match:%u that,%v this,%t that,

First field contents:the other,
Entire match:%x the other,%xx only once,%q the other,%z the other,
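
The same pos-rewinding idea can be sketched in isolation. The helper overlapping_matches and the sample string are hypothetical; the point is only to contrast a plain m//g, which resumes after the end of each match, with a loop that rewinds to one character past each match's start.

```perl
use strict;
use warnings;

# Hypothetical helper: return every match of $re in $text, including
# overlapping ones, by restarting one character past each match start.
sub overlapping_matches {
    my ($text, $re) = @_;
    my @found;
    while ($text =~ /$re/g) {
        push @found, substr($text, $-[0], $+[0] - $-[0]);
        pos($text) = $-[0] + 1;   # rewind so the next search may overlap this match
    }
    return @found;
}

my $sample  = 'abababa';
my @plain   = $sample =~ /aba/g;                      # resumes after each match
my @overlap = overlapping_matches($sample, qr/aba/);  # also finds the middle one
print "plain:   ", scalar(@plain), " matches\n";
print "overlap: ", scalar(@overlap), " matches\n";
```

This sketch assumes the pattern never matches the empty string; a zero-length match would need extra care when setting pos.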