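I need some help with regular expressions. Please see the example below. I am capturing specific rid values contained between this
","children":[
and ending with this
}]}]}
as shown below.
My problem is that the block shown below repeats several times, and I want all the rids, but only those between the start of ","children":[ and }]}]} in each block.
I know I can capture a rid with rid":"([\w\d\-\."]+)
but I don't know how to specify capturing every rid":"([\w\d\-\."]+)
that exists between the start of ","children":[
and }]}]}
Example: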
","children":[{"type":"stub","context":"","rid":"b1c4922237ce.ee6a3644443fe.10711226e93.d0af7aadbd0-4be3-4353ddd.8b47.f2f4aaf2474f","metaclass":"ASAPModel.BarrierCategory"}, {"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.290c6e93.91c15f91-a1c-4c36.9939.4ab7b94a39ad","metaclass":"ASAPModel.BarrierCategory"}, {"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.27c3ee93.22e90c22-7406-463a.8bff.f6ea88f6ffcc","metaclass":"ASAPModel.BarrierCategory"}, {"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.6a182e93.5c0e7d5c-ff65-451d.afc0.cfc7fbcfc02d","metaclass":"ASAPModel.BarrierCategory"}, {"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.6970ae93.8ea3978e-112b-4bbb.8405.d17071d105d2","metaclass":"ASAPModel.BarrierCategory"}]}]}, ","children":[{"type":"stub","context":"","rid":"b1c4922237ce.ee6a3644443fe.10711226e93.d0af7aadbd0-4be3-4353ddd.8b47.f2f4aaf2474f","metaclass":"ASAPModel.BarrierCategory"}, {"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.290c6e93.91c15f91-a1c-4c36.9939.4ab7b94a39ad","metaclass":"ASAPModel.BarrierCategory"}, {"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.27c3ee93.22e90c22-7406-463a.8bff.f6ea88f6ffcc","metaclass":"ASAPModel.BarrierCategory"}, {"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.6a182e93.5c0e7d5c-ff65-451d.afc0.cfc7fbcfc02d","metaclass":"ASAPModel.BarrierCategory"}, {"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.6970ae93.8ea3978e-112b-4bbb.8405.d17071d105d2","metaclass":"ASAPModel.BarrierCategory"}]}]},
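My problem is that I don't understand how to specify the start and end values of a non-capturing group, and how to match one or more capture groups inside it, similar to []+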
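Answer 0 (score: 6)
This looks like JSON (although your sample data is incomplete and therefore invalid).
If so, the JSON module from CPAN is probably the best way forward: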
    use strict;
    use warnings;
    use feature 'say';        # needed for say()
    use JSON qw( from_json );

    # my example data
    my $data = q( [
        {"children":[ {"type":"stub","rid":"aa"}, {"type":"stub2","rid":"bb"} ] },
        {"children":[ {"type":"stub","rid":"cc"}, {"type":"stub2","rid":"dd"} ] } ]
    );

    # decode the JSON text into a Perl data structure (an array ref of hash refs)
    my $json = from_json( $data );

    for my $rec ( @$json ) {
        for my $child ( @{ $rec->{children} } ) {
            say "rid: ", $child->{rid};
        }
    }
Prints:
    rid: aa
    rid: bb
    rid: cc
    rid: dd
Answer 1 (score: 1)
You need to break this down into two steps: isolate the children block, then get the rids from it.
    # Make sure you get the first one ($record holds the input text)
    my ( $child ) = $record =~ m/"children":\[([^\]]+)\]/g;
    # Get all in span - the /g modifier tells the regex to return every match ('global')
    my @rids = $child =~ m/"rid":"([^"]+)"/g; # <-- /g modifier
But it looks like JSON, and you can parse data like this with JSON::Syck.
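Since the question says the block repeats several times, the same two steps can be applied to every block rather than only the first. The following is a minimal sketch, not part of the original answer: the sample string in $record is hypothetical and only shaped like the question's data, and it assumes no "]" appears inside a children block before its closing bracket.

```perl
use strict;
use warnings;

# Hypothetical sample input: two "children" blocks shaped like the question's data.
my $record = q(","children":[{"type":"stub","rid":"aa"}, {"type":"stub","rid":"bb"}]}]}, )
           . q(","children":[{"type":"stub","rid":"cc"}]}]},);

my @blocks;

# Step 1: in scalar context, /g resumes after the previous match,
# so each pass of the loop sees the next children block.
while ( $record =~ m/"children":\[([^\]]+)\]/g ) {
    my $child = $1;    # copy before the inner match overwrites $1

    # Step 2: in list context, /g returns every captured rid in this block.
    my @rids = $child =~ m/"rid":"([^"]+)"/g;
    push @blocks, \@rids;
}

# One line of output per block, rids separated by commas.
print join( ',', @$_ ), "\n" for @blocks;
```

Keeping the rids grouped per block (an array of array refs) preserves which block each rid came from; flatten with map { @$_ } @blocks if only the full list is needed.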
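Answer 2 (score: 0)
Something like \",\"children\":(.*)(?=\\]\\}\\]\\})
Play around with it.
The forum is swallowing some of my backslashes, so be warned that you may need to double them.
In response to the edit:
First try to break the data into groups inside the brackets, then run one search on each group in a for loop. You can use regex groups to grab all the groups at once.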
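One thing to watch when trying the lookahead idea on repeated blocks is greediness: a greedy (.*) runs to the last ]}]} in the text, while a non-greedy (.*?) stops at the nearest one. The sketch below illustrates the difference in Perl. It is an assumption-laden illustration rather than the answerer's exact pattern: the sample string in $text is hypothetical, and the pattern is written with single backslashes as Perl regex literals expect.

```perl
use strict;
use warnings;

# Hypothetical sample input with two blocks, shaped like the question's data.
my $text = q(","children":[{"rid":"aa"}, {"rid":"bb"}]}]}, ","children":[{"rid":"cc"}]}]},);

# Greedy: (.*) extends to the LAST "]}]}", so both blocks merge into one group.
my @greedy = $text =~ m/","children":\[(.*)(?=\]\}\]\})/g;

# Non-greedy: (.*?) stops at the NEAREST "]}]}", giving one group per block.
my @groups = $text =~ m/","children":\[(.*?)(?=\]\}\]\})/g;

printf "greedy groups: %d, non-greedy groups: %d\n", scalar @greedy, scalar @groups;

# Then search each group separately for its rids.
for my $group (@groups) {
    my @rids = $group =~ m/"rid":"([^"]+)"/g;
    print "block: @rids\n";
}
```

Note that . does not match a newline by default, so input spanning several lines would need the /s modifier on both patterns.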