Perl regex: square brackets and single quotes

Time: 2016-09-23 12:28:54

Tags: regex perl brackets

I have this string:

ABC,-0.5,10Y,10Y,['TEST'],ABC.1000145721ABC,-0.5,20Y,10Y,['TEST'],ABC.1000145722

The data repeats.

I need to remove the [, ] and ' characters from the data so that it looks like this:

ABC,-0.5,10Y,10Y,TEST,ABC.1000145721ABC,-0.5,20Y,10Y,TEST,ABC.1000145722

I am also trying to split the data and assign it to variables, like this:

my($currency, $strike, $tenor, $tenor2,$ado_symbol) = split /,/, $_;

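This works for everything except the ['TEST'] part. Should I remove the [, ] and ' characters first and keep my split as it is, or is there an easier way to do this?

Thanks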

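5 answers:

Answer 0: (score: 3)

One useful thing here is that split takes a regular expression. (It even lets you capture, but the captured text is inserted into the returned list, which is why I have used (?: for non-capturing groups.)

I notice that in your data the [' and '] only ever appear next to a delimiter, so how about: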

#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;

while ( <DATA> ) {
  chomp;
  # An optional '] before the comma and an optional [' after it are consumed as part of the delimiter
  my @fields = split /(?:'\])?,(?:\[')?/;
  print Dumper \@fields;
}

__DATA__
ABC,-0.5,10Y,10Y,['TEST'],ABC.1000145721ABC,-0.5,20Y,10Y,['TEST'],ABC.1000145722

Output:

$VAR1 = [
          'ABC',
          '-0.5',
          '10Y',
          '10Y',
          'TEST',
          'ABC.1000145721ABC',
          '-0.5',
          '20Y',
          '10Y',
          'TEST',
          'ABC.1000145722'
        ];
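
The remark about captures can be checked with a small sketch. The shortened input line below is an assumption made for illustration, not part of the original data; the two patterns differ only in whether the groups capture.

```perl
use strict;
use warnings;

# Shortened sample line (an assumption for illustration only)
my $line = "10Y,['TEST'],ABC.1";

# Capturing groups: for every delimiter, split also returns what each group
# captured (undef when a group did not take part in the match).
my @with_captures = split /('\])?,(\[')?/, $line;

# Non-capturing groups: only the fields themselves come back.
my @fields = split /(?:'\])?,(?:\[')?/, $line;

print scalar(@with_captures), " elements with capturing groups\n";
print scalar(@fields), " elements with non-capturing groups: ", join('|', @fields), "\n";
```

With capturing groups the list holds the three fields plus two extra elements per delimiter, which is why the non-capturing form is the one to use when the result is assigned straight to variables.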

Answer 1: (score: 1)

my $str = "ABC,-0.5,10Y,10Y,['TEST'],ABC.1000145721ABC,-0.5,20Y,10Y,['TEST'],ABC.1000145722";

$str =~ s/\['|'\]//g;

print $str;

Output

ABC,-0.5,10Y,10Y,TEST,ABC.1000145721ABC,-0.5,20Y,10Y,TEST,ABC.1000145722

Now you can split.
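
Putting the two steps together might look like the sketch below. It assumes a single record per string (the sample in the question has two records run together), and reuses the variable names from the question.

```perl
use strict;
use warnings;

# One record only (an assumption; the question's sample contains two records)
my $str = "ABC,-0.5,10Y,10Y,['TEST'],ABC.1000145721";

# Remove every [' and '] first, then split on commas exactly as in the question.
$str =~ s/\['|'\]//g;
my ($currency, $strike, $tenor, $tenor2, $ado_symbol) = split /,/, $str;

print "\$ado_symbol = $ado_symbol\n";
```

Any fields beyond the fifth are simply discarded by the list assignment.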

Answer 2: (score: 0)

Clean up $ado_symbol after the split:

$ado_symbol =~ s/^\['//;
$ado_symbol =~ s/'\]$//;
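
For context, here is a sketch of how this could sit after the split from the question. It assumes one record per line and, as a variation, folds the two statements into a single substitution with two anchored alternatives; the two-statement form above behaves the same way on this input.

```perl
use strict;
use warnings;

# One record only (an assumption for illustration)
my $line = "ABC,-0.5,10Y,10Y,['TEST'],ABC.1000145721";

# Split first, leaving the brackets and quotes on the fifth field.
my ($currency, $strike, $tenor, $tenor2, $ado_symbol) = split /,/, $line;

# Strip a leading [' and a trailing '] in one pass.
$ado_symbol =~ s/^\['|'\]$//g;

print "\$ado_symbol = $ado_symbol\n";
```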

Answer 3: (score: 0)

You can use a global regex match to find every substring that contains no commas, single quotes, or square brackets

like this

use strict;
use warnings 'all';

my $s = q{ABC,-0.5,10Y,10Y,['TEST'],ABC.1000145721ABC,-0.5,20Y,10Y,['TEST'],ABC.1000145722};

my @data = $s =~ /[^,'\[\]]+/g;

my ( $currency, $strike, $tenor, $tenor2, $ado_symbol ) = @data;

print "\$currency   = $currency\n";
print "\$strike     = $strike\n";
print "\$tenor      = $tenor\n";
print "\$tenor2     = $tenor2\n";
print "\$ado_symbol = $ado_symbol\n";

Output

$currency   = ABC
$strike     = -0.5
$tenor      = 10Y
$tenor2     = 10Y
$ado_symbol = TEST

Answer 4: (score: 0)

Another option

my $str = "ABC,-0.5,10Y,10Y,['TEST'],ABC.1000145721ABC,-0.5,20Y,10Y,['TEST'],ABC.1000145722";

my ($currency, $strike, $tenor, $tenor2,$ado_symbol) = map{ s/[^A-Z0-9\.-]//g; $_} split ',',$str;
print "$currency, $strike, $tenor, $tenor2, $ado_symbol",$/;

The output is:

ABC, -0.5, 10Y, 10Y, TEST
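
On Perl 5.14 or later, the same idea can be written with the non-destructive /r modifier, which returns the modified copy instead of changing $_ in place. This is a sketch of that variation, assuming a single record; like the original, the character class keeps only uppercase letters, digits, dots and hyphens, so it would also strip lowercase letters from real data.

```perl
use strict;
use warnings;
use 5.014;    # s///r needs Perl 5.14 or later

# One record only (an assumption for illustration)
my $str = "ABC,-0.5,10Y,10Y,['TEST'],ABC.1000145721";

# s///r returns the cleaned copy, so the map block needs no trailing $_.
my ($currency, $strike, $tenor, $tenor2, $ado_symbol) =
    map { s/[^A-Z0-9.-]//gr } split /,/, $str;

print "$currency, $strike, $tenor, $tenor2, $ado_symbol\n";
```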