My regular expression doesn't match and I can't figure out why

Date: 2014-10-09 19:10:20

Tags: regex

Here is a sample of the text, held in a scalar, that I am trying to match:

1 N [51]Gone Girl [52]Fox $37,513,109 - 3,014 - $12,446 $37,513,109 $61   1
2 N [53]Annabelle [54]WB (NL) $37,134,255 - 3,185 - $11,659 $37,134,255   $6.5 1
3 1 [55]The Equalizer [56]Sony $18,750,375 -45.1% 3,236 - $5,794   $64,236,992 $55 2
4 3 [57]The Boxtrolls [58]Focus $11,979,588 -30.7% 3,464 - $3,458   $32,093,796 $60 2
5 2 [59]The Maze Runner [60]Fox $11,634,764 -33.3% 3,605 -33 $3,227   $73,556,159 $34 3
6 N [61]Left Behind (2014) [62]Free $6,300,147 - 1,825 - $3,452   $6,300,147 $16 1 
7 4 [63]This is Where I Leave You [64]WB $4,009,345 -41.8% 2,735 -133   $1,466 $29,012,573 $19.8 3
8 5 [65]Dolphin Tale 2 [66]WB $3,422,377 -28.5% 2,790 -586 $1,227   $37,866,130 $36 4

This is the regular expression I am using, and it does not seem to match. Can anyone spot why?

if ($allData =~ /(\d+)\s+(\d+|[N])\s+(\[\d+\])(.+)\s+(\[\d+\])(.+)\s+(\$\.+)\s+(\-|\+\d+\.\d+%|\-\d+\.\d+%)\s+(\d+)\s+(\-\d+|\-|\+\d+)\s+(\$\.+)\s+(\$\.+)\s+(\.+)\s+(\d+)/g)
{

$current[$i] = $1;
$last[$i] = $2;
$title[$i] = $4;
$week[$i] = $7;
$cume[$i] = $12;

printf("%-4s%-4s%-35s%-10s%-10s", $current[$i], $last[$i], $title[$i], $week[$i], $cume[$i]);

if ($last[$i] ne '-'){
    $gain = $last[$i] - $current[$i];
}

if ($gain < $bigloss){
    $bigloss = $gain;
    $losstitle = $title[$i];
}

if ($gain > $biggain){
    $biggain = $gain;
    $gaintitle = $title[$i];
}

if ($last[$i] eq '-'){

    if ($current[$i] < $bigdebut){
        $bigdebut = $current[$i];
        $bigdebuttitle = $title[$i];
    }

    if ($current[$i] > $weakdebut){
        $weakdebut = $current[$i];
        $weakdebuttitle = $title[$i];
    }
}
$i++;
}

2 answers:

Answer 0 (score: 0)

Try this regular expression:

\d\s[A-Z0-9]\s\[\d\d\][A-Z][a-z]+(\s\b\w+\b){0,}\s(\(\d+\)\s)?\[\d\d\][A-Z]+[a-z]*\s(\(\w+\)\s)?\$(\d{1,3},){2}\d{3}\s-\s?\d+[,.]\d+((%\s\d,\d{1,3}\s-\s?\$?\d{1,3}(,\d{1,3}\s)?)|\s-\s\$\d{1,3},\d{1,3}\s)\s?\$\d{1,3},\d{1,3}(,\d{1,3})*\s\$\d{1,3}(,\d{1,3})*(\.\d+)?(\s\$\d+(\.)?\d+)?\s\d

See it here: http://regexr.com/39m54

Answer 1 (score: 0)

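This might be a fix. Compared with the original, each \.+ (which matches only literal periods) becomes .+?, each greedy .+ becomes the lazy .+?, and group 9 becomes [\d,]+ so that it accepts the comma in counts such as 3,014: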

 # /(\d+)\s+(\d+|[N])\s+(\[\d+\])(.+?)\s+(\[\d+\])(.+?)\s+(\$.+?)\s+(\-|\+\d+\.\d+%|\-\d+\.\d+%)\s+([\d,]+)\s+(\-\d+|\-|\+\d+)\s+(\$.+?)\s+(\$.+?)\s+(.+?)\s+(\d+)/g

 ( \d+ )                            # (1)
 \s+ 
 ( \d+ | [N] )                      # (2)
 \s+ 
 ( \[ \d+ \] )                      # (3)
 ( .+? )                            # (4)
 \s+ 
 ( \[ \d+ \] )                      # (5)
 ( .+? )                            # (6)
 \s+ 
 ( \$ .+? )                         # (7)
 \s+ 
 (                                  # (8 start)
      \-
   |  \+ \d+ \. \d+ %
   |  \- \d+ \. \d+ % 
 )                                  # (8 end)
 \s+ 
 ( [\d,]+ )                         # (9)
 \s+ 
 ( \- \d+ | \- | \+ \d+ )           # (10)
 \s+ 
 ( \$ .+? )                         # (11)
 \s+ 
 ( \$ .+? )                         # (12)
 \s+ 
 ( .+? )                            # (13)
 \s+ 
 ( \d+ )                            # (14)
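
To see why the original pattern could not match, it may help to test the changed pieces in isolation. The following is a minimal sketch, not part of the answer itself; the variable names ($gross, $theaters and the four result flags) are made up for illustration, and the two values are copied from the first sample row.

```perl
use strict;
use warnings;

# Values copied from the first sample row in the question.
my $gross    = '$37,513,109';
my $theaters = '3,014';

# \.+ means "one or more literal periods", so \$\.+ only matches
# strings such as '$...' and can never match a dollar amount.
my $escaped_dot_matches = $gross =~ /\A\$\.+\z/ ? 1 : 0;

# An unescaped dot matches any character except a newline.
my $plain_dot_matches = $gross =~ /\A\$.+?\z/ ? 1 : 0;

# \d+ stops at the thousands separator, so it cannot cover '3,014'.
my $digits_only_matches = $theaters =~ /\A\d+\z/ ? 1 : 0;

# A character class of digits and commas covers the whole count.
my $digits_commas_matches = $theaters =~ /\A[\d,]+\z/ ? 1 : 0;

print "escaped dot:       $escaped_dot_matches\n";
print "plain dot:         $plain_dot_matches\n";
print "digits only:       $digits_only_matches\n";
print "digits and commas: $digits_commas_matches\n";
```

Either of the two failing pieces is enough on its own to stop the whole pattern from matching a row.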

Sample output:

 **  Grp 0 -  ( pos 506 , len 98 ) 
7 4 [63]This is Where I Leave You [64]WB $4,009,345 -41.8% 2,735 -133   $1,466 $29,012,573 $19.8 3  
 **  Grp 1 -  ( pos 506 , len 1 ) 
7  
 **  Grp 2 -  ( pos 508 , len 1 ) 
4  
 **  Grp 3 -  ( pos 510 , len 4 ) 
[63]  
 **  Grp 4 -  ( pos 514 , len 25 ) 
This is Where I Leave You  
 **  Grp 5 -  ( pos 540 , len 4 ) 
[64]  
 **  Grp 6 -  ( pos 544 , len 2 ) 
WB  
 **  Grp 7 -  ( pos 547 , len 10 ) 
$4,009,345  
 **  Grp 8 -  ( pos 558 , len 6 ) 
-41.8%  
 **  Grp 9 -  ( pos 565 , len 5 ) 
2,735  
 **  Grp 10 -  ( pos 571 , len 4 ) 
-133  
 **  Grp 11 -  ( pos 578 , len 6 ) 
$1,466  
 **  Grp 12 -  ( pos 585 , len 11 ) 
$29,012,573  
 **  Grp 13 -  ( pos 597 , len 5 ) 
$19.8  
 **  Grp 14 -  ( pos 603 , len 1 ) 
3
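
For completeness, here is one way the corrected pattern might be wired back into the question's script. This is only a sketch under a few assumptions: $allData holds just two of the sample rows, the arrays are filled with push instead of the question's $i counter, and @gain is a hypothetical array added here. It also assumes that new entries should be detected by the N marker that appears in the sample data, since the second column is never '-' there.

```perl
use strict;
use warnings;

# Two rows copied from the question; the real $allData would hold the full text.
# The heredoc is single-quoted so that the dollar amounts are not interpolated.
my $allData = <<'END';
1 N [51]Gone Girl [52]Fox $37,513,109 - 3,014 - $12,446 $37,513,109 $61   1
7 4 [63]This is Where I Leave You [64]WB $4,009,345 -41.8% 2,735 -133   $1,466 $29,012,573 $19.8 3
END

# The pattern from the answer above, compiled once with /x so it can be spaced out.
my $row_re = qr/
    (\d+) \s+ (\d+|[N]) \s+
    (\[\d+\]) (.+?) \s+
    (\[\d+\]) (.+?) \s+
    (\$.+?) \s+
    (\-|\+\d+\.\d+%|\-\d+\.\d+%) \s+
    ([\d,]+) \s+
    (\-\d+|\-|\+\d+) \s+
    (\$.+?) \s+ (\$.+?) \s+ (.+?) \s+ (\d+)
/x;

my (@current, @last, @title, @week, @cume, @gain);

# In scalar context, m//g resumes where the previous match ended,
# so a while loop visits every row in the scalar.
while ($allData =~ /$row_re/g) {
    push @current, $1;
    push @last,    $2;
    push @title,   $4;
    push @week,    $7;
    push @cume,    $12;
    # Same formula as the question (last minus current); undef marks a new entry.
    push @gain, ($2 eq 'N' ? undef : $2 - $1);
}

for my $i (0 .. $#title) {
    printf "%-4s%-4s%-35s%-15s%-15s%s\n",
        $current[$i], $last[$i], $title[$i], $week[$i], $cume[$i],
        defined $gain[$i] ? $gain[$i] : 'new';
}
```

Testing for N rather than '-' avoids subtracting from a non-numeric string, which would otherwise raise a warning under "use warnings" and count as zero.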