Here is a sample of the text I'm trying to match; it is held in a single scalar:
1 N [51]Gone Girl [52]Fox $37,513,109 - 3,014 - $12,446 $37,513,109 $61 1
2 N [53]Annabelle [54]WB (NL) $37,134,255 - 3,185 - $11,659 $37,134,255 $6.5 1
3 1 [55]The Equalizer [56]Sony $18,750,375 -45.1% 3,236 - $5,794 $64,236,992 $55 2
4 3 [57]The Boxtrolls [58]Focus $11,979,588 -30.7% 3,464 - $3,458 $32,093,796 $60 2
5 2 [59]The Maze Runner [60]Fox $11,634,764 -33.3% 3,605 -33 $3,227 $73,556,159 $34 3
6 N [61]Left Behind (2014) [62]Free $6,300,147 - 1,825 - $3,452 $6,300,147 $16 1
7 4 [63]This is Where I Leave You [64]WB $4,009,345 -41.8% 2,735 -133 $1,466 $29,012,573 $19.8 3
8 5 [65]Dolphin Tale 2 [66]WB $3,422,377 -28.5% 2,790 -586 $1,227 $37,866,130 $36 4
This is the regex I'm using, and it doesn't seem to match. Can anyone spot why?
if ($allData =~ /(\d+)\s+(\d+|[N])\s+(\[\d+\])(.+)\s+(\[\d+\])(.+)\s+(\$\.+)\s+(\-|\+\d+\.\d+%|\-\d+\.\d+%)\s+(\d+)\s+(\-\d+|\-|\+\d+)\s+(\$\.+)\s+(\$\.+)\s+(\.+)\s+(\d+)/g)
{
    $current[$i] = $1;
    $last[$i] = $2;
    $title[$i] = $4;
    $week[$i] = $7;
    $cume[$i] = $12;
    printf("%-4s%-4s%-35s%-10s%-10s", $current[$i], $last[$i], $title[$i], $week[$i], $cume[$i]);
    if ($last[$i] ne '-'){
        $gain = $last[$i] - $current[$i];
    }
    if ($gain < $bigloss){
        $bigloss = $gain;
        $losstitle = $title[$i];
    }
    if ($gain > $biggain){
        $biggain = $gain;
        $gaintitle = $title[$i];
    }
    if ($last[$i] eq '-'){
        if ($current[$i] < $bigdebut){
            $bigdebut = $current[$i];
            $bigdebuttitle = $title[$i];
        }
        if ($current[$i] > $weakdebut){
            $weakdebut = $current[$i];
            $weakdebuttitle = $title[$i];
        }
    }
    $i++;
}
Answer 0 (score: 0)
Try this regex:
\d\s[A-Z0-9]\s\[\d\d\][A-Z][a-z]+(\s\b\w+\b){0,}\s(\(\d+\)\s)?\[\d\d\][A-Z]+[a-z]*\s(\(\w+\)\s)?\$(\d{1,3},){2}\d{3}\s-\s?\d+[,.]\d+((%\s\d,\d{1,3}\s-\s?\$?\d{1,3}(,\d{1,3}\s)?)|\s-\s\$\d{1,3},\d{1,3}\s)\s?\$\d{1,3},\d{1,3}(,\d{1,3})*\s\$\d{1,3}(,\d{1,3})*(\.\d+)?(\s\$\d+(\.)?\d+)?\s\d
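Answer 1 (score: 0)
A possible fix. In the original, `\.+` matches only literal dots, so `(\$\.+)` can never match an amount such as `$37,513,109`, and `(\d+)` cannot match a theater count that contains a comma, such as `3,014`. Using `.+?` and `[\d,]+` instead gives: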
# /(\d+)\s+(\d+|[N])\s+(\[\d+\])(.+?)\s+(\[\d+\])(.+?)\s+(\$.+?)\s+(\-|\+\d+\.\d+%|\-\d+\.\d+%)\s+([\d,]+)\s+(\-\d+|\-|\+\d+)\s+(\$.+?)\s+(\$.+?)\s+(.+?)\s+(\d+)/g
( \d+ ) # (1)
\s+
( \d+ | [N] ) # (2)
\s+
( \[ \d+ \] ) # (3)
( .+? ) # (4)
\s+
( \[ \d+ \] ) # (5)
( .+? ) # (6)
\s+
( \$ .+? ) # (7)
\s+
( # (8 start)
\-
| \+ \d+ \. \d+ %
| \- \d+ \. \d+ %
) # (8 end)
\s+
( [\d,]+ ) # (9)
\s+
( \- \d+ | \- | \+ \d+ ) # (10)
\s+
( \$ .+? ) # (11)
\s+
( \$ .+? ) # (12)
\s+
( .+? ) # (13)
\s+
( \d+ ) # (14)
Sample output:
** Grp 0 - ( pos 506 , len 98 )
7 4 [63]This is Where I Leave You [64]WB $4,009,345 -41.8% 2,735 -133 $1,466 $29,012,573 $19.8 3
** Grp 1 - ( pos 506 , len 1 )
7
** Grp 2 - ( pos 508 , len 1 )
4
** Grp 3 - ( pos 510 , len 4 )
[63]
** Grp 4 - ( pos 514 , len 25 )
This is Where I Leave You
** Grp 5 - ( pos 540 , len 4 )
[64]
** Grp 6 - ( pos 544 , len 2 )
WB
** Grp 7 - ( pos 547 , len 10 )
$4,009,345
** Grp 8 - ( pos 558 , len 6 )
-41.8%
** Grp 9 - ( pos 565 , len 5 )
2,735
** Grp 10 - ( pos 571 , len 4 )
-133
** Grp 11 - ( pos 578 , len 6 )
$1,466
** Grp 12 - ( pos 585 , len 11 )
$29,012,573
** Grp 13 - ( pos 597 , len 5 )
$19.8
** Grp 14 - ( pos 603 , len 1 )
3
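To pull out every row rather than only the first, the pattern can be run in a `while` loop with the `/g` flag. The following is a minimal sketch under some assumptions, not the answerer's exact regex: it uses `/x` so the pattern can be commented, tightens the dollar amounts to `\$[\d,]+`, and does not capture the bracketed link numbers, so its group numbers differ from the output above. The `@rows` array and its hash keys are hypothetical names chosen for illustration, and the two data rows are copied from the question.

```perl
use strict;
use warnings;

# Two rows copied from the sample data in the question.
my $allData = <<'END';
1 N [51]Gone Girl [52]Fox $37,513,109 - 3,014 - $12,446 $37,513,109 $61 1
3 1 [55]The Equalizer [56]Sony $18,750,375 -45.1% 3,236 - $5,794 $64,236,992 $55 2
END

my @rows;    # hypothetical container: one hash reference per matched row
while ($allData =~ m{
        (\d+) \s+                 # (1) current rank
        (\d+|N) \s+               # (2) rank last week, or N for a new entry
        \[\d+\] (.+?) \s+         # (3) title, after the bracketed link number
        \[\d+\] (.+?) \s+         # (4) studio
        (\$[\d,]+) \s+            # (5) weekend gross
        (-|[+-]\d+\.\d+%) \s+     # (6) percent change, or a bare dash
        ([\d,]+) \s+              # (7) theater count, commas allowed
        (-|[+-]\d+) \s+           # (8) theater change, or a bare dash
        (\$[\d,]+) \s+            # (9) average per theater
        (\$[\d,]+) \s+            # (10) cumulative gross
        (\S+) \s+                 # (11) budget
        (\d+)                     # (12) weeks in release
    }gx) {
    push @rows, {
        rank    => $1,
        last    => $2,
        title   => $3,
        weekend => $5,
        cume    => $10,
    };
}

# Print one aligned line per row.
printf "%-4s%-4s%-20s%-14s%-14s\n", @{$_}{qw(rank last title weekend cume)}
    for @rows;
```

Note that new entries are marked `N` in this data, not `-`, so any later check for a debut would need to compare the second field against `'N'`.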