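I am trying to extract specific fields from the text file below. So far I have only managed to select certain records from the file.
Input file: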
Record 0:
[record
InputData [record
RecType "001"
MyData [record
RefTable "001"
RefTableDesc "Metadata "]
MyAdd NULL
MyType NULL
MyRole NULL]]
Record 1:
[record
InputData [record
RecType "001"
MyData [record
RefTable "002"
RefTableDesc "Metadata "]
MyAdd NULL
MyType NULL
MyRole NULL]]
Record 2:
[record
InputData [record
RecType "002"
MyData NULL
MAdd [record
MY_ADD_CD "00 "
MY_ADD_SHORT_NM "MY Specific"
MY_ADD_NM "My Specific Addendum"
MY_ADD_TYPE_CD "01 "]
MyType NULL
MyRole NULL]]
Record 3:
[record
InputData [record
RecType "002"
MyData NULL
MAdd [record
MY_ADD_CD "001"
MY_ADD_SHORT_NM "MY Specific"
MY_ADD_NM "My Specific Addendum"
MY_ADD_TYPE_CD "01 "]
MyType NULL
MyRole NULL]]
This is my Perl script:
#!/usr/bin/perl
use strict;
use warnings;
my $fn = shift || 'dump.txt';
my $word1 = shift || 'RecType';
my $word2 = shift || 'RefTable';
my $word3 = shift || 'RefTableDesc';
my $word4 = shift || 'MY_ADD_CD';
my $word5 = shift || 'MY_ADD_SHORT_NM';
my $word6 = shift || 'MY_ADD_NM';
my $word7 = shift || 'MY_ADD_TYPE_CD';
my @output;
open my $fh, '<', $fn or die "Could not open file '$fn': $!";
while (<$fh>) {
    if ($. = /\b$word1\b/i) {
        push @output, split;
    }
    elsif ($. = /\b$word2\b/i ){
        push @output, split;
    }
    elsif ($. = /\b$word3\b/i ){
        push @output, split;
    }
    elsif ($. = /\b$word4\b/i) {
        push @output, split;
    }
    elsif ($. = /\b$word5\b/i ){
        push @output, split;
    }
    elsif ($. = /\b$word6\b/i ){
        push @output, split;
    }
    elsif ($. = /\b$word7\b/i ){
        push @output, split;
        print "@output\n";
        @output = ();
    }
}
close ($fh);
This is the output I get:
RecType "001" RefTable "001" RefTableDesc "Metadata " RecType "001" RefTable "002" RefTableDesc "Metadata " RecType "002" MY_ADD_CD "00 " MY_ADD_SHORT_NM "MY Specific" MY_ADD_NM "My Specific Addendum " MY_ADD_TYPE_CD "01 "
RecType "002" MY_ADD_CD "001" MY_ADD_SHORT_NM "MY Specific" MY_ADD_NM "My Specific Addendum " MY_ADD_TYPE_CD "01 "
Expected output:
"001" "001" "Metadata "
"001" "002" "Metadata "
"002" "00 " "MY Specific" "My Specific Addendum " "01 "
"002" "001" "MY Specific" "My Specific Addendum " "01 "
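Please suggest whether there is a way to achieve this.
Answer 0: (score: 1)
Here is a parser for the records that you can use to generate the output: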
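#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my $fn = shift || 'dump.txt';
open my $fh, '<', $fn or die "Could not open file '$fn': $!";
# Reads key/value lines into a hash until a line closes the current record.
# Nested "[record" values are parsed recursively.
sub read_record {
    my %record;
    my $end;
    while (<$fh>) {
        chomp;
        # $key: field name, $value: everything up to the first "]",
        # $end: the run of closing brackets at the end of the line.
        (my $key, my $value, $end) = /\s*(\w+)\s+([^\]]*)(\]*)\s*$/;
        next unless defined $key;    # skip lines that are not "key value"
        $end = length($end);         # number of records this line closes
        if ( $value && $value =~ /\[record/ ) {
            # The nested call also reports how many outer records it closed.
            ($record{$key}, $end) = read_record();
        } elsif ( $value =~ /"(.*?)\s*"/ ) {
            $record{$key} = $1;      # quoted value, trailing whitespace trimmed
        } elsif ( $value =~ /NULL/ ) {
            $record{$key} = undef;
        }
        last if $end;
    }
    # In list context, also return the number of brackets left for the caller.
    return wantarray ? (\%record, --$end) : \%record;
}
my @records;
while (<$fh>) {
    if ( /^Record (\d+):/ ) {
        <$fh>; # toss the [record line
        $records[$1] = read_record();
    }
}
close ($fh);
print Dumper \@records;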
Output:
$VAR1 = [
          {
            'InputData' => {
                             'MyAdd' => undef,
                             'MyType' => undef,
                             'MyRole' => undef,
                             'MyData' => {
                                           'RefTable' => '001',
                                           'RefTableDesc' => 'Metadata'
                                         },
                             'RecType' => '001'
                           }
          },
          {
            'InputData' => {
                             'MyData' => {
                                           'RefTable' => '002',
                                           'RefTableDesc' => 'Metadata'
                                         },
                             'RecType' => '001',
                             'MyAdd' => undef,
                             'MyType' => undef,
                             'MyRole' => undef
                           }
          },
          {
            'InputData' => {
                             'RecType' => '002',
                             'MyData' => undef,
                             'MyRole' => undef,
                             'MyType' => undef,
                             'MAdd' => {
                                         'MY_ADD_SHORT_NM' => 'MY Specific',
                                         'MY_ADD_TYPE_CD' => '01',
                                         'MY_ADD_CD' => '00',
                                         'MY_ADD_NM' => 'My Specific Addendum'
                                       }
                           }
          },
          {
            'InputData' => {
                             'MyData' => undef,
                             'RecType' => '002',
                             'MyRole' => undef,
                             'MyType' => undef,
                             'MAdd' => {
                                         'MY_ADD_NM' => 'My Specific Addendum',
                                         'MY_ADD_CD' => '001',
                                         'MY_ADD_TYPE_CD' => '01',
                                         'MY_ADD_SHORT_NM' => 'MY Specific'
                                       }
                           }
          }
        ];
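The parser above stops at the data structure. As a rough sketch of the remaining step, the following turns one parsed record into a line of quoted values. The name format_record and the field order in @field_order are assumptions made for illustration (a hash does not remember the order of the input), and because the parser trims trailing whitespace, values such as "Metadata " come out as "Metadata".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed output order; hashes do not preserve the order seen in the file.
my @field_order = qw(RecType RefTable RefTableDesc
                     MY_ADD_CD MY_ADD_SHORT_NM MY_ADD_NM MY_ADD_TYPE_CD);

# Hypothetical helper: build one output line from a parsed record.
sub format_record {
    my ($record) = @_;
    my $input = $record->{InputData};
    # Flatten top-level scalars and the fields of any nested sub-record.
    my %flat;
    for my $key (keys %$input) {
        my $value = $input->{$key};
        if (ref($value) eq 'HASH') {
            %flat = (%flat, %$value);
        } else {
            $flat{$key} = $value;
        }
    }
    # Keep only the wanted fields that have a value, and re-quote them.
    return join ' ', map { qq("$flat{$_}") }
                     grep { defined $flat{$_} } @field_order;
}

# Two records in the shape the parser produces.
my @records = (
    { InputData => { RecType => '001',
                     MyData  => { RefTable => '001', RefTableDesc => 'Metadata' },
                     MyAdd   => undef, MyType => undef, MyRole => undef } },
    { InputData => { RecType => '002', MyData => undef,
                     MAdd    => { MY_ADD_CD => '00', MY_ADD_SHORT_NM => 'MY Specific',
                                  MY_ADD_NM => 'My Specific Addendum',
                                  MY_ADD_TYPE_CD => '01' },
                     MyType  => undef, MyRole => undef } },
);
print format_record($_), "\n" for @records;
```

If the trailing spaces inside the quotes matter, the parser's `\s*` before the closing quote would have to go; that trade-off is left open here.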
However, if you only want the output and don't care about the records themselves, the problem is much simpler:
#!/usr/bin/perl
use strict;
use warnings;
my $fn = shift || 'dump.txt';
open my $fh, '<', $fn or die "Could not open file '$fn': $!";
while (<$fh>) {
    print "$1 " if /("[^"]*")/;
    print "\n" if /\]\]/;
}
close ($fh);
Output:
"001" "001" "Metadata "
"001" "002" "Metadata "
"002" "00 " "MY Specific" "My Specific Addendum" "01 "
"002" "001" "MY Specific" "My Specific Addendum" "01 "
Answer 1: (score: 0)
Oh, man. $. is the current line number of the file being read. Try this:
use strict;
use warnings;
use 5.016;
use Data::Dumper;
my $fname = shift || 'dump.txt';
open my $INFILE, '<', $fname
    or die "Could not open file '$fname': $!";
while (my $line = <$INFILE>) {
    say $.;
}
--output:--
1
2
3
...
...
45
46
From perlvar:
You can adjust the counter by assigning to $., but this will not actually move the seek pointer.
What does that mean, exactly? Let's try it:
use strict;
use warnings;
use 5.016;
use Data::Dumper;
my $fname = shift || 'dump.txt';
open my $INFILE, '<', $fname
    or die "Could not open file '$fname': $!";
while (my $line = <$INFILE>) {
    say $.;
    if ($. == 1) {
        $. = 10;
    }
}
--output:--
1
11
12
13
...
...
54
55
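So assigning to $. only changes the number that $. counts from; the file is still read line by line.
In your code, you have a series of if/elsif statements like this: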
if ($. = /\b$word1\b/i) {
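In scalar context, which is the context created when you assign something to a scalar variable (i.e. a variable whose name starts with the $ sigil), the match operator returns a false value (0 when used as a number) if there is no match, and 1 if there is a match.
So sometimes your if statement assigns 0 to $.: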
if ($. = 0) {
And sometimes your if statement assigns 1 to $.:
if ($. = 1) {
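That is all well and good, except that you never use the value of $. after assigning to it, so the assignment is useless. You just keep assigning new values to $. as the if/elsif branches execute.
Since your code does not depend on the value you assign to $., you should remove the assignment: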
if (/\b$word1\b/i)
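Next, an if condition is evaluated in boolean context, i.e. a true/false context, and boolean context is a kind of scalar context (you just have to remember that). So now you know: an if condition is a scalar context. As mentioned above, the match operator in scalar context returns 1 when there is a match and 0 when there is no match. As a result, the if statement: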
if (/\b$word1\b/i)
...is equivalent to:
if( 0 ) #when there is no match
...or:
if ( 1 ) #when there is a match
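Finally, in boolean context 0 is considered false and 1 is considered true. So when there is a match, the if/elsif block is executed; when there is no match, the block is skipped.
Why in the world did you assign the value to $.? Perl has lots of global variables; how did you pick $.? And I wonder why you didn't write: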
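To make the point about context concrete, here is a small sketch. The variable names $line, $hit and $miss are made up for illustration and are not part of your script.

```perl
use strict;
use warnings;

my $line = 'RecType "001"';

# Assigning to a scalar puts the match in scalar context.
my $hit  = $line =~ /\bRecType\b/i;    # matches: 1
my $miss = $line =~ /\bRefTable\b/i;   # no match: a false value, 0 as a number
printf "hit=%d miss=%d\n", $hit, $miss;

# An if condition tests the same true/false result directly,
# so no assignment is needed at all.
if ($line =~ /\bRecType\b/i) {
    print "matched RecType\n";
}
```

Running it prints "hit=1 miss=0" followed by "matched RecType".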
my $x;
if ($x = /\b$word1\b/i)
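Assigning to $x is just as useless as assigning to $., but at least you would not be messing with one of perl's global variables.
The next problem is that your code dumps all of the data into a single array, which means you cannot tell where the data for one match ends and the data for the next one begins.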
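One possible way around that, shown only as a sketch: collect the values for each record in their own array and store a reference to it when the record ends. The name extract_rows is hypothetical, the sample data is read from an in-memory filehandle so the example is self-contained, and treating a line containing "]]" as the end of a record is an assumption based on the dump shown in the question.

```perl
use strict;
use warnings;

# Field names taken from the question's script.
my @words = qw(RecType RefTable RefTableDesc
               MY_ADD_CD MY_ADD_SHORT_NM MY_ADD_NM MY_ADD_TYPE_CD);
my $pattern = join '|', map { quotemeta($_) } @words;

# Hypothetical helper: returns one array reference per record,
# each holding the quoted values of the wanted fields.
sub extract_rows {
    my ($fh) = @_;
    my (@rows, @current);
    while (my $line = <$fh>) {
        # Keep the quoted value when the line starts with a wanted field name.
        if ($line =~ /^\s*(?:$pattern)\s+("[^"]*")/) {
            push @current, $1;
        }
        # Assumption: "]]" closes a top-level record in this dump format.
        if ($line =~ /\]\]/) {
            push @rows, [@current] if @current;
            @current = ();
        }
    }
    return @rows;
}

my $sample = <<'END';
Record 0:
[record
InputData [record
RecType "001"
MyData [record
RefTable "001"
RefTableDesc "Metadata "]
MyAdd NULL
MyType NULL
MyRole NULL]]
Record 2:
[record
InputData [record
RecType "002"
MyData NULL
MAdd [record
MY_ADD_CD "00 "
MY_ADD_SHORT_NM "MY Specific"
MY_ADD_NM "My Specific Addendum"
MY_ADD_TYPE_CD "01 "]
MyType NULL
MyRole NULL]]
END

open my $fh, '<', \$sample or die "Could not open in-memory file: $!";
my @rows = extract_rows($fh);
close $fh;
print "@$_\n" for @rows;
```

Each element of @rows is now a separate list, so the boundary between records is never lost, and printing one row per line gives the layout asked for in the question.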