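How can I create the hash of hashes shown below from the contents of my file? When I open my input file, which has the contents shown further down, I want to grep some data out of the table and build a hash with the following pattern.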
{
'Golden' => {
'PT' => 80,
'po' => 43,
'DFF' => 139145,
'DLAT' => 276,
'Z' => 4,
'BBOX' => 833,
'Total' => 140387
},
'Foo Bar' => {
'PT' => 80,
'po' => 43,
'DFF' => 139145,
'DLAT' => 276,
'Z' => 4,
'BBOX' => 833,
'Total' => 140387
}
};
File contents
Sample 1
Mapped points: SYSTEM class
Mapped points PI PO BBOX Total
Golden 86 43 833 140387
Revised 86 43 833 140387
Sample 2
Mapped points: SYSTEM class
Mapped points PI PO DFF DLAT Z BBOX Total
Golden 86 43 139145 276 4 833 140387
Revised 86 43 139145 276 4 833 140387
And this is where I am stuck.
if ($line =~ m/^Mapped points: SYSTEM class$/){
$var = "golden_mapped-points_PI";
$hash{$var} = $6;
next unless $line = <$filehandle> and $line =~ /^Mapped points PI PO DFF DLAT Z BBOX Total $/;
}
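The number of columns can also vary between input files; for example, a file might contain only PI and PO. I also do not understand how to process the input file.
Answer 0 (score: 1)
You need to keep a copy of the headings line, because the headings are the keys of the inner hash entries. Once you start processing the data lines, you need to shift off the first item and use it as the key of the outer hash entry; it acts as a label for the row. Once that label has been stored away, your headings and your data "line up".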
#!/usr/bin/env perl
use v5.12 ;
use Data::Dumper ;
my $starting_re = 'Mapped points: SYSTEM class' ;
my $headings_re = 'Mapped \s+ points \s+' ;
my %outer_hash;
# Find the starting line
while (<>) { last if /$starting_re/ }
# Read in one line. Expect it to be the headings line
# Split it into pieces and store them for later
my @headings ;
my $headings_line = <> ;
if ($headings_line =~ / $headings_re (.*) /x) {
@headings = split ' ', $1 ;
}
else {
say "Don't understand the headings line.";
exit 1;
}
# Process each remaining line
while (<>) {
# Split the data line and shift off
# the first piece to use as a line label
my @data = split ' ', $_;
my $label = shift @data ;
# Initialize a new inner hash and an index variable
my %inner_hash = ();
my $index = 0;
# Iterate over the data pieces building up the hash
for (@data) {
$inner_hash{ $headings[$index] } = $_ ;
$index++;
}
# Alternatively, build it with a hash slice
# @inner_hash{ @headings } = @data ;
# Create an entry in the outer hash by
# taking a reference to our inner hash
$outer_hash{ $label } = \%inner_hash;
}
say Dumper( \%outer_hash );
exit 0;
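The question also mentions that the set of columns can change from one table to the next. A minimal sketch of the same idea that handles that case follows. It is not the code above but a variant under stated assumptions: the helper name `parse_mapped_points` is hypothetical, every table is assumed to start with a `Mapped points` headings line, row labels are assumed to contain no spaces, and rows with the same label in a later table are merged into the existing inner hash.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: parse every "Mapped points" table readable from $fh
# and return a reference to a hash of hashes keyed by row label.
sub parse_mapped_points {
    my ($fh) = @_;
    my (%outer, @headings);
    while (my $line = <$fh>) {
        next unless $line =~ /\S/;                        # skip blank lines
        next if $line =~ /^Mapped points: SYSTEM class/;  # table title line
        if ($line =~ /^Mapped \s+ points \s+ (.*)/x) {    # headings line
            @headings = split ' ', $1;                    # new column names
            next;
        }
        next unless @headings;                 # nothing to map onto yet
        my ($label, @data) = split ' ', $line;
        next unless @data == @headings;        # not a data row of this table
        @{ $outer{$label} }{@headings} = @data;  # hash slice, merges columns
    }
    return \%outer;
}

# Sample input taken from the two samples in the question.
my $text = <<'END';
Mapped points: SYSTEM class
Mapped points PI PO BBOX Total
Golden 86 43 833 140387
Revised 86 43 833 140387
Mapped points: SYSTEM class
Mapped points PI PO DFF DLAT Z BBOX Total
Golden 86 43 139145 276 4 833 140387
Revised 86 43 139145 276 4 833 140387
END

# Read from an in-memory filehandle so the sketch is self-contained.
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $tables = parse_mapped_points($fh);
close $fh;

$Data::Dumper::Sortkeys = 1;   # stable key order in the dump
print Dumper($tables);
```

Because the headings are re-read for every table, a file that only has PI and PO columns needs no code change. If tables from different sections must be kept apart, the outer key would need to include something that identifies the section.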
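Answer 1 (score: 0)
I came up with a solution that uses unpack. It handles any number of columns across different files. Note that it relies on the columns being aligned at fixed widths under their headings, because the field lengths are taken from the headings line.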
#!/usr/bin/perl
use strict;
use warnings;
my %data;
my ($len, @hdrs);
my (@lengths, $template);
while (<DATA>) {
next if 1 .. /^Mapped points:/;
s/\s+$//; # safe 'chomp' (if there are spaces at end of string)
if (s/^(Mapped points\s+)//) {
$len = length $1;
@hdrs = split;
# lengths of all but the last (Total) field
@lengths = map {length} /(\S+\s+)/g;
# add 'A*' at the end to capture the 'Total' field
$template = join("", "A$len", map {'A'. $_} @lengths) . "A*";
}
else {
my ($key, @rest) = unpack $template;
my %temp;
@temp{ @hdrs } = @rest;
$data{$key} = \%temp;
}
}
use Data::Dumper; print Dumper \%data;
__DATA__
Mapped points: SYSTEM class
Mapped points PI PO DFF DLAT Z BBOX Total
Golden 86 43 139145 276 4 833 140387
Revised 86 43 139145 276 4 833 140387
The output is:
$VAR1 = {
'Golden' => {
'Total' => '140387',
'BBOX' => '833',
'Z' => '4',
'PO' => '43',
'DLAT' => '276',
'DFF' => '139145',
'PI' => '86'
},
'Revised' => {
'Total' => '140387',
'BBOX' => '833',
'Z' => '4',
'PO' => '43',
'DLAT' => '276',
'DFF' => '139145',
'PI' => '86'
}
};
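To make the template construction easier to follow, here is a small sketch that builds the unpack template for a single headings line and applies it to one data row. The fixed-width sample lines are an assumption: they are generated with `sprintf` using made-up column widths, since the exact spacing of the original file is not known.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed fixed-width layout, built with sprintf so the widths are explicit.
my $fmt    = '%-16s%-4s%-4s%-8s%s';
my $header = sprintf $fmt, 'Mapped points', qw(PI PO DFF Total);
my $row    = sprintf $fmt, qw(Golden 86 43 139145 140387);

# Separate the "Mapped points" label column from the column headings.
my ($prefix, $rest) = $header =~ /^(Mapped points\s+)(.*)$/
    or die "Not a headings line\n";

my @hdrs = split ' ', $rest;

# Width of every heading that is followed by whitespace
# (that is, all but the last one).
my @lengths = map { length } $rest =~ /(\S+\s+)/g;

# One 'A<width>' per fixed-width field, then 'A*' for the last field.
my @fields   = map { 'A' . $_ } @lengths;
my $template = join '', 'A' . length($prefix), @fields, 'A*';

# 'A' strips trailing spaces, so the values come back trimmed.
my ($key, @values) = unpack $template, $row;

my %record;
@record{@hdrs} = @values;

print "template: $template\n";
print "$key: $_ => $record{$_}\n" for @hdrs;
```

If the columns in a real file are separated by variable whitespace rather than padded to a fixed width, `split ' '` on each row is the safer choice.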
Answer 2 (score: -1)
I thought it over and found a solution! I know it is not the best one yet, so I would appreciate it if you could suggest a better one.
cat > test.pl
#!/usr/bin/perl
use strict;
use warnings;
while(<DATA>){
my $line = $_;
if ($line =~ m/^Mapped points: SYSTEM class$/){
next unless $line = <DATA> and $line =~ /^Mapped points/;
my @arr = split(' ',$line);
next unless $line = <DATA>;
my @golden_arr = split(' ',$line);
my $size1 = @golden_arr;
for(my $i = 1;$i < $size1 ;$i++){
my $var = "Golden_".$arr[$i+1];
print "$var, $golden_arr[$i]\n";
}
next unless $line = <DATA>;
my @revised_arr = split(' ',$line);
my $size2 = @revised_arr;
for(my $i = 1;$i < $size2 ;$i++){
my $var = "Revised_".$arr[$i+1];
print "$var, $revised_arr[$i]\n";
}
}
}
__DATA__
Mapped points: SYSTEM class
Mapped points PI PO DFF DLAT Z BBOX Total
Golden 86 43 139145 276 4 833 140387
Revised 86 43 139145 276 4 833 140387
perl test.pl
Golden_PI, 86
Golden_PO, 43
Golden_DFF, 139145
Golden_DLAT, 276
Golden_Z, 4
Golden_BBOX, 833
Golden_Total, 140387
Revised_PI, 86
Revised_PO, 43
Revised_DFF, 139145
Revised_DLAT, 276
Revised_Z, 4
Revised_BBOX, 833
Revised_Total, 140387
Is there a better solution?