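We have text files that contain data in both plain and tabular form. I can read the plain data, but I can't read the tabular data.
Can anyone help me parse and extract the tabular data?
Text file data: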
225 Top Hitters
RT(ms)   BRT(ms)   TL(ms)   l_mig_a   l_mig_w   b_mig_a   b_mig_w   l_b_mig_a   l_b_mig_w   b_l_mig_a   b_l_mig_w
-------- --------- -------- --------- --------- --------- --------- ----------- ----------- ----------- -----------
11078.9  141.3     3754.8   418       7325      0         0         0           4           0           4
Total active inter-cluster migrations: 0
Total wakeup inter-cluster migrations: 8
Total active migrations: 418
Total wakeup migrations: 7333
My code:
use strict;
use warnings;
my ($RT,$BRT,$TL ,$l_mig_a,$l_mig_w,$b_mig_a,$b_mig_w,$l_b_mig_a,$l_b_mig_w,$b_l_mig_a,$b_l_mig_w);
open (FH, "<", "file.txt") or die "could not open: $!";
my @lines = <FH>;
close FH;
foreach my $line (@lines) {
    print "$line \n";
}
Expected output:
$RT = 11078.9
$BRT = 141.3
$TL = 3754.8
$l_mig_a = 418
$l_mig_w = 7325
$b_mig_a = 0
$b_mig_w = 0
$l_b_mig_a = 0
$l_b_mig_w = 4
$b_l_mig_a = 0
$b_l_mig_w = 4
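Answer 0: (score: 0)
In your expected output, you prefix each header name with $. I hope your intent isn't to eval the result and use the values programmatically, because there are better ways to do that (e.g., a hash). If that is your plan, you're also missing the semicolons at the ends of the lines.
Since I can't infer your use case from your question, I've decided to dump the keys and values as-is; feel free to add whatever decoration you like.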
use strict;
use warnings;

my @keys;
my @values;

while (<DATA>) {
    if ($. == 2) {
        # Line 2 holds the column headers
        @keys = split;
        for (@keys) {
            s/\W.+$//;    # strip the "(ms)" suffix
        }
    } elsif ($. == 4) {
        # Line 4 holds the values; nothing after it is needed
        @values = split;
        last;
    }
}

for my $i (0 .. $#keys) {
    print "$keys[$i] = $values[$i]\n";
}
__DATA__
225 Top Hitters
RT(ms) BRT(ms) TL(ms) l_mig_a l_mig_w b_mig_a b_mig_w l_b_mig_a l_b_mig_w b_l_mig_a b_l_mig_w
-------- --------- -------- --------- --------- --------- --------- ----------- ----------- ----------- -----------
11078.9 141.3 3754.8 418 7325 0 0 0 4 0 4
Total active inter-cluster migrations: 0
Total wakeup inter-cluster migrations: 8
Total active migrations: 418
Total wakeup migrations: 7333
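If your input file really is only 10 lines long (i.e., you haven't neglected to tell us about an additional 5 million data rows), you can simplify reading and splitting it to a few lines of code (the /r modifier requires Perl 5.14 or later):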
my @lines = <DATA>;
my @keys = map { s/\W.+$//r } split(' ', $lines[1]);
my @values = split(' ', $lines[3]);
Output:
RT = 11078.9
BRT = 141.3
TL = 3754.8
l_mig_a = 418
l_mig_w = 7325
b_mig_a = 0
b_mig_w = 0
l_b_mig_a = 0
l_b_mig_w = 4
b_l_mig_a = 0
b_l_mig_w = 4
To collect the values for later use in your program while keeping the association between headers and values, create a hash:
my %hash;
@hash{@keys} = @values;
The hash will have the following structure:
{
    b_l_mig_a => 0,
    b_l_mig_w => 4,
    b_mig_a   => 0,
    b_mig_w   => 0,
    BRT       => 141.3,
    l_b_mig_a => 0,
    l_b_mig_w => 4,
    l_mig_a   => 418,
    l_mig_w   => 7325,
    RT        => 11078.9,
    TL        => 3754.8,
}
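As a small sketch of why the hash is convenient, the snippet below builds it from hard-coded copies of the header and value lines (so that it runs on its own) and then looks up individual columns by name. The names $header, $row, and $total_wakeup are only illustrative and are not part of the code above.

```perl
use strict;
use warnings;

# Hard-coded copies of the header and value lines from the sample data
my $header = 'RT(ms) BRT(ms) TL(ms) l_mig_a l_mig_w b_mig_a b_mig_w l_b_mig_a l_b_mig_w b_l_mig_a b_l_mig_w';
my $row    = '11078.9 141.3 3754.8 418 7325 0 0 0 4 0 4';

# Strip the "(ms)" suffix from each header (written without /r so it
# also works on Perls older than 5.14)
my @keys   = map { (my $k = $_) =~ s/\W.+$//; $k } split ' ', $header;
my @values = split ' ', $row;

# A hash slice pairs each key with the value in the same position
my %hash;
@hash{@keys} = @values;

# Look up individual columns by name
print "RT = $hash{RT}\n";
print "l_mig_w = $hash{l_mig_w}\n";

# The values are ordinary scalars, so arithmetic works as usual
my $total_wakeup = $hash{l_mig_w} + $hash{b_mig_w}
                 + $hash{l_b_mig_w} + $hash{b_l_mig_w};
print "Sum of *_mig_w columns: $total_wakeup\n";
```

For the sample data the sum comes to 7333, which agrees with the "Total wakeup migrations" line in the file.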
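Answer 1: (score: 0)
Here is an alternative to Matt's strategy. It searches the file for the first line that contains one or more hyphens (-), possibly some whitespace, and nothing else. The column labels are then on the preceding line, and the values are on the following line.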
use strict;
use warnings 'all';

use List::Util 'max';

use constant DATA_FILE => 'tabular_data.txt';

# Read the whole file into an array
my @file = do {
    open my $fh, '<', DATA_FILE or die $!;
    <$fh>;
};
chomp @file;

# Find the first line that contains only one or more hyphens
# and possibly some whitespace
my $i = 0;
for ( @file ) {
    last if /\-/ and not /[^-\s]/;
    ++$i;
}
die "Header line not found" unless $i < @file;

# Build the key array from the preceding line, and the
# values array from the succeeding line
my @keys = split ' ', $file[$i-1];
s/\(.*// for @keys;
my @values = split ' ', $file[$i+1];

my %data;
@data{@keys} = @values;

# Display what we've recovered
my $w = max map length, @keys;
for my $key ( @keys ) {
    printf "%-*s => %s\n", $w, $key, $data{$key};
}
Output:
RT        => 11078.9
BRT       => 141.3
TL        => 3754.8
l_mig_a   => 418
l_mig_w   => 7325
b_mig_a   => 0
b_mig_w   => 0
l_b_mig_a => 0
l_b_mig_w => 4
b_l_mig_a => 0
b_l_mig_w => 4
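If the same lookup is needed in more than one place, the separator-line strategy can be wrapped in a subroutine. The following is only a sketch: parse_table is a hypothetical helper name, the sample lines are a shortened copy of the data, and the extra checks (a separator must have a line on either side, and the column counts must agree) are assumptions about what counts as a valid table.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash reference mapping column labels to
# values, or undef if no usable separator line is found.
sub parse_table {
    my @lines = @_;

    # Start at 1 and stop one short of the end so that both the
    # preceding and the following line always exist
    for my $i ( 1 .. $#lines - 1 ) {
        # A separator has at least one hyphen and nothing but
        # hyphens and whitespace
        next unless $lines[$i] =~ /-/ && $lines[$i] !~ /[^-\s]/;

        my @keys = split ' ', $lines[$i-1];
        s/\(.*// for @keys;    # drop unit suffixes such as "(ms)"
        my @values = split ' ', $lines[$i+1];

        # Refuse to pair up columns if the counts disagree
        return unless @keys == @values;

        my %data;
        @data{@keys} = @values;
        return \%data;
    }
    return;
}

my @sample = (
    '225 Top Hitters',
    'RT(ms) BRT(ms) TL(ms) l_mig_a',
    '-------- --------- -------- ---------',
    '11078.9 141.3 3754.8 418',
    'Total active migrations: 418',
);

my $data = parse_table(@sample);
print "TL = $data->{TL}\n" if $data;
```

Returning undef instead of dying leaves it to the caller to decide how to handle a file that contains no table.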
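Answer 2: (score: -2)
You can "slurp" the entire file into a single string variable and use a regular expression to parse the tabular data. Below is a sample script with a subroutine that simplifies generating the regular expression.
The sample implementation bundles the code and the test data together in a single script file.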
use strict;
use warnings;

my $text;
{
    # Slurp all lines into a single string
    local $/ = undef;
    $text = <DATA>;
}

my $regex = make_regex(qw{RT(ms) BRT(ms) TL(ms) l_mig_a l_mig_w b_mig_a b_mig_w l_b_mig_a l_b_mig_w b_l_mig_a b_l_mig_w});
print "REGEX-START\n$regex\nREGEX-END\n"; # Debugging: show the generated regular expression

my ($RT,$BRT,$TL,$l_mig_a,$l_mig_w,$b_mig_a,$b_mig_w,$l_b_mig_a,$l_b_mig_w,$b_l_mig_a,$b_l_mig_w)
    = $text =~ /$regex/ or die "Table not found";
print "b_l_mig_w = $b_l_mig_w\n";

# Build a regex that matches the header line, the dashed line, and the
# value line. The lines inside the string must stay unindented: the
# newlines and any leading spaces are literal parts of the pattern.
sub make_regex {
    my $n = scalar(@_);
    my $str = '
\s*' . join('\s+',map {quotemeta($_)} @_) . '\s*
\s*' . join('\s+',('-+') x $n) . '\s*
\s*' . join('\s+',('(\S+)') x $n) . '\s*
';
    qr{$str}m;
} # end sub make_regex
__DATA__
225 Top Hitters
RT(ms) BRT(ms) TL(ms) l_mig_a l_mig_w b_mig_a b_mig_w l_b_mig_a l_b_mig_w b_l_mig_a b_l_mig_w
-------- --------- -------- --------- --------- --------- --------- ----------- ----------- ----------- -----------
11078.9 141.3 3754.8 418 7325 0 0 0 4 0 4
Total active inter-cluster migrations: 0
Total wakeup inter-cluster migrations: 8
Total active migrations: 418
Total wakeup migrations: 7333
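The slurp-and-regex idea can also be combined with a hash so that the column names do not have to be listed in the code. The following is a rough sketch under a few assumptions: the heredoc stands in for the slurped file and is a shortened copy of the sample, and the pattern treats any line made up only of hyphens, spaces, and tabs as the separator.

```perl
use strict;
use warnings;

# Stand-in for the slurped file contents (a shortened copy of the sample)
my $text = <<'END';
225 Top Hitters
RT(ms) BRT(ms) TL(ms) l_mig_a
-------- --------- -------- ---------
11078.9 141.3 3754.8 418
Total active migrations: 418
END

# Capture the line before and the line after a row that consists only
# of hyphens, spaces, and tabs
my ($header, $row) = $text =~ /^([^\n]+)\n[ \t]*-[- \t]*\n([^\n]+)/m
    or die "Table not found";

my @keys = split ' ', $header;
s/\(.*// for @keys;              # drop unit suffixes such as "(ms)"
my @values = split ' ', $row;

# Pair each label with the value in the same column
my %data;
@data{@keys} = @values;
print "$_ = $data{$_}\n" for @keys;
```

Because the labels are read from the file, the same code keeps working if columns are added or renamed, at the cost of no longer checking that the expected headers are present.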