Parsing a text file and storing delimited fields in a hash (Perl)

Asked: 2018-07-20 21:17:07

Tags: perl file parsing hash hashmap

I'm new to Perl, so I apologize if I'm missing something very simple here.

My file is formatted like this:

TGName: name1             
----------------------------------------------------------------------------
 setting1                                          value1 
 setting2                                          value2
 setting3                                          value3   
 setting4                                          value4
 setting5                                          value5

 ...
 ...

TGName: name47             
----------------------------------------------------------------------------
 setting1                                          value1 
 setting2                                          value2
 setting3                                          value3

 ...
----------------------------------------------------------------------------
SGName: name1             
----------------------------------------------------------------------------
 ...

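It needs to be compared against a similar text file whose blocks appear in a different order. My idea is to store each "block" of the text file in a hash, so the content above would become something like: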

my %TGName_name1 = (
    setting1 => 'value1',
    setting2 => 'value2',
    setting3 => 'value3',
    setting4 => 'value4',
);

And so on. I could then compare the hashes that share a name across the two files.

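The problem I'm facing now is reading each block that starts with a line beginning with TGName, SGName, etc. into a hash, with the settings and values as key/value pairs.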

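The edit to this question is the closest thing I found while searching, but unfortunately nobody answered after the original question was edited.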

Any insight would be greatly appreciated!

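Edit: here is some code from a similar (and simpler) project, where every line is unique rather than divided into groups. The output lists the lines common to both files, the lines found only in the first file, and the lines found only in the second file: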

use strict;
use warnings;
use List::Compare;

# name of the log file to write
my $log = 'log.txt';

# open file1.txt for reading; $f1 is a filehandle
open (my $f1, "<", "file1.txt") or die $!;

# open file2.txt for reading; $f2 is a filehandle
open (my $f2, "<", "file2.txt") or die $!;

# initialize array and populate with the contents of $f1
my @content_f1=<$f1>;

# initialize array and populate with the contents of $f2
my @content_f2=<$f2>;

# create the List::Compare object
my $lc = List::Compare->new(\@content_f1, \@content_f2);    

# initialize array showing commonalities of file 1 and file 2  
# and populate with the contents of get_intersection() 
my @intersection = $lc->get_intersection;

# initialize array showing elements unique to new config  
# and populate with the contents of get_unique() 
my @firstonly = $lc->get_unique;

# initialize array showing elements unique to golden config
# and populate with the contents of get_complement() 
my @secondonly = $lc->get_complement;

# open the log file for writing; $out is a filehandle
open(my $out, '>', $log) or die "Cannot open file '$log' for writing: $!";

# write the contents of the intersection and unique arrays to log.txt
print $out "Common Items:\n"."@intersection"."\n";
print $out "Items Only in file 1:\n"."@firstonly"."\n";
print $out "Items Only in file 2:\n"."@secondonly"."\n";

close $out;
close $f1;
close $f2;

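Ideally, I'd like the same kind of output here, except comparing %file1_hash_name1 with %file2_hash_name1 rather than one text file with another (that is: items common to both hashes, items found only in the first hash, and items found only in the second hash).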

1 Answer:

Answer 0: (score: 0)

In this example, I'm using the eq_deeply function from Test::Deep to test whether the blocks with the same name are equal in the two files.

If you also want to find what is in one file but not in the other, you can use the List::Compare functions.

#!/usr/bin/perl
use strict;
use warnings;
use Test::Deep::NoTest 'eq_deeply';
use List::Compare;

open my $fh, '<', \<<EOF;
TGName: name1
----------------------------------------------------------------------------
 setting1                                          value1 
 setting2                                          value2
 setting3                                          value3   
 setting4                                          value4
 setting5                                          value5


TGName: name47
----------------------------------------------------------------------------
 setting1                                          value1 
 setting2                                          value2
 setting3                                          value3

----------------------------------------------------------------------------
SGName: name1
EOF

my %data1;
my $key1;

while (<$fh>) {
    if (/^([A-Z]{2}Name: name\d+)/) {
        $key1 = $1;
    }
    elsif (/^[-\s]+$/) {
        next;
    }
    else {
        my ($setting, $val) = split;    
        $data1{$key1}{$setting} = $val;
    }
}

open my $fh2, '<', \<<EOF;
TGName: name1
----------------------------------------------------------------------------
 setting1                                          value1 
 setting2                                          value2
 setting3                                          value3   
 setting4                                          value4
 setting55                                          value5


TGName: name47
----------------------------------------------------------------------------
 setting1                                          value1 
 setting2                                          value2
 setting3                                          value3

----------------------------------------------------------------------------
SGName: name11
EOF

my %data2;
my $key2;

while (<$fh2>) {
    if (/^([A-Z]{2}Name: name\d+)/) {
        $key2 = $1;
    }
    elsif (/^(?:-+|\s+)$/) {
        next;
    }
    else {
        my ($setting, $val) = split;    
        $data2{$key2}{$setting} = $val;
    }
}

my $lc = List::Compare->new([keys %data1], [keys %data2]);
my @intersection = $lc->get_intersection;

for my $key (@intersection) {
    if (eq_deeply($data1{$key}, $data2{$key})) {
        print "key $key has the same hash for both files\n";    
    }
    else {
        print "key $key has different hashes\n";    
    }
}

The output is:

key TGName: name1 has different hashes
key TGName: name47 has the same hash for both files

I haven't used the eq_deeply function myself, but I believe it will give you the result you need.
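The question also asks which settings are shared by two same-named blocks and which appear in only one of them. The following is a minimal sketch of one way to get that, not a tested drop-in solution. The helper names parse_blocks and compare_keys are hypothetical, the sample data is shortened, and plain grep with exists is used in place of List::Compare so that the sketch needs only core Perl. It also assumes that lines without both a setting and a value (such as the " ..." filler) should be skipped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read blocks from a filehandle into a hash of hashes,
# keyed by header line (e.g. "TGName: name1"), then by setting name.
sub parse_blocks {
    my ($fh) = @_;
    my (%data, $key);
    while (my $line = <$fh>) {
        if ($line =~ /^([A-Z]{2}Name: name\d+)/) {
            $key = $1;                       # start of a new block
        }
        elsif ($line =~ /^[-\s]+$/) {
            next;                            # separator or blank line
        }
        else {
            my ($setting, $val) = split ' ', $line;
            # Assumption: ignore lines before any header or without a value
            next unless defined $key && defined $val;
            $data{$key}{$setting} = $val;
        }
    }
    return \%data;
}

# Hypothetical helper: split the keys of two hash refs into three sorted lists.
sub compare_keys {
    my ($h1, $h2) = @_;
    my @common      = grep {  exists $h2->{$_} } sort keys %$h1;
    my @first_only  = grep { !exists $h2->{$_} } sort keys %$h1;
    my @second_only = grep { !exists $h1->{$_} } sort keys %$h2;
    return { common => \@common, first_only => \@first_only,
             second_only => \@second_only };
}

open my $fh1, '<', \<<'EOF' or die "Cannot open in-memory file: $!";
TGName: name1
-------------
 setting1    value1
 setting2    value2
 setting5    value5
EOF

open my $fh2, '<', \<<'EOF' or die "Cannot open in-memory file: $!";
TGName: name1
-------------
 setting2    valueX
 setting1    value1
 setting55   value5
EOF

my $blocks1 = parse_blocks($fh1);
my $blocks2 = parse_blocks($fh2);

# Only blocks present in both files are compared setting by setting.
for my $name (grep { exists $blocks2->{$_} } sort keys %$blocks1) {
    my ($s1, $s2) = ($blocks1->{$name}, $blocks2->{$name});
    my $cmp     = compare_keys($s1, $s2);
    my @changed = grep { $s1->{$_} ne $s2->{$_} } @{ $cmp->{common} };
    print "$name\n";
    print "  common settings: @{ $cmp->{common} }\n";
    print "  only in file 1:  @{ $cmp->{first_only} }\n";
    print "  only in file 2:  @{ $cmp->{second_only} }\n";
    print "  values differ:   @changed\n";
}
```

Calling compare_keys($blocks1, $blocks2) on the top-level hashes would report, in the same way, which blocks exist in only one of the files. To read real files, pass filehandles opened on file1.txt and file2.txt instead of the in-memory ones.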