How can I compare two text files and output any strings that are missing from or extra in each file?

Asked: 2013-11-29 07:54:57

Tags: perl compare

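I have these two sample files whose contents I want to compare. I need to compare the two files and output every string that is missing from, or extra in, either file.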

Ref.txt:

bjkdsl
dookn
cmshdksldj

New.txt:

cmshdksldj
unklskdjs
dookn

Output:

unklskdjs :missing string in Ref.txt    
bjkdsl :missing string in New.txt

Update: sample text files 1

Ref.txt:

bjkdsl
dookn
cmshdksldj

New.txt:

cmshdksldj
unklskdjs
dookn
bjkdsl

Output:

unklskdjs : missing string in Ref.txt

Sample files 2:

Ref.txt:

cmshdksldj
unklskdjs
dookn
bjkdsl

New.txt:

cmshdksldj
unklskdjs
dookn
bjkdsl

Output:

Ref.txt is same as New.txt

Thanks for all the help, but I am still trying to get code that handles every case that can occur.

2 Answers:

Answer 0: (score: 2)

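Load the lines of one file into a hash as keys. While reading the other file, delete each key that you find there. At the end, iterate over the hash and print every key you did not delete: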

#!/usr/bin/env perl

use warnings;
use strict;

my %exclude;

open my $fh, '<', 'text2.txt' or die $!;
while (<$fh>) {
        chomp;
        $exclude{$_}++;
}

open $fh, '<', 'text1.txt' or die $!;
while (<$fh>) {
        chomp;
        if ( exists $exclude{ $_ } ) {
                delete $exclude{ $_ };
        }
        else {
                print "$_ is missing from text2\n";
        }
}

for ( keys %exclude ) {
        print "$_ is missing from text1\n";
}

Run it like this:

perl script.pl

It yields:

bjkdsl is missing from text2
unklskdjs is missing from text1
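
The question also asks for a message when both files match, and a line that appears twice in text1.txt would be reported as missing by the loop above, because its key is deleted on the first hit. A minimal sketch that covers both cases, assuming the file names Ref.txt and New.txt from the question and treating each file as a set of lines (order and repeats are ignored). The helpers `compare_lines` and `read_lines` are hypothetical names introduced here, not part of the answer:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: takes two array refs of lines and returns the
# report lines in the format used in the question.
sub compare_lines {
    my ($ref_lines, $new_lines) = @_;

    # Record each line as a hash key; a repeated line only bumps its
    # count, so duplicates are never reported as missing.
    my (%in_ref, %in_new);
    $in_ref{$_}++ for @$ref_lines;
    $in_new{$_}++ for @$new_lines;

    my @report;
    # Lines found only in New.txt are missing from Ref.txt, and vice versa.
    push @report, "$_ : missing string in Ref.txt"
        for grep { !exists $in_ref{$_} } sort keys %in_new;
    push @report, "$_ : missing string in New.txt"
        for grep { !exists $in_new{$_} } sort keys %in_ref;

    # No differences at all: say that the files match.
    push @report, "Ref.txt is same as New.txt" unless @report;
    return @report;
}

# Read a file into a list of chomped lines.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    chomp(my @lines = <$fh>);
    close $fh;
    return @lines;
}

# Only touch the file system when run as a script and both files exist.
if (!caller && -e 'Ref.txt' && -e 'New.txt') {
    my @ref = read_lines('Ref.txt');
    my @new = read_lines('New.txt');
    print "$_\n" for compare_lines(\@ref, \@new);
}
```

The report is sorted alphabetically within each group, so the original line order of the files is not preserved.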

Answer 1: (score: 2)

use strict;
use warnings;

open my $fh, '<', 'text1.txt' or die $!;
chomp(my @arr1 = <$fh>);

open my $fh2, '<', 'text2.txt' or die $!;
chomp(my @arr2 = <$fh2>);

my (%m1, %m2);
# populate the %m1 hash with keys from the @arr1 array using a hash slice
@m1{@arr1} = ();
# likewise for %m2 and @arr2
@m2{@arr2} = ();

# remove from %m1 the keys that are also found in @arr2,
# leaving only those unique to @arr1
delete @m1{@arr2};
# likewise for %m2 and @arr1
delete @m2{@arr1};

# print only the keys left in %m2, which by now are those not found in @arr1;
# this could be just print..for keys %m2; but the order of elements would be lost
print "$_ is missing from text 1\n" for grep { exists $m2{$_} } @arr2;
# likewise for %m1 and @arr1
print "$_ is missing from text 2\n" for grep { exists $m1{$_} } @arr1;

Output:

unklskdjs is missing from text 1
bjkdsl is missing from text 2
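
The hash-slice steps above can be tried without any files. A small sketch using the sample data from the question held in arrays; `only_in` is a hypothetical helper name introduced here for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: elements of @$first that do not occur in @$second,
# returned in the order they appear in @$first.
sub only_in {
    my ($first, $second) = @_;
    my %seen;
    @seen{@$first} = ();       # hash slice: every element becomes a key
    delete @seen{@$second};    # slice delete: drop keys also found in @$second
    # Walk the original array rather than keys %seen to keep its order.
    return grep { exists $seen{$_} } @$first;
}

my @ref = qw(bjkdsl dookn cmshdksldj);      # contents of Ref.txt
my @new = qw(cmshdksldj unklskdjs dookn);   # contents of New.txt

print "$_ is missing from Ref.txt\n" for only_in(\@new, \@ref);
print "$_ is missing from New.txt\n" for only_in(\@ref, \@new);
```

This prints `unklskdjs is missing from Ref.txt` followed by `bjkdsl is missing from New.txt`. As in the answer, a line that occurs several times in one file and never in the other is reported once per occurrence.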