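I'm trying to write a Perl/AWK script that compares two files and produces output in the format shown below.
(So far I can diff the two files with
grep -Fxvf file1 file2 > file3
but that is not enough.)
Note: file1 is a superset of file2.
file1: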
aaaa
bbbb
cccc
dddd
file2:
bbbb
cccc
Expected output file:
aaaa No
bbbb yes
cccc yes
dddd No
Answer 0 (score: 2)
In Perl:
use strict;
use warnings;

# Remember every line of file2.
open ( my $file_2, "<", "file2.txt" ) or die $!;
my %seen;
while ( my $line = <$file_2> ) {
    chomp ( $line );
    $seen{$line}++;
}
close ( $file_2 );

# Print each line of file1, followed by whether it was seen in file2.
open ( my $file_1, "<", "file1.txt" ) or die $!;
while ( my $line1 = <$file_1> ) {
    chomp $line1;
    print $line1, " ", $seen{$line1} ? "yes" : "no", "\n";
}
close ( $file_1 );
This prints:
aaaa no
bbbb yes
cccc yes
dddd no
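You may want to apply a regex to strip whitespace, e.g. $line =~ s/^\s+//g;
but I'm not sure whether the whitespace at the start of the lines is formatting, padding, or actually significant, so I haven't touched it.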
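If the surrounding whitespace turns out not to be significant, one way to normalize it before comparing is sketched below. This is only a sketch, written with awk inside a shell function rather than in Perl. The function name `trim_and_mark` and the variable `key` are made up for illustration, and it assumes that only spaces and tabs at either end of a line need stripping.

```shell
#!/bin/sh
# trim_and_mark is a hypothetical helper name used for illustration only.
# Usage: trim_and_mark FILE2 FILE1   (FILE2 is the subset, FILE1 the superset)
trim_and_mark() {
    awk '
        # For every line of either file: copy it and strip leading and
        # trailing spaces/tabs (assumption: that whitespace is not significant).
        { key = $0; gsub(/^[ \t]+|[ \t]+$/, "", key) }

        # While reading the first file argument, only remember the trimmed line.
        NR == FNR { seen[key]; next }

        # For the second file argument, print the trimmed line and a yes/no flag.
        { print key, (key in seen ? "yes" : "no") }
    ' "$1" "$2"
}

# Demo with padded variants of the sample data from the question.
tmpdir=$(mktemp -d)
printf '  aaaa\nbbbb  \ncccc\n\tdddd\n' > "$tmpdir/file1"
printf ' bbbb\ncccc\n' > "$tmpdir/file2"
trim_and_mark "$tmpdir/file2" "$tmpdir/file1"
rm -rf "$tmpdir"
```

The comparison is done on the trimmed copy, so " bbbb" in one file and "bbbb  " in the other count as the same line. If the padding is meaningful in your data, leave the lines untouched as in the Perl code above.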
Answer 1 (score: 0)
Using awk:
awk 'NR == FNR { a[$0]; next } { print $0, ($0 in a ? "yes" : "no") }' file2 file1
That is:
NR == FNR {                             # while processing the first file
    a[$0]                               # (i.e., file2) just remember what you
    next                                # saw, and don't do anything else
}
{                                       # afterwards:
    print $0, ($0 in a ? "yes" : "no")  # print the line followed by "yes" or
                                        # "no" depending on whether the line
                                        # was seen before in file2
}
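One caveat with the NR == FNR idiom: if file2 happens to be empty, NR still equals FNR while file1 is being read, so every line of file1 is stored and nothing is printed. A possible workaround, sketched below on the assumption that the two files are passed under different names, is to compare FILENAME with ARGV[1] instead. The function name `mark_presence` is hypothetical.

```shell
#!/bin/sh
# mark_presence is a hypothetical helper name used for illustration only.
# Usage: mark_presence FILE2 FILE1   (FILE2 is the subset, FILE1 the superset)
mark_presence() {
    awk '
        # Lines from the first file argument are only remembered.
        # FILENAME == ARGV[1] stays false for the second file even when
        # the first file is empty, unlike NR == FNR.
        FILENAME == ARGV[1] { seen[$0]; next }

        # Lines from the second file argument are printed with a yes/no flag.
        { print $0, ($0 in seen ? "yes" : "no") }
    ' "$1" "$2"
}

# Demo: sample data from the question, then the empty-file2 case.
tmpdir=$(mktemp -d)
printf '%s\n' aaaa bbbb cccc dddd > "$tmpdir/file1"
printf '%s\n' bbbb cccc > "$tmpdir/file2"
: > "$tmpdir/empty"

mark_presence "$tmpdir/file2" "$tmpdir/file1"   # aaaa no, bbbb yes, cccc yes, dddd no
mark_presence "$tmpdir/empty" "$tmpdir/file1"   # every line is flagged "no"
rm -rf "$tmpdir"
```

For the non-empty case this behaves the same as the one-liner above. Change "no" to "No" in the print statement if the output has to match the capitalization in the question exactly.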