I have 2 tab-delimited files, as shown below.
First file:
raj krishna 2345 19041884
dev sri 1573 13894083
dev ravi 1232 54445434
Second file:
dev sri 1573 42334334
kar ham 3214 45354354
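I want to delete every line in the first file whose first 3 fields match the first 3 fields of a line in the second file. After the deletion, the first file should look like this: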
raj krishna 2345 19041884
dev ravi 1232 54445434
Can anyone tell me how to do this in Perl or a shell script?
Thanks
Answer 0: (score: 1)
This will do it:
$ awk 'NR == FNR{a[$3];next} !($3 in a)' file2 file1
raj krishna 2345 19041884
dev ravi 1232 54445434
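While reading file2 (when NR == FNR), it stores each line's 3rd field as a key in the array a. Then, for file1, it prints only the lines whose 3rd field is not in a. Note that this compares the 3rd field only, not all of the first 3 fields, which happens to be enough for the sample data.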
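Since the question asks for a match on the first 3 fields, here is a minimal sketch of a variant that uses all three as a composite key. It assumes the fields really are separated by single tabs and that file2 is not empty (with an empty file2, NR == FNR would also be true for file1 and nothing would be printed). The sample files, the tmpdir directory, and the names seen and result are only there for illustration.

```shell
#!/bin/sh
# Illustrative sample data, written with explicit tabs (\t).
tmpdir=$(mktemp -d)
printf 'raj\tkrishna\t2345\t19041884\ndev\tsri\t1573\t13894083\ndev\travi\t1232\t54445434\n' > "$tmpdir/file1"
printf 'dev\tsri\t1573\t42334334\nkar\tham\t3214\t45354354\n' > "$tmpdir/file2"

# First pass (file2): remember each combination of fields 1-3.
# Second pass (file1): print only lines whose fields 1-3 were not remembered.
result=$(awk -F'\t' '
    NR == FNR { seen[$1, $2, $3]; next }
    !(($1, $2, $3) in seen)
' "$tmpdir/file2" "$tmpdir/file1")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

To actually replace the first file rather than just print the result, one common approach is to redirect the awk output to a temporary file and then move it over file1.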
Answer 1: (score: 1)
A Perl solution. I packaged it as a test, so you can... test it.
#!/usr/bin/perl
use strict;
use warnings;
use autodie qw( open);
use Test::More tests => 1;
# I initialize the data within the test
# the real code would skip this, and open the real files instead
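# fields are tab-separated; the tabs are written as \t so they survive copy and paste
my $file1="raj\tkrishna\t2345\t19041884
dev\tsri\t1573\t13894083
dev\travi\t1232\t54445434
";
my $file2="dev\tsri\t1573\t42334334
kar\tham\t3214\t45354354
";
my $expected="raj\tkrishna\t2345\t19041884
dev\travi\t1232\t54445434
";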
my $file_out;
open( my $in1, '<', \$file1); # read from a string
open( my $in2, '<', \$file2);
open( my $out, '>', \$file_out); # write to a string
# below is the real code
# load the list of "records" to remove
# the key for each line is its first 3 tab-separated fields (see line_to_key below)
my %to_remove= map { line_to_key( $_) => 1 } <$in2>;
while( my $line=<$in1>)
{ print {$out} $line unless $to_remove{line_to_key( $line)}; }
close $out;
# test whether we got what we wanted
is( $file_out, $expected, 'basic test');
# the "key": split on tab, then join the first 3 fields, again tab separated
sub line_to_key
{ my( $line)= @_;
my @fields= split /\t/, $line;
my $key= join "\t", @fields[0..2];
return $key;
}