Question

I have a text file made up of groups of lines, and I only need the first three lines of each group.

File:

test1|pass
test1|pass
test1|pass
test1|pass
test1|pass
test2|fail
test2|fail
test2|fail
test2|fail
test3|pass
test3|pass
test3|pass
test3|pass

Expected output:

test1|pass
test1|pass
test1|pass
test2|fail
test2|fail
test2|fail
test3|pass
test3|pass
test3|pass

What I have tried so far:

BEGIN {
        FS = "|"
}
        $1==x {
        if (NR % 5 <= 3) {
                print $0
        }
        next
}
{
        x=$1
        print $0
}

END {
        printf "\n"
}

Answer 1

You can do this concisely:

awk -F'|' '++a[$1] <= 3' infile

Output:

test1|pass
test1|pass
test1|pass
test2|fail
test2|fail
test2|fail
test3|pass
test3|pass
test3|pass

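Explanation

a is an associative array. We use the first field of each line ($1) as the key into a and increment its value. The incremented value is then compared with 3; if the comparison is true, the default action ({print $0}) runs. Because the count is kept per key, this also works when the lines of a group are not adjacent.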
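
To make the mechanics of the one-liner easier to follow, here is a minimal longhand sketch of the same idea. The function name first_three and the array name count are my own choices for illustration and do not come from the answer; the input is assumed to be pipe-delimited as in the question.

```shell
# first_three is a hypothetical helper name, not part of the original answer.
# It reads lines from stdin and keeps at most three lines per value of field 1.
first_three() {
    awk -F'|' '
        {
            count[$1]++            # lines seen so far for this key
            if (count[$1] <= 3)    # same test as ++a[$1] <= 3
                print              # print with no arguments prints $0
        }
    '
}

# The groups do not have to be contiguous: the counter is kept per key.
printf 'test1|pass\ntest2|fail\ntest1|pass\ntest1|pass\ntest1|pass\n' | first_three
```

In this sample call the fourth test1 line is dropped even though a test2 line sits between the test1 lines.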

Answer 2

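BEGIN {
        FS = "|"
}
# Same key as the previous line: print only while fewer than 3 lines were printed
$1 == x {
        if (count < 3) {
                print
                count++
        }
        next
}
# First line of a new group: remember the key, print the line, reset the counter
{
        x = $1
        print
        count = 1
}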
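
The previous-key idea used in the script above can also be written more compactly. The following is only a sketch under the assumption that lines with the same first field are adjacent in the input; the names first_three_sorted, prev and count are hypothetical and chosen for this example.

```shell
# first_three_sorted is a hypothetical helper name used only for this sketch.
# It assumes lines with the same first field are adjacent in the input.
first_three_sorted() {
    awk -F'|' '
        $1 != prev { prev = $1; count = 0 }   # a new key starts a new group
        count++ < 3                           # print the first three lines of the group
    '
}

printf 'test1|pass\ntest1|pass\ntest1|pass\ntest1|pass\ntest2|fail\n' | first_three_sorted
```

Unlike the array-based approach, this keeps only one key and one counter in memory, but a key that reappears later in the file is treated as a new group.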

Answer 3

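Another way using awk. This counts how often each distinct value of $1 occurs and then prints that value at most three times. With the default field separator, $1 is the whole line here because the lines contain no whitespace. The order of for (b in a) is unspecified, hence the sort:

awk '{a[$1]+=1} END{ for (b in a) {for(i=1; i<=a[b] && i<=3; i++) print b} }' temp.txt | sort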

Answer 4

If your data is sorted in ascending order as shown in the question, you can use this Perl code.

#!/usr/bin/perl
use strict;
use warnings;

my $file_name = "file.txt";
my $new_file  = "new_file.txt";
open(my $in,  '<', $file_name) or die "Could not open $file_name: $!";
open(my $out, '>', $new_file)  or die "Could not open $new_file: $!";

my $old_line = "";
my $count    = 0;
while (my $line = <$in>) {
    # A line that differs from the previous one starts a new group
    if ($line ne $old_line) {
        $old_line = $line;
        $count    = 0;
    }
    # Print only the first three lines of each group
    print $out $line if $count++ < 3;
}

close $out;
close $in;

Or

If your data is not sorted, you can use this Perl code:

#!/usr/bin/perl
use strict;
use warnings;

my $file_name = "file.txt";
my $new_file  = "new_file.txt";
open(my $in,  '<', $file_name) or die "Could not open $file_name: $!";
open(my $out, '>', $new_file)  or die "Could not open $new_file: $!";

# Count how many times each distinct line has been seen so far
my %seen;
while (my $line = <$in>) {
    # Print a line only for its first three occurrences
    print $out $line if ++$seen{$line} <= 3;
}

close $out;
close $in;

awk: print three lines from each group of lines

4 answers: