How to format a file with awk, sed, or a Perl script to get the desired output

Time: 2013-03-13 22:20:34

Tags: perl shell unix sed awk

I have a file like this:

101 start_time
102 start_time
101 end_time
103 start_time
103 end_time
102 end_time
104 start_time
104 end_time
102 start_time
102 end_time

I want an output file like this:

101 start_time end_time
102 start_time end_time
103 start_time end_time
104 start_time end_time
102 start_time end_time

How can I do this with basic sed or awk operations, or with Perl? Please help!

3 answers:

Answer 0: (score: 2)

How about:

awk '$1 in a{ print $1, a[$1], $2; delete a[$1]; next} {a[$1] = $2}' input
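To make the one-liner easier to follow, here is a minimal sketch that wraps the same awk program in a shell function and runs it on the sample data from the question. The function name `pair_by_end` is hypothetical and only for illustration; the awk logic itself is unchanged.

```shell
#!/bin/sh
# Hypothetical wrapper name; the awk program is the one-liner from this answer.
pair_by_end() {
  awk '
    # id already stored: this is its end line, so print the pair and forget the id
    $1 in a { print $1, a[$1], $2; delete a[$1]; next }
    # otherwise this is a start line: remember its second field
    { a[$1] = $2 }
  ' "$@"
}

# Sample data from the question
printf '%s\n' \
  '101 start_time' '102 start_time' '101 end_time' \
  '103 start_time' '103 end_time' '102 end_time' \
  '104 start_time' '104 end_time' \
  '102 start_time' '102 end_time' | pair_by_end
```

Each pair is printed when its end line arrives, so on this sample the 103 line comes out before the first 102 line. If the output must follow the order of the start lines, as in the question, the pairs need to be queued before printing.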

Answer 1: (score: 0)

perl -anE'say "@F end_time" if $F[1] eq "start_time"'

Answer 2: (score: 0)

Here is a Perl approach.

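  • Note 1: It is not very elegantly written, but it works.
  • Note 2: My answer assumes that "start_time" and "end_time" are not literal strings but some kind of timestamp or similar value.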

Here you go:

#!/usr/bin/perl
use warnings;
use strict;

my @waiting; #here we will keep track of the order
my %previous; #here we will save previous rows that still can't be printed
open (my $IN,'<','file.txt') or die "$!"; #assuming that your data is in file.txt
while (<$IN>) {
    chomp;
    my ($id,$time)=split/ /;
    if (exists $previous{$id}) { #if this is the end_time
        $previous{$id}->[1]=$time;
        if ($waiting[0] eq $id) { #if we are not waiting for another row's end_time
            my $count=0;
            for (@waiting) { #print anything you have available
                last if !defined $previous{$_}->[1];
                print join(' ',$_,@{$previous{$_}}),"\n";
                delete $previous{$_};
                $count++;
            }
            shift @waiting for 1..$count;
        }
    }
    else { #if this is the start_time
        push @waiting,$id;
        $previous{$id}=[$time,undef];
    }
}
close $IN;
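The queueing idea in the script above (hold finished pairs until every earlier start has also finished) can also be sketched in awk. This is only an illustration under the same assumption that each id's start line precedes its end line; the function name `pair_in_start_order` and the array names are hypothetical, not part of any answer.

```shell
#!/bin/sh
# Hypothetical sketch: print pairs in the order their start lines appeared.
pair_in_start_order() {
  awk '
    # Start line: the id is not currently open, so append it to the queue.
    !($1 in slot) {
      n++
      slot[$1] = n      # queue position of this open id
      id[n] = $1
      start[n] = $2
      next
    }
    # End line: record the end value, then flush finished entries from the head.
    {
      k = slot[$1]
      stop[k] = $2
      done[k] = 1
      delete slot[$1]   # the id may start again later
      while ((head + 1) in done) {
        head++
        print id[head], start[head], stop[head]
        delete id[head]; delete start[head]
        delete stop[head]; delete done[head]
      }
    }
  ' "$@"
}

# Sample data from the question
printf '%s\n' \
  '101 start_time' '102 start_time' '101 end_time' \
  '103 start_time' '103 end_time' '102 end_time' \
  '104 start_time' '104 end_time' \
  '102 start_time' '102 end_time' | pair_in_start_order
```

On the sample data this prints the pairs as 101, 102, 103, 104, 102, matching the order requested in the question. Because the second field is stored and printed as-is, the same sketch should work when the values are real timestamps rather than the literal words.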