Perl / AWK: count fields and split them into a CSV

Time: 2014-04-24 19:23:36

Tags: perl parsing awk

I have a text file with the following contents:

24-04-2014 14:14:47  100-10    clear        "TSP:hfe-tus-02.RtpEvtMgr01: "
24-04-2014 14:15:00  226-8008  information  "APPL:hfe-tus-02.HLR_AFW_SS7_"
24-04-2014 14:15:00  226-9008  information  "APPL:hfe-tus-02.HLR_AFW_SS7_"
24-04-2014 14:15:00  103-88    information  "TSP:hfe-tus-02.RtpRecMgr01: "
24-04-2014 14:15:10  236-434   clear        "APPL:hfe-tus-02.IMS_DIAMETER"
24-04-2014 14:15:10  236-461   clear        "APPL:hfe-tus-02.IMS_DIAMETER"
24-04-2014 14:15:10  236-461   clear        "APPL:hfe-tus-02.IMS_DIAMETER"
24-04-2014 14:15:11  236-435   major        "APPL:hfe-tus-02.IMS_DIAMETER"
24-04-2014 14:15:11  236-464   information  "APPL:hfe-tus-02.IMS_DIAMETER"
24-04-2014 14:15:15  103-91    information  "TSP:hfe-tus-02.RtpRecMgr01: "

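The challenge is to count how often each unique code in column 3 (e.g. 100-10) occurs, and then to track those counts over time in 5-minute intervals. The date is in column 1 and the time is in column 2. That way we get one row per 5-minute interval and one column per code, showing how each code progresses over time. Sample output could look like this: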

Date,100-10,226-8008,226-9008,236-434
24-04-2014 14:00:00,2,5,10,13
24-04-2014 14:05:00,6,4,8,10
24-04-2014 14:10:00,1,8,6,9
24-04-2014 14:15:00,3,4,7,8

Sorry, I am pretty much lost on this. P.S. Column 3 can contain many unique codes; I have shown only a few to keep things simple.
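One way to read "5-minute intervals" is to round each timestamp down to the start of its interval, so that 14:14:47 belongs to 14:10:00. The following is a minimal awk sketch of that step only, assuming the date and time are the first two whitespace-separated fields as in the sample. The function name floor_5min is made up for illustration.

```shell
# floor_5min (hypothetical helper): read "DD-MM-YYYY HH:MM:SS ..." lines on
# stdin and print the start of the 5-minute interval each line falls into.
floor_5min() {
    awk '{
        split($2, t, ":")          # t[1]=hours, t[2]=minutes, t[3]=seconds
        m = t[2] - (t[2] % 5)      # round the minutes down to a multiple of 5
        printf "%s %s:%02d:00\n", $1, t[1], m
    }'
}

printf '%s\n' \
    '24-04-2014 14:14:47  100-10   clear' \
    '24-04-2014 14:15:11  236-435  major' | floor_5min
# prints:
# 24-04-2014 14:10:00
# 24-04-2014 14:15:00
```

Because only the minutes are rounded, no date arithmetic is needed and the sketch does not depend on gawk-specific time functions.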

===

Answer

My code looks like this, and it works too, so after a few days I thought I would share it.

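# Shell commands: list the unique codes, and select lines within a time range
cut -f4 -d " " RtpFile | sort -u
awk '$0>=from&&$0<=to' from="2014/03/20 15:13" to="2014/08/19 14:31" infile

# Perl fragment: collect the codes and the names of the files for one day
my $fields = `cut -c 28-38 /dump/TspTrace/RtpTrcError/RtpTrcError.0090 | sort -u`;  # cut columns to get codes
my @arr = split ' ', $fields;    # split on any whitespace, including newlines
my $files1 = `ls -lrt /dump/TspTrace/RtpTrcError/ | grep "Apr 24" | cut -c 55-70`;
my @files = split ' ', $files1;

system(": > /tmp/Output.txt");   # truncate the output file
foreach (@files) {
    system("cat /dump/TspTrace/$_ >> /tmp/Output.txt");
}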

1 answer:

Answer 0 (score: 0)

You can try the following Perl script:

#! /usr/bin/perl

use v5.14;

use Time::Piece;

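my $fmt="%d-%m-%Y %T";   # timestamp layout: DD-MM-YYYY HH:MM:SS
# Hard-coded start of the first interval; assumes no line is earlier than this.
my $startTime = Time::Piece->strptime( "24-04-2014 14:00:00", $fmt);
my $inc=5*60;            # interval length in seconds (5 minutes)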

my @lines=<>;

my ($ids,$hids)=getIds(\@lines);

my $endTime=getEndTime(\@lines,$startTime, $fmt);

my $dates=getDates($startTime,$inc,$endTime,$fmt,$#$ids+1);

doCount(\@lines,$dates,$startTime,$inc, $fmt,$hids);

print "Date,", join(",",@$ids),"\n";

for my $date (@$dates) {
    print $date->{name},",";
    my $info=$date->{ids};
    print join(",",@$info),"\n";
}


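# For each input line, increment the counter for its code in the interval
# that contains the line's timestamp.
sub doCount {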
  my ($lines,$dates,$startTime,$inc, $fmt,$h) = @_;

  for (@$lines) {
      my @fld=split(" ");
      my $id=$fld[2];
      my $d=join(" ",@fld[0..1]);
      my $t = Time::Piece->strptime( $d, $fmt);
      my $s=$t-$startTime;
      my $ind=int($s/$inc);
      my $k=$h->{$id};
      $dates->[$ind]->{ids}->[$k]+=1;
  }
}

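# Build one row per interval, from the start time through the last timestamp,
# with every per-code counter initialised to 0.
sub getDates {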
  my ($startTime,$inc,$endTime, $fmt,$len) = @_;
  my $t=0; my $time=$startTime;
  my @d;
  while ($t<=$endTime) {
      push (@d,{name=> $time->strftime($fmt), ids => [(0) x $len]});
      $time=$time+$inc;
      $t=$t+$inc;
  }
  return \@d;
}

# Return the largest offset, in seconds, of any line's timestamp from the start time.
sub getEndTime {
  my ($lines,$startTime, $fmt) = @_;

  my $max=0;
  for (@$lines) {
      my $d=join(" ",(split(" "))[0..1]);
      my $t = Time::Piece->strptime( $d, $fmt);
      my $s=$t-$startTime;
      if ($s>$max) {
          $max=$s;
      }
  }
  return $max;
}

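# Collect the distinct codes from column 3 (sorted) and a hash that maps
# each code to its column index in the output.
sub getIds {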
  my ($lines) = @_;

  my %h;
  for (@$lines) {
      my $id=(split(" "))[2];
      $h{$id}=1;
  }
  my @ids=sort keys %h;
  my %hids= map { $ids[$_] => $_ } 0..$#ids;
  return (\@ids,\%hids);
}

Run it from the command line as ./p.pl file, where file is your text file.
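Since the question is also tagged awk, here is a rough single-pass awk sketch of the same idea, offered as an alternative rather than a translation of the Perl script above. It assumes the input is in chronological order and that the code is always the third whitespace-separated field. The function name count_by_interval is invented for illustration. Unlike the Perl script, it lists codes in order of first appearance instead of sorted, and it prints only intervals that actually contain data.

```shell
# count_by_interval FILE (hypothetical helper): print a CSV with one row per
# 5-minute interval and one column per code found in field 3.
count_by_interval() {
    awk '
    NF >= 3 {
        split($2, t, ":")
        # Interval start: same date and hour, minutes rounded down to a multiple of 5.
        bucket = sprintf("%s %s:%02d:00", $1, t[1], t[2] - (t[2] % 5))
        # Remember intervals and codes in the order they first appear.
        if (!(bucket in seenb)) { seenb[bucket] = 1; buckets[++nb] = bucket }
        if (!($3 in seenc))     { seenc[$3] = 1;     codes[++nc] = $3 }
        count[bucket, $3]++
    }
    END {
        printf "Date"
        for (j = 1; j <= nc; j++) printf ",%s", codes[j]
        print ""
        for (i = 1; i <= nb; i++) {
            printf "%s", buckets[i]
            # A combination that never occurred is empty and prints as 0.
            for (j = 1; j <= nc; j++) printf ",%d", count[buckets[i], codes[j]]
            print ""
        }
    }' "$1"
}
```

It can be called as count_by_interval file > out.csv. If rows for empty intervals are needed, as the Perl script produces, the END block would have to generate the interval list itself instead of relying on the intervals seen in the input.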