Calculating the time difference between columns - Bash

Date: 2018-12-17 08:19:50

Tags: bash shell awk sed

I have seen many answers, but none of them solves my problem. This is my file:

cat run_time
Done  City        Start_time              End_time  
Yes   Chicago     10:16:51,14-Dec-2018   10:19:38,14-Dec-2018        
Yes   Atlanta     10:12:58,14-Dec-2018   10:20:58,14-Dec-2018               
No    Minnetonka  10:16:38,14-Dec-2018   10:21:50,14-Dec-2018        
Yes   Hopkins     10:22:20,14-Dec-2018   10:18:11,14-Dec-2018

When I compute the difference by hand, everything works.

TO=$(date -d "10:16:58 14-Dec-2018" +%s)
TAL=$(date -d "10:16:50 14-Dec-2018" +%s)
TOTAL=$(( TO - TAL ))
echo "$TOTAL"
8

But whenever I try to integrate this into an awk command, I get errors.

First, I removed the comma between the time and the date.

sed -i -e 's/,/ /g' run_time
Done  City        Start_time              End_time  
Yes   Chicago     10:16:51 14-Dec-2018   10:19:38 14-Dec-2018                
Yes   Atlanta     10:12:58 14-Dec-2018   10:20:58 14-Dec-2018               
No    Minnetonka  10:16:38 14-Dec-2018   10:21:50 14-Dec-2018        
Yes   Hopkins     10:22:20 14-Dec-2018   10:18:11 14-Dec-2018

Running the following awk command prints this:

awk 'BEGIN { OFS = "\t" } NR == 1 { $7 = "Time_diff" } NR >= 2 { $7 = "$3,$4" - "$5,$6" } 1' < run_time|column -t
Done  City        Start_time              End_time               Time_diff
Yes   Chicago     10:16:51 14-Dec-2018  10:19:38 14-Dec-2018  
Yes   Atlanta     10:12:58 14-Dec-2018  10:20:58 14-Dec-2018  
No    Minnetonka  10:16:38 14-Dec-2018  10:21:50 14-Dec-2018  
Yes   Hopkins     10:22:20 14-Dec-2018  10:18:11 14-Dec-2018  

My goal is to compute the time difference and add it under Time_diff.

6 answers:

Answer 0 (score: 2)

$ cat input.txt 
Done  City        Start_time              End_time  
Yes   Chicago     10:16:51,14-Dec-2018   10:19:38,14-Dec-2018        
Yes   Atlanta     10:12:58,14-Dec-2018   10:20:58,14-Dec-2018               
No    Minnetonka  10:16:38,14-Dec-2018   10:21:50,14-Dec-2018        
Yes   Hopkins     10:22:20,14-Dec-2018   10:18:11,14-Dec-2018



$ cat diff_time.awk
BEGIN{
    print "Done City Start_time End_time Time_diff"
}
{
    if(!/^Do/){
        diff_time=0
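        # "10:16:51,14-Dec-2018" -> "10:16:51 14-Dec-2018"
        start_full=substr($3,1,8)" "substr($3,10,11)
        end_full=substr($4,1,8)" "substr($4,10,11)
        # q holds a double quote (passed with -v q='"').
        # close() each command so that a repeated command string is run again
        # instead of returning nothing and leaving the previous value in place.
        cmd="date -d "q start_full q" +%s"; cmd|getline start_epoc; close(cmd)
        cmd="date -d "q end_full q" +%s"; cmd|getline end_epoc; close(cmd)
        diff_time= end_epoc - start_epoc
        if(diff_time<0){
            diff_time=diff_time*-1
        }
        cmd="date -d@"diff_time" -u +%H:%M:%S"; cmd|getline diff_h; close(cmd)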
        print $0,diff_h
        }
}



$ awk -v q='"' -f diff_time.awk input.txt |column -t
Done  City        Start_time            End_time              Time_diff
Yes   Chicago     10:16:51,14-Dec-2018  10:19:38,14-Dec-2018  00:02:47
Yes   Atlanta     10:12:58,14-Dec-2018  10:20:58,14-Dec-2018  00:08:00
No    Minnetonka  10:16:38,14-Dec-2018  10:21:50,14-Dec-2018  00:05:12
Yes   Hopkins     10:22:20,14-Dec-2018  10:18:11,14-Dec-2018  00:04:09
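
One caveat with the final formatting step: date -d@N -u +%H:%M:%S prints a time of day, so a difference of 24 hours or more wraps around (90000 seconds would print as 01:00:00). Below is a minimal sketch of the same formatting done with plain arithmetic, assuming only POSIX awk. The function name hms is made up for illustration, and unlike the script above it keeps the sign instead of taking the absolute value.

```shell
#!/bin/bash
# Hypothetical helper (not part of the answer): format a number of seconds
# as [-]HH:MM:SS using arithmetic only, so the hours do not wrap at 24.
hms() {
  awk -v s="$1" 'BEGIN {
    sign = (s < 0) ? "-" : ""
    if (s < 0) s = -s
    h   = int(s / 3600)          # whole hours
    m   = int((s % 3600) / 60)   # whole minutes left after the hours
    sec = s % 60                 # remaining seconds
    printf "%s%02d:%02d:%02d\n", sign, h, m, sec
  }'
}

hms 167     # 00:02:47
hms -249    # -00:04:09
hms 90000   # 25:00:00
```

This also avoids starting one date process per row just for formatting.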

Answer 1 (score: 1)

A small script can do it:

#!/bin/bash
(
TOTAL=0
while read -r line
do
  if [ "`echo $line|grep ^Done`" != "" ]
  then
    echo "$line"
  else
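    # Field 3 is Start_time, field 4 is End_time; the comma becomes a space so date can parse it.
    TAL=$(date -d "$(echo $line|tr -s " "|cut -d " " -f 3|tr "," " ")" +%s)
    TO=$(date -d "$(echo $line|tr -s " "|cut -d " " -f 4|tr "," " ")" +%s)
    SUBTOTAL=$(( TO - TAL ))   # end minus start, in seconds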
    echo "$line $SUBTOTAL"
    TOTAL=$(( $TOTAL + $SUBTOTAL ))
  fi
done
echo $TOTAL
) <run_time

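Explanation: the script reads each line of run_time into the variable line. A line starting with Done (the table header) is simply printed as is. For every other line, it squeezes repeated spaces (tr -s " "), extracts the third (cut -d " " -f 3) or fourth (cut -d " " -f 4) field, replaces the , with a space, and then uses the same formula you provided to compute the start and end timestamps and their difference. Finally, it prints the difference next to the line. Along the way, the sum of all differences is accumulated in TOTAL and printed at the end.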
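
The per-line steps described above can also be tried on a single row. This is only a sketch: it assumes bash and GNU date (the -d option), the helper name stamp_diff is hypothetical, and it uses read and a parameter expansion in place of the tr/cut pipeline. The -u flag pins the conversion to UTC so that a daylight-saving change cannot skew the difference.

```shell
#!/bin/bash
# Hypothetical helper: seconds from the first "HH:MM:SS,DD-Mon-YYYY" stamp
# to the second one. Assumes GNU date.
stamp_diff() {
  local start end
  start=$(date -u -d "${1/,/ }" +%s) || return 1   # ${1/,/ } swaps the comma for a space
  end=$(date -u -d "${2/,/ }" +%s) || return 1
  echo $(( end - start ))
}

line="Yes   Chicago     10:16:51,14-Dec-2018   10:19:38,14-Dec-2018"
# read splits on runs of whitespace, so the four columns land in four variables
read -r _done _city start_time end_time <<< "$line"
stamp_diff "$start_time" "$end_time"   # prints 167
```

A row whose end time precedes its start time, like the Hopkins row, yields a negative number here.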

Answer 2 (score: 1)

Using gawk (not POSIX awk, which has no time functions).

Self-documenting code:

awk '
   function convert2time ( ArgStrHr ) {
      # mktime format used "YYYY MM DD HH MM SS [DST]"
      # time format provided "10:16:51,14-Dec-2018"
      # extract element in a array
      T=split( ArgStrHr, aElt, /[-: ,]/ )

      # return the conversion
      return mktime( sprintf( "%4d %2d %2d %2d %2d %2d", aElt[6], aMonth[ aElt[5] ], aElt[4], aElt[1], aElt[2], aElt[3] ) )
      }

   BEGIN {
      # Month-name lookup used by the convert function
      split( "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", aTemp )
       # invert aTemp[i]="month" into aMonth["month"]=i
      for ( Idx in aTemp ) aMonth[ aTemp[ Idx] ] = Idx
      }

   FNR==1 { $(NF + 1) = "Difference" }

   FNR!=1 {
      # convert both timestamps to epoch seconds
      T1 = convert2time( $3 )
      T2 = convert2time( $4 )
      # add a field with difference
      $(NF + 1) =  T2 - T1
      }

   # print lines
   1    
   ' YourFile
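
To make the field mapping inside convert2time concrete, here is a small sketch that stops just before the mktime call and prints the "YYYY MM DD HH MM SS" string that would be passed to it. It needs only POSIX awk, so it runs without gawk. The function name datespec is hypothetical and the month table mirrors aMonth.

```shell
#!/bin/bash
# Hypothetical helper: turn "HH:MM:SS,DD-Mon-YYYY" into the
# "YYYY MM DD HH MM SS" string that gawk's mktime() expects.
datespec() {
  awk -v stamp="$1" 'BEGIN {
    # month name -> month number
    n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
    for (i = 1; i <= n; i++) month[names[i]] = i
    # p[1..3] = HH MM SS, p[4] = DD, p[5] = Mon, p[6] = YYYY
    split(stamp, p, "[-:,]")
    printf "%d %d %d %d %d %d\n", p[6], month[p[5]], p[4], p[1], p[2], p[3]
  }'
}

datespec "10:16:51,14-Dec-2018"   # 2018 12 14 10 16 51
```

An unknown month name would come out as 0 here, which is worth checking for before trusting the result.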

Answer 3 (score: 1)

Assuming your Input_file is:

cat Input_file
Done  City        Start_time              End_time
Yes   Chicago     10:16:51,14-Dec-2018   10:19:38,14-Dec-2018
Yes   Atlanta     10:12:58,14-Dec-2018   10:20:58,14-Dec-2018
No    Minnetonka  10:16:38,14-Dec-2018   10:21:50,14-Dec-2018
Yes   Hopkins     10:22:20,14-Dec-2018   10:18:11,14-Dec-2018

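The script follows these rules:

  1. The code checks which date column holds the larger value and subtracts accordingly. For example, if the time in the last column is greater than the time in the second-to-last column, it computes last_col_time - second_last_col_time, and vice versa.

  2. I convert the month in a value such as 14-Dec-2018 to lowercase, so it works whatever case the month is written in (lowercase, uppercase, or mixed).

  3. I did not hardcode columns 3 and 4 in the code, because a city name in column 2 may contain spaces. Instead, I take the values from the end of the line: $(NF-1) (second-to-last column) and $NF (last column).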
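
Rule 3 can be checked in isolation. The sketch below uses a made-up row with a three-word city name; the helper name last_two is hypothetical and only POSIX awk is assumed.

```shell
#!/bin/bash
# Hypothetical helper: print the last two whitespace-separated fields of
# each input line, however many words the city name contains.
last_two() {
  awk '{ print $(NF-1), $NF }'
}

# The city spans three fields, so Start_time is $5 here, not $3.
echo 'Yes   St Louis Park   10:16:51,14-Dec-2018   10:19:38,14-Dec-2018' | last_two
# prints: 10:16:51,14-Dec-2018 10:19:38,14-Dec-2018
```

Counting from the end works because the two timestamps never contain spaces in this file format.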

All of this is done by the following:

awk '
BEGIN{
  num=split("jan,feb,mar,apr,may,jun,jul,aug,sep,oct,nov,dec",month,",")
  for(i=1;i<=12;i++){
    a[month[i]]=i
  }
}
FNR==1{
  print $0,"Time_diff"
  next
}
{
  split($(NF-1),array,"[:,-]")
  split($(NF),array1,"[:,-]")
  val=mktime(array[6]" "a[tolower(array[5])]" "array[4]" "array[1]" "array[2]" "array[3])
  val1=mktime(array1[6]" "a[tolower(array1[5])]" "array1[4]" "array1[1]" "array1[2]" "array1[3])
  delta=val>=val1?val-val1:val1-val
  hrs = int(delta/3600)
  min = int((delta - hrs*3600)/60)
  sec = delta - (hrs*3600 + min*60)
  printf "%s\t%02d:%02d:%02d\n", $0, hrs, min, sec
  hrs=min=sec=delta=""
}
'  Input_file  | column -t

The output is as follows.

Done  City        Start_time            End_time              Time_diff
Yes   Chicago     10:16:51,14-Dec-2018  10:19:38,14-Dec-2018  00:02:47
Yes   Atlanta     10:12:58,14-Dec-2018  10:20:58,14-Dec-2018  00:08:00
No    Minnetonka  10:16:38,14-Dec-2018  10:21:50,14-Dec-2018  00:05:12
Yes   Hopkins     10:22:20,14-Dec-2018  10:18:11,14-Dec-2018  00:04:09

Explanation of the above code (apologies, you will need to scroll to the right here):

awk '                                                                                               ##Starting awk code here.
BEGIN{                                                                                              ##Mentioning BEGIN section of awk here.
  num=split("jan,feb,mar,apr,may,jun,jul,aug,sep,oct,nov,dec",month,",")                            ##Creating month array which have months value in it.
  for(i=1;i<=12;i++){                                                                               ##Starting a for loop for covering 12 months.
    a[month[i]]=i                                                                                   ##Creating an array a whose index is month value and value is i.
  }                                                                                                 ##Closing for loop block here.
}                                                                                                   ##Closing BEGIN section block here.
FNR==1{                                                                                             ##Checking if this is first line.
  print $0,"Time_diff"                                                                              ##Printing current line with string Time_diff here.
  next                                                                                              ##next will skip all further statements from here.
}                                                                                                   ##Closing FNR condition block here.
{
  split($(NF-1),array,"[:,-]")                                                                      ##Splitting 2nd last column to array named array.
  split($(NF),array1,"[:,-]")                                                                       ##Splitting last column to array named array1 with delimiter as : or , or -
  val=mktime(array[6]" "a[tolower(array[5])]" "array[4]" "array[1]" "array[2]" "array[3])           ##Creating val which have mktime value,passing array elements.
  val1=mktime(array1[6]" "a[tolower(array1[5])]" "array1[4]" "array1[1]" "array1[2]" "array1[3])    ##Creating val1 variable by mktime passing array1 elements.
  delta=val>=val1?val-val1:val1-val                                                                 ##getting diff of val and val1 depending upon highest-lowest value
  hrs = int(delta/3600)                                                                             ##getting diff in hours if any.
  min = int((delta - hrs*3600)/60)                                                                  ##getting diff in min if any.
  sec = delta - (hrs*3600 + min*60)                                                                 ##getting diff in seconds value.
  printf "%s\t%02d:%02d:%02d\n", $0, hrs, min, sec                                                  ##Printing line and value of hrs,min and sec values here.
  hrs=min=sec=delta=""                                                                              ##Nullifying variables values here.
}
'  Input_file | column -t                                                              ##Mentioning Input_file and passing it to column command for TAB format in output.

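Answer 4 (score: 1)

I don't know what it means for an end time to be earlier than the start time, but this uses GNU awk for the time functions and marks that case with a leading "-" on the time difference in the output: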

$ cat tst.awk
BEGIN { OFS="\t" }
NR==1 {
    print $0, "Time_diff"
    next
}
{
    for (i=NF-1; i<=NF; i++) {
        split($i,t,/[:,-]/)
        t[5] = (index("JanFebMarAprMayJunJulAugSepOctNovDec",t[5])+2)/3
        secs[i] = mktime(t[6]" "t[5]" "t[4]" "t[1]" "t[2]" "t[3])
    }

    sign = " "
    totSecsDiff = secs[NF] - secs[NF-1]
    if (totSecsDiff < 0) {
        sign = "-"
        totSecsDiff = 0 - totSecsDiff
    }

    hrsDiff  = int(totSecsDiff / (60*60))
    minsDiff = int((totSecsDiff - (hrsDiff*60*60)) / 60)
    secsDiff = totSecsDiff - (hrsDiff*60*60 + minsDiff*60)
    hmsDiff  = sprintf("%s%02d:%02d:%02d",sign,hrsDiff,minsDiff,secsDiff)

    print $0, hmsDiff
}

$ awk -f tst.awk file
Done  City        Start_time              End_time      Time_diff
Yes   Chicago     10:16:51,14-Dec-2018   10:19:38,14-Dec-2018    00:02:47
Yes   Atlanta     10:12:58,14-Dec-2018   10:20:58,14-Dec-2018    00:08:00
No    Minnetonka  10:16:38,14-Dec-2018   10:21:50,14-Dec-2018    00:05:12
Yes   Hopkins     10:22:20,14-Dec-2018   10:18:11,14-Dec-2018   -00:04:09
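
The month lookup in tst.awk relies on a small arithmetic trick: in the string "JanFebMar...Dec" each name starts at position 1, 4, 7, and so on, so (position + 2) / 3 gives the month number. A minimal sketch to try it on its own, using only POSIX awk; the function name month_num is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: map a three-letter English month name to 1..12
# with the same index() arithmetic used in tst.awk.
month_num() {
  awk -v m="$1" 'BEGIN {
    # index() returns the 1-based position of m in the string, or 0 if absent
    print (index("JanFebMarAprMayJunJulAugSepOctNovDec", m) + 2) / 3
  }'
}

month_num Jan   # 1
month_num Aug   # 8
month_num Dec   # 12
```

A name that is not in the string gives a fraction instead of an integer, so this only behaves well for correctly capitalised three-letter names.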

Answer 5 (score: 1)

Using Perl core modules:

> cat kwa_time.in
Done  City        Start_time              End_time
Yes   Chicago     10:16:51,14-Dec-2018   10:19:38,14-Dec-2018
Yes   Atlanta     10:12:58,14-Dec-2018   10:20:58,14-Dec-2018
No    Minnetonka  10:16:38,14-Dec-2018   10:21:50,14-Dec-2018
Yes   Hopkins     10:22:20,14-Dec-2018   10:18:11,14-Dec-2018
> cat ./time_diff.sh
perl  -lane '
BEGIN {
use POSIX;
use Time::Local;
@months = qw( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec );
}
print "$_\tTime_diff" if $.==1;
if($.>1)
{
$dte="$F[-1]";
$dts="$F[-2]";
$dte=~s/(\d+):(\d+):(\d+),(\d+)-(\S+)-(\d+)/timelocal($3,$2,$1,$4,(grep { $5 eq $months[$_] } 0..$#months)[0],$6-1900)/ge;
$dts=~s/(\d+):(\d+):(\d+),(\d+)-(\S+)-(\d+)/timelocal($3,$2,$1,$4,(grep { $5 eq $months[$_] } 0..$#months)[0],$6-1900)/ge;
$diff = abs($dte-$dts);
$hd=int $diff/3600;
$md=int (($diff-($hd*3600))/60);
$sd=int ($diff - ($hd*3600+$md*60));
printf("%s %02d:%02d:%02d\n",join(" ",@F),$hd,$md,$sd);
}
' $1
> ./time_diff.sh kwa_time.in | column -t
Done  City        Start_time            End_time              Time_diff
Yes   Chicago     10:16:51,14-Dec-2018  10:19:38,14-Dec-2018  00:02:47
Yes   Atlanta     10:12:58,14-Dec-2018  10:20:58,14-Dec-2018  00:08:00
No    Minnetonka  10:16:38,14-Dec-2018  10:21:50,14-Dec-2018  00:05:12
Yes   Hopkins     10:22:20,14-Dec-2018  10:18:11,14-Dec-2018  00:04:09
>