Computing statistics in bash

Time: 2017-10-09 08:17:18

Tags: bash awk

I have a file with about 1000 lines that look very much like this:

0,23423423,7ds5dsfdf,2008-08-03,19:00:01,101,hJ890
1,54645645,f9g8f9gd7,2008-08-03,19:00:20,113,Lg78s
1,54645645,f9g8f9gd7,2008-08-03,19:00:09,108,Lg78s
0,54645645,f9g8f9gd7,2008-08-03,19:00:01,130,dsf98
1,54645645,f9g8f9gd7,2008-08-03,19:00:20,105,Lg78s

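The column after the time holds a number of seconds. How can I total the seconds for each date and time in the file, ordered from the earliest to the latest? For example, I should get something like: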

The date Sun Aug  3 19:00:01 EEST 2008 has 231 seconds
The date Sun Aug  3 19:00:09 EEST 2008 has 108 seconds
The date Sun Aug  3 19:00:20 EEST 2008 has 218 seconds

I tried something like this:

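while IFS= read -r line
do
    # Extract "date,time" (columns 4 and 5) from the current line
    date=$(printf '%s\n' "$line" | awk -F "," '{print $4","$5}')
    # Collect every line of the file that has the same date and time
    var=$(grep "$date" file)
done < file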

Once I have found the instances of a particular date, how do I pick out the seconds that belong to it?

Thanks!

2 answers:

Answer 0 (score: 4)

You can use this awk command:

awk -F, '{cmd="date -d \"" $4 " " $5 "\""; cmd | getline dt; close(cmd); a[dt] += $6}
END{for (i in a) print i " has " a[i] " seconds"}' file

Sun Aug  3 19:00:09 EDT 2008 has 108 seconds
Sun Aug  3 19:00:20 EDT 2008 has 218 seconds
Sun Aug  3 19:00:01 EDT 2008 has 231 seconds

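This awk command:
- uses a comma as the input field separator;
- builds the date string from columns 4 and 5 and passes it to date -d;
- uses an associative array whose keys are the formatted dates and whose values are the running sums of the seconds column.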
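The accumulation step described above can be seen in isolation in the following minimal sketch. It is not the answer's command: it keys the array on the raw text of columns 4 and 5 instead of calling date, and the function name sum_seconds is a hypothetical helper introduced only for illustration.

```shell
# Hypothetical helper: total column 6 per date/time key without calling date.
sum_seconds() {
    awk -F, '
        # Key on the raw "date time" text taken from columns 4 and 5.
        { total[$4 " " $5] += $6 }
        # for-in order is unspecified in awk, so the output is unsorted.
        END { for (key in total) print key " has " total[key] " seconds" }
    ' "$1"
}
```

Calling sum_seconds file prints one line per distinct timestamp, in no particular order.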

Reference: Effective AWK Programming

If you want the output sorted by date, combine awk, sort, and cut like this:

awk -F, '{s=$4 " " $5; cmd="date -d \"" s "\""; cmd | getline dt; close(cmd);
a[dt] += $6; b[dt]=s} END{for (i in a) print b[i] ";" i " has " a[i] " seconds"}' file |
sort -t ';' -k1,2 |
cut -d ';' -f2-

Sun Aug  3 19:00:01 EDT 2008 has 231 seconds
Sun Aug  3 19:00:09 EDT 2008 has 108 seconds
Sun Aug  3 19:00:20 EDT 2008 has 218 seconds
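One possible variation on the pipeline above, offered as a sketch and not as the answer's own method: because timestamps in YYYY-MM-DD HH:MM:SS form sort chronologically as plain text, the sums can be computed and sorted first, and date then runs once per distinct timestamp instead of once per input line. The function name summarize_sorted is hypothetical, and the sketch assumes a date that accepts -d (as GNU date does) and input fields that contain no semicolons.

```shell
# Hypothetical helper: sum column 6 per timestamp, sort, then format each key once.
# Assumes a date command that supports -d, such as GNU date.
summarize_sorted() {
    # Emit "timestamp;sum" pairs, sort them by the timestamp text,
    # then format every distinct timestamp with a single date call.
    awk -F, '{ sum[$4 " " $5] += $6 }
             END { for (k in sum) print k ";" sum[k] }' "$1" |
    sort -t ';' -k1,1 |
    while IFS=';' read -r stamp total
    do
        printf 'The date %s has %s seconds\n' "$(date -d "$stamp")" "$total"
    done
}
```

Running summarize_sorted file prints lines in the format requested in the question; the time zone abbreviation depends on the local TZ setting.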

Answer 1 (score: 2)

Could you please try the following awk command and let me know whether it helps? I will add a non-one-liner form soon.

awk -F, '{s=$4 " " $5; gsub(/[:-]/, " ", s); t=mktime(s); dt=strftime("%c", t); a[t]=dt; b[t]+=$6} END{for(i in a) print a[i] " has " b[i] " seconds"}'  Input_file

Thanks to Anubhava for correcting my code.