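I have time-series data recorded roughly once a minute, and I want to summarize it by hour (or by other periods, such as week or month).
The data looks like this: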
timeStamp,kwH,watts
"2016-07-16 16:18:51",0.014,710
"2016-07-16 16:20:01",0.013,669
"2016-07-16 16:22:40",0.020,720
...
"2016-07-16 21:06:01",0.006,360
"2016-07-16 21:07:00",0.006,366
"2016-07-16 21:08:01",0.007,413
"2016-07-16 21:09:01",0.006,360
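I want to sum the second column (kwH), grouped by the hour in the first column.
A larger data set is available at http://pastebin.com/raw/BbjLebVx
How can I summarize this? I'm guessing awk might be involved.
Secondly, given that the data, the web service and the bash script that generate the charts all reside on a server I control, would it be more efficient to sum this data in MySQL, rather than having gnuplot process megabytes of raw data?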
Answer 0 (score: 0)
$ cat > test.awk
{
    gsub(/^.* |:.*/, "", $1)   # use a regex to strip everything but the hour from the timestamp, for "grouping by the hour"
    arr[$1] += $2              # sum together the "kwH"
}
END {                          # after summing we print
    for (i in arr)             # for each element (hour) in the array
        print i, arr[i]        # print the element and the sum of "kwH"
}
$ awk -F, -f test.awk test.in
timeStamp 0
21 0.025
... 0
16 0.047
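
The "timeStamp 0" and "... 0" lines in that output come from the header row and the literal "..." placeholder in the sample input, and keying on the hour alone would merge the same hour from different days once the data spans more than one date. Here is a rough sketch of one way around both, assuming every data row starts with a quoted "YYYY-MM-DD HH:MM:SS" timestamp as in the sample. The function name sum_kwh_by_hour is made up for illustration, and I have not run this against the full pastebin data.

```shell
# Hypothetical helper: read the CSV on stdin, print "date hour total_kwH".
sum_kwh_by_hour() {
    awk -F, '
        # Only process rows whose first field looks like "YYYY-MM-DD HH:...
        # This skips the header line and placeholder lines such as "...".
        $1 ~ /^"[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:/ {
            key = substr($1, 2, 13)   # drop the opening quote, keep e.g. 2016-07-16 16
            sum[key] += $2            # accumulate kwH for that date and hour
        }
        END {
            for (k in sum)
                printf "%s %.3f\n", k, sum[k]
        }
    ' | sort                          # for (k in sum) has no fixed order, so sort afterwards
}

# Sample run on the rows shown in the question
sum_kwh_by_hour <<'EOF'
timeStamp,kwH,watts
"2016-07-16 16:18:51",0.014,710
"2016-07-16 16:20:01",0.013,669
"2016-07-16 16:22:40",0.020,720
...
"2016-07-16 21:06:01",0.006,360
"2016-07-16 21:07:00",0.006,366
"2016-07-16 21:08:01",0.007,413
"2016-07-16 21:09:01",0.006,360
EOF
```

On the sample rows this should print "2016-07-16 16 0.047" and "2016-07-16 21 0.025". Changing the substr length from 13 to 10 would group by day, and 7 would group by month. Grouping by week needs real date arithmetic, which this sketch does not attempt.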