Below are my input and output .txt files.
I want to group the data by StatusDate and Method, and then sum the values for each StatusDate and Method combination.
INPUT.TXT
No,Date,MethodStatus,Key,StatusDate,Hit,CallType,Method,LastMethodType
112,12/15/16,Suceess,Geo,12/15/16,1,Static,GET,12/15/16
113,12/18/16,Suceess,Geo,12/18/16,1,Static,GET,12/18/16
114,12/19/16,AUTHORIZED,Geo,12/19/16,1,Static,GET,12/19/16
115,12/19/16,AUTHORIZED,Geo,12/19/16,1,Static,GET,12/19/16
116,12/19/16,Suceess,Geo,12/19/16,1,Static,PUT,12/19/16
117,12/19/16,Suceess,Geo,12/19/16,1,Static,PUT,12/19/16
118,12/19/16,Waiting,Geo,12/19/16,1,Static,GET,12/19/16
119,12/19/16,AUTHORIZED,Geo,12/19/16,1,Static,GET,12/19/16
120,12/17/16,Suceess,Geo,12/17/16,1,Static,GET,12/17/16
121,12/17/16,Suceess,Geo,12/17/16,1,Static,GET,12/17/16
130,12/16/16,Suceess,Geo,12/16/16,1,Static,GET,12/16/16
Out.txt
StatusDate,12/15/16,12/16/16,12/17/16,12/17/16,12/18/16,12/19/16,12/19/16,12/19/16,12/19/16,12/19/16,12/19/16,Grand Total
GET,1,1,1,1,1,1,1,1,1,,,9
PUT,,,,,,,,,,1,1,2
Grand Total,1,1,1,1,1,1,1,1,1,1,1,11
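I am currently using awk: I split the data with awk -F, '{if($8=="GET") print }' and then compute the sums.
Because the file is large, this is slow.
Is it possible to do this in a single step, so that there are fewer file operations?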
Answer 0 (score: 0)
You can use a GNU awk script like this:
script.awk
# GNU awk only: iterate over array indices in ascending string order
BEGIN { PROCINFO["sorted_in"] = "@ind_str_asc" }

# count one row for theDate in the per-method array mem
function remember(theDate, mem) {
    mem[theDate] += 1
    # Totals stores the column sum for each date seen (i.e. the columns)
    Totals[theDate] += 1
}

# header is 1 for the first output line (the dates) and 0 for data lines
# OFS is used, so it is possible to use a command-line option like
# -v OFS='\t' or -v OFS=','
# k and sum are extra parameters so that they stay local to the function
function printMem(mem, name, header,    k, sum) {
    printf("%s%s", name, OFS)
    sum = 0
    for (k in Totals) {
        if (header)
            printf("%s%s", k, OFS)
        else {
            printf("%s%s", mem[k], OFS)
            sum += mem[k]
        }
    }
    if (!header)
        printf("%s", sum)
    else
        printf("Grand Total")
    print ""
}
# each method is counted in its own array; StatusDate is column 5
$8 == "GET" { remember($5, get) }
$8 == "PUT" { remember($5, put) }
END { # print the stored values
    # the header line
    printMem(Totals, "StatusDate", 1)
    printMem(get, "GET", 0)
    printMem(put, "PUT", 0)
    # the summary line
    printMem(Totals, "Grand Total", 0)
}
Run the script like this (note the -f option, which tells awk to read the program from a file): awk -F, -v OFS=',' -f script.awk Input.txt
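
If a long-format result is enough (one line per StatusDate and Method instead of a pivot table), the single-pass idea can be sketched without any GNU-specific features. This is only a sketch under assumptions: sample_input.txt is a hypothetical file name, the sample rows are a subset of the question's data, and it assumes the value to sum is the Hit column ($6) rather than a row count.

```shell
# Hypothetical sample file with the same column layout as INPUT.TXT.
cat > sample_input.txt <<'EOF'
No,Date,MethodStatus,Key,StatusDate,Hit,CallType,Method,LastMethodType
112,12/15/16,Suceess,Geo,12/15/16,1,Static,GET,12/15/16
114,12/19/16,AUTHORIZED,Geo,12/19/16,1,Static,GET,12/19/16
115,12/19/16,AUTHORIZED,Geo,12/19/16,1,Static,GET,12/19/16
116,12/19/16,Suceess,Geo,12/19/16,1,Static,PUT,12/19/16
EOF

# One pass over the file: skip the header line, then add Hit ($6) into a
# cell keyed by StatusDate ($5) and Method ($8). The END block prints one
# line per cell; sort is used because awk's "for (key in ...)" order is
# unspecified.
summary=$(awk -F, -v OFS=',' '
    NR > 1 { hits[$5 OFS $8] += $6 }
    END    { for (key in hits) print key, hits[key] }
' sample_input.txt | LC_ALL=C sort)

printf '%s\n' "$summary"
```

For the sample rows this prints 12/15/16,GET,1 then 12/19/16,GET,2 then 12/19/16,PUT,1. The file is read only once, so there is no need to split it per method first.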